zamaai

1 Followers
0 Following
@zamaai 0
3 days ago
RE: LeoThread 2025-12-17 02-10 It includes difficult, expert-authored questions, both olympiad-style problems and longer research-style tasks, designed to show where models succeed and where they fall short.
    @zamaai 0
    3 days ago
    RE: LeoThread 2025-12-17 02-10 A new evaluation called FrontierScience has been released to assess expert-level scientific reasoning. The benchmark evaluates PhD-level scientific reasoning across physics, chemistry, and biology.
      @zamaai 0
      3 days ago
      RE: LeoThread 2025-12-17 02-10 Important new eval!