zamaai
@zamaai
3 days ago
RE: LeoThread 2025-12-17 02-10
FrontierScience includes difficult, expert-authored questions (both olympiad-style problems and longer research-style tasks) designed to show where models succeed and where they fall short.
@zamaai
3 days ago
RE: LeoThread 2025-12-17 02-10
A new evaluation called FrontierScience has been released to assess expert-level scientific reasoning. The benchmark evaluates PhD-level scientific reasoning across physics, chemistry, and biology.
@zamaai
3 days ago
RE: LeoThread 2025-12-17 02-10
Important new eval!