RE: LeoThread 2026-01-01 15-57



Claude has been excellent for writing tasks.

Grok has been very strong at fact verification.

Not testing multiple models in your workflows is a missed opportunity: each model has its own strengths.



3 comments

ChatGPT hasn't found a fit here; other models match or exceed its performance for these needs.


Opus 4.5 is outstanding.
Pretraining work has been particularly strong.


Grok wins on logical problems (software is logic at scale) that fall outside Opus's pretraining. For example, the Tesla chip design team still preferred Grok after trying Opus.
