we're launching 🤗 yourbench today, an open source tool for custom benchmarking and synthetic data generation from ANY of your documents. it's a big step towards improving how model evaluations work
early access link in replies!
(1/8)
Nous Research makes some of the best tasting models ever.
It's singular in how "alive" it feels. The only model that comes close is the original claude 3 opus.
Best of all? It's open.
I encourage you all to try it for yourselves. Nebius has a generous free tier.
imo okay to release 8 - 405B, and maybe keep 2T to themselves, but abandoning is really really bad
it's a +1 to the OpenAI, Anthropic, Google list and it's hard to differentiate
In just ONE line of code, you can use the
new gpt-oss model
to turn
your messy raw data (pdf, word, xlsx)
into a clean, strong eval set (to test any LLM!)
with yourbench!
(link in comments)!!
They’re basically selling to the cope market, which wants to hear that AI is useless.
They’ll be left in the dust when people with a competitive advantage due to stronger models become the norm.
I don’t know why you think this is a bad thing. Those who know models will know.