We present JanusFlow, a powerful framework that unifies image understanding and generation in a single model. JanusFlow introduces a minimalist architecture that integrates autoregressive language models with rectified flow, a state-of-the-art method in generative modeling. Our key finding demonstrates that rectified flow can be straightforwardly trained within the large language model framework, eliminating the need for complex architectural modifications.
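For anyone who hasn't met rectified flow before, here is a rough toy sketch of the idea in plain Python. It is not JanusFlow's actual code. The function names (`interpolate`, `velocity_target`, `flow_matching_loss`, `euler_sample`) and the convention that t=0 is noise and t=1 is data are assumptions made for illustration. The point is only that training regresses a velocity along a straight line between noise and data, and sampling integrates that velocity with a few Euler steps.

```python
def interpolate(noise, data, t):
    """Point on the straight line from noise (t=0) to data (t=1)."""
    return [(1.0 - t) * n + t * d for n, d in zip(noise, data)]


def velocity_target(noise, data):
    """Along a straight line the velocity is constant: data minus noise."""
    return [d - n for n, d in zip(noise, data)]


def flow_matching_loss(predict_velocity, noise, data, t):
    """Mean squared error between predicted and target velocity at time t.

    predict_velocity is a hypothetical stand-in for the model: it takes
    (x_t, t) and returns a velocity vector of the same length as x_t.
    """
    x_t = interpolate(noise, data, t)
    pred = predict_velocity(x_t, t)
    target = velocity_target(noise, data)
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)


def euler_sample(predict_velocity, noise, steps=10):
    """Generate a sample by Euler-integrating the velocity from t=0 to t=1."""
    x = list(noise)
    dt = 1.0 / steps
    for i in range(steps):
        v = predict_velocity(x, i * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    data = [1.0, -2.0, 0.5]
    noise = [0.3, 0.1, -0.7]

    # An "oracle" that already knows the true velocity, used only to
    # check the plumbing; a real model would have to learn this.
    def oracle(x_t, t):
        return velocity_target(noise, data)

    print(flow_matching_loss(oracle, noise, data, 0.4))
    print(euler_sample(oracle, noise, steps=8))
```

In a real setup the oracle would be replaced by a network (in JanusFlow's case, per the abstract, the LLM itself) trained to minimize this loss over random `t`, noise, and data samples.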
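The new release is a multimodal LLM with image support that manages to unify image generation and understanding in a single model (unlike other large models, which generate images by calling out to separate models like SD or Flux).
Training is the conventional pre-training & SFT recipe, with no reinforcement learning. The model is fairly small, only 7B parameters, so anyone can run it locally. Judging by this series, they are probably doing a PoC first and will build a bigger one later. Going by the tech report, it beats same-size models across the board; the tech report hasn't been uploaded yet, so I'll take a look once it's up.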
Links reserved so far:
Online Playground (not built yet): Chat With Janus-Pro-7B - a Hugging Face Space by deepseek-ai
Tech report (not uploaded yet): Janus/janus_pro_tech_report.pdf at main · deepseek-ai/Janus · GitHub
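What is DeepSeek even doing, dropping something new on Chinese New Year's Eve? They're grinding this hard?? Feels like I owe DS a couple of kowtows.
If this keeps up, other people's new models won't even be able to match the baseline.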
