Conversation

@iMountTai
Contributor

What does this PR do?

Currently, many models (such as Qwen2.5-VL) still do not accept loss_kwargs. This PR fixes that, following the approach taken in the gemma3 PR. With this change, the training loss and grad_norm are normal again.
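A minimal sketch of the problem this PR addresses (a toy illustration, not the PR's actual diff; `ToyModel` and `loss_function` are hypothetical stand-ins): the Trainer passes extra loss-related kwargs such as `num_items_in_batch` into `model.forward`, so the forward signature must accept and forward them, otherwise they are rejected or silently dropped and the loss is mis-normalized under gradient accumulation.

```python
def loss_function(logits, labels, num_items_in_batch=None, **kwargs):
    # Toy stand-in for a cross-entropy loss: sum of per-token losses,
    # normalized by the global token count when the Trainer provides it
    # instead of a per-device mean.
    total = sum(abs(l - y) for l, y in zip(logits, labels))
    denom = num_items_in_batch if num_items_in_batch is not None else len(labels)
    return total / denom


class ToyModel:
    def forward(self, logits, labels=None, **loss_kwargs):
        # The fix: accept **loss_kwargs and pass them through, rather than
        # a fixed signature that raises a TypeError on num_items_in_batch.
        loss = None
        if labels is not None:
            loss = loss_function(logits, labels, **loss_kwargs)
        return loss


model = ToyModel()
# Trainer-side call: num_items_in_batch is gathered across all devices
# in the gradient-accumulation window.
loss = model.forward([1.0, 2.0], labels=[0.0, 0.0], num_items_in_batch=6)
```

Here `loss` is `3 / 6 = 0.5`; without the pass-through, each micro-batch would be averaged independently, skewing both the reported loss and grad_norm.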

Reference: #37208

@Rocketknight1
Member

cc @SunMarc

@iMountTai
Contributor Author

Hi, I'd appreciate it if you could help review this PR. Thank you. @SunMarc

@SunMarc
Member

SunMarc commented Aug 25, 2025

@bot /style

@github-actions
Contributor

github-actions bot commented Aug 25, 2025

Style fix ran successfully without modifying any files.


@SunMarc SunMarc left a comment


Thanks for that! We do indeed need to clean this up a bit. I started the work here and hopefully can finish it soon: #38432

@SunMarc
Member

SunMarc commented Aug 25, 2025

Can you run `make fix-copies` to fix the CI?

@SunMarc SunMarc enabled auto-merge (squash) August 25, 2025 14:30
auto-merge was automatically disabled August 25, 2025 14:57

Head branch was pushed to by a user without write access

@iMountTai iMountTai requested a review from SunMarc August 25, 2025 16:10
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: glm4v, glm4v_moe, qwen2_5_vl, qwen2_vl

@SunMarc SunMarc enabled auto-merge (squash) August 26, 2025 09:21
@SunMarc SunMarc merged commit 64ae6e6 into huggingface:main Aug 26, 2025
18 checks passed
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
