[TRTLLM-8814][feat] AutoDeploy: Use TRTLLM kernels for FP8 linear #8820
CodeRabbit walkthrough
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~15–25 minutes
Pre-merge checks: ❌ failed checks (1 warning), ✅ passed checks (2 passed)
Looks like it also fixes #8814. Can we confirm?
Edit: never mind. I mistook your comment as being about #8811.
Signed-off-by: Chenghao Zhang <[email protected]>
Signed-off-by: Chenghao Zhang <[email protected]>
Signed-off-by: Chenghao Zhang <[email protected]>
Signed-off-by: Lucas Liebenwein <[email protected]>
Signed-off-by: Lucas Liebenwein <[email protected]>
42d293a to 8c487c9
Signed-off-by: Lucas Liebenwein <[email protected]>
lucaslie left a comment
Let's think a bit more about this PR before we merge it. See the discussion here as well: https://nvidia.slack.com/archives/C08T55LHSG4/p1761886176295439?thread_ts=1761854055.288739&cid=C08T55LHSG4
Signed-off-by: nvchenghaoz <[email protected]>
Signed-off-by: nvchenghaoz <[email protected]>
/bot run
PR_Github #23427 [ run ] triggered by Bot. Commit:
PR_Github #23427 [ run ] completed with state
tests/unittest/_torch/auto_deploy/unit/singlegpu/custom_ops/test_quant.py
/bot run
PR_Github #23529 [ run ] triggered by Bot. Commit:
PR_Github #23529 [ run ] completed with state
/bot run
/bot run
PR_Github #23678 [ run ] triggered by Bot. Commit:
PR_Github #23678 [ run ] completed with state
Summary by CodeRabbit