😃
hello, have a good day
PhD student at CUHK-SZ.
Currently interested in LLM safety.
- The Chinese University of Hong Kong, Shenzhen
- Shenzhen, China
- https://youliangyuan.github.io
- https://scholar.google.com/citations?user=cd-wSAsAAAAJ&hl=zh-CN&oi=ao
Pinned
- RobustNLP/CipherChat (Public): A framework to evaluate the generalization capability of safety alignment for LLMs.
- RobustNLP/DeRTa (Public): A novel approach to improve the safety of large language models, enabling them to transition effectively from an unsafe to a safe state.
- rrm-cure-miracle-steps (Public): Rubric Reward Model to reduce "miracle steps" and unfaithful CoT in math; SFT+PPO training and verified evaluation.
Python 8
