


Lei Xie 0001
Person information
- affiliation: Northwestern Polytechnical University, School of Computer Science, Xi'an, China
- affiliation (2006 - 2007): The Chinese University of Hong Kong, Department of Systems Engineering and Engineering Management, Hong Kong
- affiliation (2004 - 2006): City University of Hong Kong, School of Creative Media, Hong Kong
- affiliation (PhD 2004): Northwestern Polytechnical University, Xi'an, China
- affiliation (2001 - 2002): Vrije Universiteit Brussel, Department of Electronics and Information Processing, Belgium
Other persons with the same name
- Lei Xie — disambiguation page
- Lei Xie 0002 — Xi'an Jiaotong University, China
- Lei Xie 0003 — Zhejiang University, College of Information Science and Electronic Engineering, Hangzhou, China
- Lei Xie 0004 — Nanjing University, State Key Laboratory for Novel Software Technology, China
- Lei Xie 0005 — Delft University of Technology, Laboratory of Computer Engineering, The Netherlands
- Lei Xie 0006 — City University of New York, Department of Computer Science, Hunter College, NY, USA (and 1 more)
- Lei Xie 0007 — Zhejiang University, State Key Laboratory of Industrial Control Technology, Hangzhou, China (and 2 more)
- Lei Xie 0008 — Air Force Engineering University, Institute of Aeronautics Engineering, Department of weapon science and technology, China
- Lei Xie 0009 — Hong Kong University of Science and Technology, Department of Electronic and Computer Science, Hong Kong (and 1 more)
- Lei Xie 0010 — Central South University, School of Geosciences and Info-Physics, Changsha, China (and 2 more)
2020 – today
- 2026
[j86]Lei Xie, Jiahao Huang, Jiawei Zhang, Jianzhong He, Yiang Pan, Guoqiang Xie, Mengjun Li, Qingrun Zeng, Mingchu Li, Yuanjing Feng:
Automated Mapping the Pathways of Cranial Nerve II, III, V, and VII/VIII: A Multi-Parametric Multi-Stage Diffusion Tractography Atlas. IEEE Trans. Biomed. Eng. 73(2): 891-901 (2026)
[c322]Kangxiang Xia, Xinfa Zhu, Jixun Yao, Wenjie Tian, Wenhao Li, Lei Xie:
KALL-E: Autoregressive Speech Synthesis with Next-Distribution Prediction. AAAI 2026: 34016-34024
[i271]Zhixian Zhao, Shuiyuan Wang, Guojian Li, Hongfei Xue, Chengyou Wang, Shuai Wang, Longshuai Xiao, Zihan Zhang, Hui Bu, Xin Xu, Xinsheng Wang, Hexin Liu, Eng Siong Chng, Hung-yi Lee, Haizhou Li, Lei Xie:
The ICASSP 2026 HumDial Challenge: Benchmarking Human-like Spoken Dialogue Systems in the LLM Era. CoRR abs/2601.05564 (2026)
[i270]Guobin Ma, Yuxuan Xia, Jixun Yao, Huixin Xue, Hexin Liu, Shuai Wang, Hao Liu, Lei Xie:
The ICASSP 2026 Automatic Song Aesthetics Evaluation Challenge. CoRR abs/2601.07237 (2026)
[i269]Chengyou Wang, Mingchen Shao, Jingbin Hu, Zeyu Zhu, Hongfei Xue, Bingshen Mu, Xin Xu, Xingyi Duan, Binbin Zhang, Pengcheng Zhu, Chuang Ding, Xiaojun Zhang, Hui Bu, Lei Xie:
WenetSpeech-Wu: Datasets, Benchmarks, and Models for a Unified Chinese Wu Dialect Speech Processing Ecosystem. CoRR abs/2601.11027 (2026)
[i268]Wenjie Tian, Bingshen Mu, Guobin Ma, Xuelong Geng, Zhixian Zhao, Lei Xie:
dLLM-ASR: A Faster Diffusion LLM-based Framework for Speech Recognition. CoRR abs/2601.17902 (2026)
[i267]Bingshen Mu, Xian Shi, Xiong Wang, Hexin Liu, Jin Xu, Lei Xie:
LLM-ForcedAligner: A Non-Autoregressive and Accurate LLM-Based Forced Aligner for Multilingual and Long-Form Speech. CoRR abs/2601.18220 (2026)
[i266]Zhixian Zhao, Wenjie Tian, Xiaohai Tian, Jun Zhang, Lei Xie:
Integrating Fine-Grained Audio-Visual Evidence for Robust Multimodal Emotion Reasoning. CoRR abs/2601.18321 (2026)
- 2025
[j85]Lei Xie, Huajun Zhou, Junxiong Huang, Qingrun Zeng, Jiahao Huang, Jianzhong He, Jiawei Zhang, Baohua Fan, Mingchu Li, Guoqiang Xie, Hao Chen, Yuanjing Feng:
An arbitrary-modal fusion network for volumetric cranial nerves tract segmentation. Comput. Medical Imaging Graph. 125: 102635 (2025)
[j84]Qingrun Zeng, Lin Yang, Yongqiang Li, Lei Xie, Yuanjing Feng:
RGVPSeg: multimodal information fusion network for retinogeniculate visual pathway segmentation. Medical Biol. Eng. Comput. 63(5): 1397-1411 (2025)
[j83]Qingrun Zeng, Ze Xia, Jiahao Huang, Lei Xie, Jiawei Zhang, Shengwei Huang, Zhengqiu Xing, Qichuan ZhuGe, Yuanjing Feng:
Corticospinal tract reconstruction with tumor by using a novel direction filter based tractography method. Medical Biol. Eng. Comput. 63(10): 2889-2901 (2025)
[j82]Zige Wang, Yashuai Wang, Tianyu Liu, Peng Zhang, Lei Xie, Yangming Guo:
Audio-Driven Talking Face Generation With Segmented Static Facial References for Customized Health Device Interactions. IEEE Trans. Consumer Electron. 71(2): 5404-5413 (2025)
[c321]Ziqian Ning, Shuai Wang, Yuepeng Jiang, Jixun Yao, Lei He, Shifeng Pan, Jie Ding, Lei Xie:
Drop the Beat! Freestyler for Accompaniment Conditioned Rapping Voice Generation. AAAI 2025: 24966-24974
[c320]Jixun Yao, Yuguang Yang, Yu Pan, Ziqian Ning, Jianhao Ye, Hongbin Zhou, Lei Xie:
StableVC: Style Controllable Zero-Shot Voice Conversion with Conditional Flow Matching. AAAI 2025: 25669-25677
[c319]Yuguang Yang, Yu Pan, Jixun Yao, Xiang Zhang, Jianhao Ye, Hongbin Zhou, Lei Xie, Lei Ma, Jianjun Zhao:
Takin-VC: Expressive Zero-Shot Voice Conversion via Adaptive Hybrid Content Encoding and Enhanced Timbre Modeling. ACL (1) 2025: 1731-1742
[c318]Boyi Kang, Xinfa Zhu, Zihan Zhang, Zhen Ye, Mingshuai Liu, Ziqian Wang, Yike Zhu, Guobin Ma, Jun Chen, Longshuai Xiao, Chao Weng, Wei Xue, Lei Xie:
LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement. ACL (1) 2025: 13292-13305
[c317]Hanke Xie, Dake Guo, Chengyou Wang, Yue Li, Wenjie Tian, Xinfa Zhu, Xinsheng Wang, Xiulin Li, Guanqiong Miao, Bo Liu, Lei Xie:
DialoSpeech: Dual-Speaker Dialogue Generation with LLM and Flow Matching. APSIPA 2025: 1-6
[c316]He Wang, Xucheng Wan, Naijun Zheng, Kai Liu, Huan Zhou, Guojian Li, Lei Xie:
CAMEL: Cross-Attention Enhanced Mixture-of-Experts and Language Bias for Code-Switching Speech Recognition. ICASSP 2025: 1-5
[c315]Qing Wang, Jixun Yao, Zhaokai Sun, Pengcheng Guo, Lei Xie, John H. L. Hansen:
DiffAttack: Diffusion-based Timbre-reserved Adversarial Attack in Speaker Identification. ICASSP 2025: 1-5
[c314]Xinfa Zhu, Lei He, Yujia Xiao, Xi Wang, Xu Tan, Sheng Zhao, Lei Xie:
ZSVC: Zero-shot Style Voice Conversion with Disentangled Latent Diffusion Models and Adversarial Training. ICASSP 2025: 1-5
[c313]Jixun Yao, Hexin Liu, Chen Chen, Yuchen Hu, Eng Siong Chng, Lei Xie:
GenSE: Generative Speech Enhancement via Language Models using Hierarchical Modeling. ICLR 2025
[c312]Xiong Wang, Yangze Li, Chaoyou Fu, Yike Zhang, Yunhang Shen, Lei Xie, Ke Li, Xing Sun, Long Ma:
Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM. ICML 2025
[c311]Runduo Han, Yanxin Hu, Yihui Fu, Zihan Zhang, Yukai Jv, Li Chen, Lei Xie:
CabinSep: IR-Augmented Mask-Based MVDR for Real-Time In-car Speech Separation with Distributed Heterogeneous Arrays. INTERSPEECH 2025
[c310]Longhao Li, Yangze Li, Hongfei Xue, Jie Liu, Shuai Fang, Kai Wang, Lei Xie:
Delayed-KD: Delayed Knowledge Distillation based CTC for Low-Latency Streaming ASR. INTERSPEECH 2025
[c309]Mingchen Shao, Xinfa Zhu, Chengyou Wang, Bingshen Mu, Hai Li, Ying Yan, Junhui Liu, Danming Xie, Lei Xie:
Weakly Supervised Data Refinement and Flexible Sequence Compression for Efficient Thai LLM-based ASR. INTERSPEECH 2025
[c308]Ziqian Wang, Zikai Liu, Xinfa Zhu, Yike Zhu, Mingshuai Liu, Jun Chen, Longshuai Xiao, Chao Weng, Lei Xie:
FlowSE: Efficient and High-Quality Speech Enhancement via Flow Matching. INTERSPEECH 2025
[c307]Ziqian Wang, Xianjun Xia, Xinfa Zhu, Lei Xie:
U-SAM: An Audio Language Model for Unified Speech, Audio, and Music Understanding. INTERSPEECH 2025
[c306]Tianyi Xu, Hongjie Chen, Qing Wang, Hang Lv, Jian Kang, Jie Li, Zhennan Lin, Yongxiang Li, Lei Xie:
Leveraging LLM and Self-Supervised Training Models for Speech Recognition in Chinese Dialects: A Comparative Analysis. INTERSPEECH 2025
[c305]Hongfei Xue, Yufeng Tang, Jun Zhang, Xuelong Geng, Lei Xie:
Selective Invocation for Multilingual ASR: A Cost-effective Approach Adapting to Speech Recognition Difficulty. INTERSPEECH 2025
[c304]Jixun Yao, Hexin Liu, Eng Siong Chng, Lei Xie:
EASY: Emotion-aware Speaker Anonymization via Factorized Distillation. INTERSPEECH 2025
[c303]Yiang Pan, Yuanjing Feng, Jianzhong He, Jianan Cui, Lei Xie, Zan Chen, Qingrun Zeng, Jiawei Zhang, Yan Yu, Konstantin K. Kukanov, Ye Wu:
A Novel Deep Learning Approach for Tissue Microstructure Estimation Using Spherical Mean Technique. ISBI 2025: 1-5
[c302]Lei Xie, Junxiong Huang, Yuanjing Feng, Qingrun Zeng:
Tractography-Guided Dual-Label Collaborative Learning for Multi-Modal Cranial Nerves Parcellation. ACM Multimedia 2025: 1872-1879
[c301]Wenjie Tian, Xinfa Zhu, Haohe Liu, Zhixian Zhao, Zihao Chen, Chaofan Ding, Xinhan Di, Junjie Zheng, Lei Xie:
DualDub: Video-to-Soundtrack Generation via Joint Speech and Background Audio Synthesis. ACM Multimedia 2025: 10671-10680
[c300]Hongfei Xue, Yufeng Tang, Hexin Liu, Jun Zhang, Xuelong Geng, Lei Xie:
Enhancing Non-Core Language Instruction-Following in Speech LLMs via Semi-Implicit Cross-Lingual CoT Reasoning. ACM Multimedia 2025: 10984-10993
[i265]Xinfa Zhu, Lei He, Yujia Xiao, Xi Wang, Xu Tan, Sheng Zhao, Lei Xie:
ZSVC: Zero-shot Style Voice Conversion with Disentangled Latent Diffusion Models and Adversarial Training. CoRR abs/2501.04416 (2025)
[i264]Hanzhao Li, Yuke Li, Xinsheng Wang, Jingbin Hu, Qicong Xie, Shan Yang, Lei Xie:
FleSpeech: Flexibly Controllable Speech Generation with Various Prompts. CoRR abs/2501.04644 (2025)
[i263]Qing Wang, Jixun Yao, Zhaokai Sun, Pengcheng Guo, Lei Xie, John H. L. Hansen:
DiffAttack: Diffusion-based Timbre-reserved Adversarial Attack in Speaker Identification. CoRR abs/2501.05127 (2025)
[i262]Li Zhang, Jiyao Liu, Lei Xie:
Adaptive Data Augmentation with NaturalSpeech3 for Far-field Speaker Verification. CoRR abs/2501.08691 (2025)
[i261]Xuelong Geng, Kun Wei, Qijie Shao, Shuiyun Liu, Zhennan Lin, Zhixian Zhao, Guojian Li, Wenjie Tian, Peikun Chen, Yangze Li, Pengcheng Guo, Mingchen Shao, Shuiyuan Wang, Yuang Cao, Chengyou Wang, Tianyi Xu, Yuhang Dai, Xinfa Zhu, Yue Li, Li Zhang, Lei Xie:
OSUM: Advancing Open Speech Understanding Models with Limited Resources in Academia. CoRR abs/2501.13306 (2025)
[i260]Qijie Shao, Linhao Dong, Kun Wei, Sining Sun, Lei Xie:
DQ-Data2vec: Decoupling Quantization for Multilingual Speech Recognition. CoRR abs/2501.13497 (2025)
[i259]Xinfa Zhu, Wenjie Tian, Xinsheng Wang, Lei He, Xi Wang, Sheng Zhao, Lei Xie:
CosyAudio: Improving Audio Generation with Confidence Scores and Synthetic Captions. CoRR abs/2501.16761 (2025)
[i258]Jixun Yao, Hexin Liu, Chen Chen, Yuchen Hu, Chng Eng Siong, Lei Xie:
GenSE: Generative Speech Enhancement via Language Models using Hierarchical Modeling. CoRR abs/2502.02942 (2025)
[i257]Jixun Yao, Yuguang Yang, Yu Pan, Yuan Feng, Ziqian Ning, Jianhao Ye, Hongbin Zhou, Lei Xie:
Fine-grained Preference Optimization Improves Zero-shot Text-to-Speech. CoRR abs/2502.02950 (2025)
[i256]Zhen Ye, Xinfa Zhu, Chi-Min Chan, Xinsheng Wang, Xu Tan, Jiahe Lei, Yi Peng, Haohe Liu, Yizhu Jin, Zheqi Dai, Hongzhan Lin, Jianyi Chen, Xingjian Du, Liumeng Xue, Yunlin Chen, Zhifei Li, Lei Xie, Qiuqiang Kong, Yike Guo, Wei Xue:
Llasa: Scaling Train-Time and Inference-Time Compute for Llama-based Speech Synthesis. CoRR abs/2502.04128 (2025)
[i255]Zhixian Zhao, Xinfa Zhu, Xinsheng Wang, Shuiyuan Wang, Xuelong Geng, Wenjie Tian, Lei Xie:
Steering Language Model to Stable Speech Emotion Recognition via Contextual Perception and Chain of Thought. CoRR abs/2502.18186 (2025)
[i254]Boyi Kang, Xinfa Zhu, Zihan Zhang, Zhen Ye, Mingshuai Liu, Ziqian Wang, Yike Zhu, Guobin Ma, Jun Chen, Longshuai Xiao, Chao Weng, Wei Xue, Lei Xie:
LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement. CoRR abs/2503.00493 (2025)
[i253]Xinsheng Wang, Mingqi Jiang, Ziyang Ma, Ziyu Zhang, Songxiang Liu, Linqin Li, Zheng Liang, Qixi Zheng, Rui Wang, Xiaoqin Feng, Weizhen Bian, Zhen Ye, Sitong Cheng, Ruibin Yuan, Zhixian Zhao, Xinfa Zhu, Jiahao Pan, Liumeng Xue, Pengcheng Zhu, Yunlin Chen, Zhifei Li, Xie Chen, Lei Xie, Yike Guo, Wei Xue:
Spark-TTS: An Efficient LLM-Based Text-to-Speech Model with Single-Stream Decoupled Speech Tokens. CoRR abs/2503.01710 (2025)
[i252]Hongfei Xue, Yufeng Tang, Hexin Liu, Jun Zhang, Xuelong Geng, Lei Xie:
Enhancing Non-Core Language Instruction-Following in Speech LLMs via Semi-Implicit Cross-Lingual CoT Reasoning. CoRR abs/2504.20835 (2025)
[i251]Lei Xie, Huajun Zhou, Junxiong Huang, Jiahao Huang, Qingrun Zeng, Jianzhong He, Jiawei Zhang, Baohua Fan, Mingchu Li, Guoqiang Xie, Hao Chen, Yuanjing Feng:
An Arbitrary-Modal Fusion Network for Volumetric Cranial Nerves Tract Segmentation. CoRR abs/2505.02385 (2025)
[i250]Ziqian Wang, Xianjun Xia, Xinfa Zhu, Lei Xie:
U-SAM: An audio language Model for Unified Speech, Audio, and Music Understanding. CoRR abs/2505.13880 (2025)
[i249]Jixun Yao, Hexin Liu, Eng Siong Chng, Lei Xie:
EASY: Emotion-aware Speaker Anonymization via Factorized Distillation. CoRR abs/2505.15004 (2025)
[i248]Hongfei Xue, Yufeng Tang, Jun Zhang, Xuelong Geng, Lei Xie:
Selective Invocation for Multilingual ASR: A Cost-effective Approach Adapting to Speech Recognition Difficulty. CoRR abs/2505.16168 (2025)
[i247]Alou Diakite, Cheng Li, Lei Xie, Yuanjing Feng, Ruoyou Wu, Jianzhong He, Hairong Zheng, Shanshan Wang:
Cross-Sequence Semi-Supervised Learning for Multi-Parametric MRI-Based Visual Pathway Delineation. CoRR abs/2505.19733 (2025)
[i246]Tianyi Xu, Hongjie Chen, Qing Wang, Hang Lv, Jian Kang, Jie Li, Zhennan Lin, Yongxiang Li, Lei Xie:
Leveraging LLM and Self-Supervised Training Models for Speech Recognition in Chinese Dialects: A Comparative Analysis. CoRR abs/2505.21138 (2025)
[i245]Mingchen Shao, Xinfa Zhu, Chengyou Wang, Bingshen Mu, Hai Li, Ying Yan, Junhui Liu, Danming Xie, Lei Xie:
Weakly Supervised Data Refinement and Flexible Sequence Compression for Efficient Thai LLM-based ASR. CoRR abs/2505.22063 (2025)
[i244]Longhao Li, Yangze Li, Hongfei Xue, Jie Liu, Shuai Fang, Kai Wang, Lei Xie:
Delayed-KD: Delayed Knowledge Distillation based CTC for Low-Latency Streaming ASR. CoRR abs/2505.22069 (2025)
[i243]Yuhang Dai, He Wang, Xingchen Li, Zihan Zhang, Shuiyuan Wang, Lei Xie, Xin Xu, Hongxiao Guo, Shaoji Zhang, Hui Bu, Wei Chen:
AISHELL-5: The First Open-Source In-Car Multi-Channel Multi-Speaker Speech Dataset for Automatic Speech Diarization and Recognition. CoRR abs/2505.23036 (2025)
[i242]Zhennan Lin, Kaixun Huang, Wei Ren, Linju Yang, Lei Xie:
Contextualized Automatic Speech Recognition with Dynamic Vocabulary Prediction and Activation. CoRR abs/2505.23077 (2025)
[i241]Jiahao Huang, Ruifeng Li, Wenwen Yu, Anan Li, Xiangning Li, Mingchao Yan, Lei Xie, Qingrun Zeng, Xueyan Jia, Shuxin Wang, Ronghui Ju, Feng Chen, Qingming Luo, Hui Gong, Andrew Zalesky, Xiaoquan Yang, Yuanjing Feng, Zheng Wang:
Convergent and divergent connectivity patterns of the arcuate fasciculus in macaques and humans. CoRR abs/2506.19266 (2025)
[i240]Dake Guo, Jixun Yao, Linhan Ma, He Wang, Lei Xie:
StreamFlow: Streaming Flow Matching with Block-wise Guided Attention Mask for Speech Token Decoding. CoRR abs/2506.23986 (2025)
[i239]He Wang, Linhan Ma, Dake Guo, Xiong Wang, Lei Xie, Jin Xu, Junyang Lin:
ContextASR-Bench: A Massive Contextual Speech Recognition Benchmark. CoRR abs/2507.05727 (2025)
[i238]Bingshen Mu, Kun Wei, Pengcheng Guo, Lei Xie:
Mixture of LoRA Experts with Multi-Modal and Multi-Granularity LLM Generative Error Correction for Accented Speech Recognition. CoRR abs/2507.09116 (2025)
[i237]Wenjie Tian, Xinfa Zhu, Haohe Liu, Zhixian Zhao, Zihao Chen, Chaofan Ding, Xinhan Di, Junjie Zheng, Lei Xie:
DualDub: Video-to-Soundtrack Generation via Joint Speech and Background Audio Synthesis. CoRR abs/2507.10109 (2025)
[i236]Huakang Chen, Yuepeng Jiang, Guobin Ma, Chunbo Hao, Shuai Wang, Jixun Yao, Ziqian Ning, Meng Meng, Jian Luan, Lei Xie:
DiffRhythm+: Controllable and Flexible Full-Length Song Generation with Preference Optimization. CoRR abs/2507.12890 (2025)
[i235]Lei Xie, Jiahao Huang, Jiawei Zhang, Jianzhong He, Yiang Pan, Guoqiang Xie, Mengjun Li, Qingrun Zeng, Mingchu Li, Yuanjing Feng:
Automated Mapping the Pathways of Cranial Nerve II, III, V, and VII/VIII: A Multi-Parametric Multi-Stage Diffusion Tractography Atlas. CoRR abs/2507.23245 (2025)
[i234]Bingshen Mu, Hexin Liu, Hongfei Xue, Kun Wei, Lei Xie:
Hearing More with Less: Multi-Modal Retrieval-and-Selection Augmented Conversational LLM-Based ASR. CoRR abs/2508.01166 (2025)
[i233]Bingshen Mu, Yiwen Shao, Kun Wei, Dong Yu, Lei Xie:
Efficient Scaling for LLM-based ASR. CoRR abs/2508.04096 (2025)
[i232]Wenjie Tian, Xinfa Zhu, Hanke Xie, Zhen Ye, Wei Xue, Lei Xie:
Llasa+: Free Lunch for Accelerated and Streaming Llama-Based Speech Synthesis. CoRR abs/2508.06262 (2025)
[i231]Shuai Wang, Zhaokai Sun, Zhennan Lin, Chengyou Wang, Zhou Pan, Lei Xie:
MSU-Bench: Towards Understanding the Conversational Multi-talker Scenarios. CoRR abs/2508.08155 (2025)
[i230]Xuelong Geng, Qijie Shao, Hongfei Xue, Shuiyuan Wang, Hanke Xie, Zhao Guo, Yi Zhao, Guojian Li, Wenjie Tian, Chengyou Wang, Zhixian Zhao, Kangxiang Xia, Ziyu Zhang, Zhennan Lin, Tianlun Zuo, Mingchen Shao, Yuang Cao, Guobin Ma, Longhao Li, Yuhang Dai, Dehui Gao, Dake Guo, Lei Xie:
OSUM-EChat: Enhancing End-to-End Empathetic Spoken Chatbot via Understanding-Driven Spoken Dialogue. CoRR abs/2508.09600 (2025)
[i229]Kangxiang Xia, Xinfa Zhu, Jixun Yao, Lei Xie:
MPO: Multidimensional Preference Optimization for Language Model-based Text-to-Speech. CoRR abs/2509.00685 (2025)
[i228]Runduo Han, Yanxin Hu, Yihui Fu, Zihan Zhang, Yukai Jv, Li Chen, Lei Xie:
CabinSep: IR-Augmented Mask-Based MVDR for Real-Time In-Car Speech Separation with Distributed Heterogeneous Arrays. CoRR abs/2509.01399 (2025)
[i227]Longhao Li, Zhao Guo, Hongjie Chen, Yuhang Dai, Ziyu Zhang, Hongfei Xue, Tianlun Zuo, Chengyou Wang, Shuiyuan Wang, Jie Li, Jian Kang, Xin Xu, Hui Bu, Binbin Zhang, Ruibin Yuan, Ziya Zhou, Wei Xue, Lei Xie:
WenetSpeech-Yue: A Large-scale Cantonese Speech Corpus with Multi-dimensional Annotation. CoRR abs/2509.03959 (2025)
[i226]Bingshen Mu, Pengcheng Guo, Zhaokai Sun, Shuai Wang, Hexin Liu, Mingchen Shao, Lei Xie, Eng Siong Chng, Longshuai Xiao, Qiangze Feng, Daliang Wang:
Summary on The Multilingual Conversational Speech Language Model Challenge: Datasets, Tasks, Baselines, and Methods. CoRR abs/2509.13785 (2025)
[i225]Mingchen Shao, Bingshen Mu, Chengyou Wang, Haizhou Li, Ying Yan, Zhonghua Fu, Lei Xie:
Towards Building Speech Large Language Models for Multitask Understanding in Low-Resource Languages. CoRR abs/2509.14804 (2025)
[i224]Yuhang Dai, Ziyu Zhang, Shuai Wang, Longhao Li, Zhao Guo, Tianlun Zuo, Shuiyuan Wang, Hongfei Xue, Chengyou Wang, Qing Wang, Xin Xu, Hui Bu, Jie Li, Jian Kang, Binbin Zhang, Lei Xie:
WenetSpeech-Chuan: A Large-Scale Sichuanese Corpus with Rich Annotation for Dialectal Speech Processing. CoRR abs/2509.18004 (2025)
[i223]Binbin Zhang, Chengdong Liang, Shuai Wang, Xuelong Geng, Zhao Guo, Haoyu Li, Hao Yin, Xipeng Yang, Pengshen Zhang, Changwei Ma, Lei Xie:
WEST: LLM based Speech Toolkit for Speech Understanding, Generation, and Interaction. CoRR abs/2509.19902 (2025)
[i222]Yike Zhu, Boyi Kang, Ziqian Wang, Xingchen Li, Zihan Zhang, Wenjie Li, Longshuai Xiao, Wei Xue, Lei Xie:
MeanFlowSE: One-Step Generative Speech Enhancement via MeanFlow. CoRR abs/2509.23299 (2025)
[i221]Guojian Li, Chengyou Wang, Hongfei Xue, Shuiyuan Wang, Dehui Gao, Zihan Zhang, Yuke Lin, Wenjie Li, Longshuai Xiao, Zhonghua Fu, Lei Xie:
Easy Turn: Integrating Acoustic and Linguistic Modalities for Robust Turn-Taking in Full-Duplex Spoken Dialogue Systems. CoRR abs/2509.23938 (2025)
[i220]Ziyu Zhang, Hanzhao Li, Jingbin Hu, Wenhao Li, Lei Xie:
HiStyle: Hierarchical Style Embedding Predictor for Text-Prompt-Guided Controllable Speech Synthesis. CoRR abs/2509.25842 (2025)
[i219]Hanke Xie, Dake Guo, Chengyou Wang, Yue Li, Wenjie Tian, Xinfa Zhu, Xinsheng Wang, Xiulin Li, Guanqiong Miao, Bo Liu, Lei Xie:
DialoSpeech: Dual-Speaker Dialogue Generation with LLM and Flow Matching. CoRR abs/2510.08373 (2025)
[i218]Guobin Ma, Jixun Yao, Ziqian Ning, Yuepeng Jiang, Lingxin Xiong, Lei Xie, Pengcheng Zhu:
MeanVC: Lightweight and Streaming Zero-Shot Voice Conversion via Mean Flows. CoRR abs/2510.08392 (2025)
[i217]Zhao Guo, Ziqian Ning, Guobin Ma, Lei Xie:
SynthVC: Leveraging Synthetic Data for End-to-End Low Latency Streaming Voice Conversion. CoRR abs/2510.09245 (2025)
[i216]Jun Chen, Shichao Hu, Jiuxin Lin, Wenjie Li, Zihan Zhang, Xingchen Li, JinJiang Liu, Longshuai Xiao, Chao Weng, Lei Xie, Zhiyong Wu:
LSZone: A Lightweight Spatial Information Modeling Architecture for Real-time In-car Multi-zone Speech Separation. CoRR abs/2510.10687 (2025)
[i215]Guojian Li, Qijie Shao, Zhixian Zhao, Shuiyuan Wang, Zhonghua Fu, Lei Xie:
Serial-Parallel Dual-Path Architecture for Speaking Style Recognition. CoRR abs/2510.11732 (2025)
[i214]Hanke Xie, Haopeng Lin, Wenxiao Cao, Dake Guo, Wenjie Tian, Jun Wu, Hanlin Wen, Ruixuan Shang, Hongmei Liu, Zhiqi Jiang, Yuepeng Jiang, Wenxi Chen, Ruiqi Yan, Jiale Qian, Yichao Yan, Shunshun Yin, Ming Tao, Xie Chen, Lei Xie, Xinsheng Wang:
SoulX-Podcast: Towards Realistic Long-form Podcasts with Dialectal and Paralinguistic Diversity. CoRR abs/2510.23541 (2025)
[i213]Junjie Zheng, Chunbo Hao, Guobin Ma, Xiaoyu Zhang, Gongyu Chen, Chaofan Ding, Zihao Chen, Lei Xie:
YingMusic-Singer: Zero-shot Singing Voice Synthesis and Editing with Annotation-free Melody Guidance. CoRR abs/2512.04779 (2025)
[i212]Zhanxun Liu, Yifan Duan, Mengmeng Wang, Pengchao Feng, Haotian Zhang, Xiaoyu Xing, Yijia Shan, Haina Zhu, Yuhang Dai, Chaochao Lu, Xipeng Qiu, Lei Xie, Lan Wang, Nan Yan, Zilong Zheng, Ziyang Ma, Kai Yu, Xie Chen:
X-Talk: On the Underestimated Potential of Modular Speech-to-Speech Dialogue System. CoRR abs/2512.18706 (2025)
- 2024
[j81]Caiyun Wen, Qingrun Zeng, Ronghui Zhou, Lei Xie, Jiangli Yu, Chengzhe Zhang, Jingqiang Wang, Yan Yu, Yixin Gu, Guoquan Cao, Yuanjing Feng, Meihao Wang:
Characterization of local white matter microstructural alterations in Alzheimer's disease: A reproducible study. Comput. Biol. Medicine 179: 108750 (2024)
[j80]Yuanjing Feng, Lei Xie, Jingqiang Wang, Qiyuan Tian, Jianzhong He, Qingrun Zeng, Fei Gao:
Bundle-specific tractogram distribution estimation using higher-order streamline differential equation. NeuroImage 298: 120766 (2024)
[j79]Li Zhang, Ning Jiang, Qing Wang, Yue Li, Quan Lu, Lei Xie:
Whisper-SV: Adapting Whisper for low-data-resource speaker verification. Speech Commun. 163: 103103 (2024)
[j78]Bingshen Mu, Xucheng Wan, Naijun Zheng, Huan Zhou, Lei Xie:
MMGER: Multi-Modal and Multi-Granularity Generative Error Correction With LLM for Joint Accent and Speech Recognition. IEEE Signal Process. Lett. 31: 1940-1944 (2024)
[j77]Runduo Han, Weiming Xu, Zihan Zhang, Mingshuai Liu, Lei Xie:
Distil-DCCRN: A Small-Footprint DCCRN Leveraging Feature-Based Knowledge Distillation in Speech Enhancement. IEEE Signal Process. Lett. 31: 2075-2079 (2024)
[j76]Zhichao Wang, Yuanzhe Chen, Xinsheng Wang, Lei Xie, Yuping Wang:
StreamVoice+: Evolving Into End-to-End Streaming Zero-Shot Voice Conversion. IEEE Signal Process. Lett. 31: 3000-3004 (2024)
[j75]Qingrun Zeng, Jiahao Huang, Jianzhong He, Shengwei Chen, Lei Xie, Zan Chen, Wenlong Guo, Sun Yao, Mengjun Li, Mingchu Li, Yuanjing Feng:
Automated Identification of the Retinogeniculate Visual Pathway Using a High-Dimensional Tractography Atlas. IEEE Trans. Cogn. Dev. Syst. 16(3): 818-827 (2024)
[j74]Qijie Shao, Pengcheng Guo, Jinghao Yan, Pengfei Hu, Lei Xie:
Decoupling and Interacting Multi-Task Learning Network for Joint Speech and Accent Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 32: 459-470 (2024)
[j73]Xinfa Zhu, Yi Lei, Tao Li, Yongmao Zhang, Hongbin Zhou, Heng Lu, Lei Xie:
METTS: Multilingual Emotional Text-to-Speech by Cross-Speaker and Cross-Lingual Emotion Transfer. IEEE ACM Trans. Audio Speech Lang. Process. 32: 1506-1518 (2024)
[j72]Kun Wei, Bei Li, Hang Lv, Quan Lu, Ning Jiang, Lei Xie:
Conversational Speech Recognition by Learning Audio-Textual Cross-Modal Contextual Representation. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2432-2444 (2024)
[j71]Zhichao Wang, Liumeng Xue, Qiuqiang Kong, Lei Xie, Yuanzhe Chen, Qiao Tian, Yuping Wang:
Multi-Level Temporal-Channel Speaker Retrieval for Zero-Shot Voice Conversion. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2926-2937 (2024)
[j70]Jixun Yao, Qing Wang, Pengcheng Guo, Ziqian Ning, Lei Xie:
Distinctive and Natural Speaker Anonymization via Singular Value Transformation-Assisted Matrix. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2944-2956 (2024)
[j69]Tao Li, Zhichao Wang, Xinfa Zhu, Jian Cong, Qiao Tian, Yuping Wang, Lei Xie:
U-Style: Cascading U-Nets With Multi-Level Speaker and Style Modeling for Zero-Shot Voice Cloning. IEEE ACM Trans. Audio Speech Lang. Process. 32: 4026-4035 (2024)
[c299]Zhichao Wang, Yuanzhe Chen, Xinsheng Wang, Lei Xie, Yuping Wang:
StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion. ACL (1) 2024: 7328-7338
[c298]Zihan Zhang, Jiayao Sun, Xianjun Xia, Chuanzeng Huang, Yijian Xiao, Lei Xie:
BS-PLCNet: Band-Split Packet Loss Concealment Network with Multi-Task Learning Framework and Multi-Discriminators. ICASSP Workshops 2024: 23-24
[c297]Runduo Han, Xiaopeng Yan, Weiming Xu, Pengcheng Guo, Jiayao Sun, He Wang, Quan Lu, Ning Jiang, Lei Xie:
An Audio-Quality-Based Multi-Strategy Approach For Target Speaker Extraction in the MISP 2023 Challenge. ICASSP Workshops 2024: 27-28
[c296]Mingshuai Liu, Zhuangqi Chen, Xiaopeng Yan, Yuanjun Lv, Xianjun Xia, Chuanzeng Huang, Yijian Xiao, Lei Xie:
RaD-Net: A Repairing and Denoising Network for Speech Signal Improvement. ICASSP Workshops 2024: 49-50
[c295]He Wang, Pengcheng Guo, Yue Li, Ao Zhang, Jiayao Sun, Lei Xie, Wei Chen, Pan Zhou, Hui Bu, Xin Xu, Binbin Zhang, Zhuo Chen, Jian Wu, Longbiao Wang, Eng Siong Chng, Sun Li:
ICMC-ASR: The ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge. ICASSP Workshops 2024: 63-64
[c294]He Wang, Pengcheng Guo, Pan Zhou, Lei Xie:
MLCA-AVSR: Multi-Layer Cross Attention Fusion Based Audio-Visual Speech Recognition. ICASSP 2024: 8150-8154
[c293]Jixun Yao, Yuguang Yang, Yi Lei, Ziqian Ning, Yanni Hu, Yu Pan, Jingjing Yin, Hongbin Zhou, Heng Lu, Lei Xie:
PromptVC: Flexible Stylistic Voice Conversion in Latent Space Driven by Natural Language Prompts. ICASSP 2024: 10571-10575
[c292]Ziqian Ning, Yuepeng Jiang, Pengcheng Zhu, Shuai Wang, Jixun Yao, Lei Xie, Mengxiao Bi:
DualVC 2: Dynamic Masked Convolution for Unified Streaming and Non-Streaming Voice Conversion. ICASSP 2024: 11106-11110
[c291]Bingshen Mu, Pengcheng Guo, Dake Guo, Pan Zhou, Wei Chen, Lei Xie:
Automatic Channel Selection and Spatial Feature Integration for Multi-Channel Speech Recognition Across Various Array Topologies. ICASSP 2024: 11396-11400
[c290]Ziqian Wang, Xinfa Zhu, Zihan Zhang, Yuanjun Lv, Ning Jiang, Guoqing Zhao, Lei Xie:
SELM: Speech Enhancement using Discrete Tokens and Language Models. ICASSP 2024: 11561-11565
[c289]Hanzhao Li, Xinfa Zhu, Liumeng Xue, Yang Song, Yunlin Chen, Lei Xie:
SponTTS: Modeling and Transferring Spontaneous Style for TTS. ICASSP 2024: 12171-12175
[c288]He Wang, Pengcheng Guo, Xucheng Wan, Huan Zhou, Lei Xie:
Enhancing Lip Reading with Multi-Scale Video and Multi-Encoder. ICME Workshops 2024: 1-6
[c287]Hongfei Xue, Qijie Shao, Kaixun Huang, Peikun Chen, Jie Liu, Lei Xie:
SSHR: Leveraging Self-supervised Hierarchical Representations for Multilingual Automatic Speech Recognition. ICME 2024: 1-6
[c286]Xinfa Zhu, Yuke Li, Yi Lei, Ning Jiang, Guoqing Zhao, Lei Xie:
Boosting Multi-Speaker Expressive Speech Synthesis with Semi-Supervised Contrastive Learning. ICME 2024: 1-6
[c285]Rong Gong, Hongfei Xue, Lezhi Wang, Xin Xu, Qisheng Li, Lei Xie, Hui Bu, Shaomei Wu, Jiaming Zhou, Yong Qin, Binbin Zhang, Jun Du, Jia Bin, Ming Li:
AS-70: A Mandarin stuttered speech dataset for automatic speech recognition and stuttering event detection. INTERSPEECH 2024
[c284]Dake Guo, Xinfa Zhu, Liumeng Xue, Yongmao Zhang, Wenjie Tian, Lei Xie:
Text-aware and Context-aware Expressive Audiobook Speech Synthesis. INTERSPEECH 2024
[c283]Yangze Li, Xiong Wang, Songjun Cao, Yike Zhang, Long Ma, Lei Xie:
A Transcription Prompt-based Efficient Audio Large Language Model for Robust Speech Recognition. INTERSPEECH 2024
[c282]Hanzhao Li, Liumeng Xue, Haohan Guo, Xinfa Zhu, Yuanjun Lv, Lei Xie, Yunlin Chen, Hao Yin, Zhifei Li:
Single-Codec: Single-Codebook Speech Codec towards High-Performance Speech Generation. INTERSPEECH 2024
[c281]Mingshuai Liu, Zhuangqi Chen, Xiaopeng Yan, Yuanjun Lv, Xianjun Xia, Chuanzeng Huang, Yijian Xiao, Lei Xie:
RaD-Net 2: A causal two-stage repairing and denoising speech enhancement network with knowledge distillation and complex axial self-attention. INTERSPEECH 2024
[c280]Linhan Ma, Dake Guo, Kun Song, Yuepeng Jiang, Shuai Wang, Liumeng Xue, Weiming Xu, Huan Zhao, Binbin Zhang, Lei Xie:
WenetSpeech4TTS: A 12,800-hour Mandarin TTS Corpus for Large Speech Generation Model Benchmark. INTERSPEECH 2024
[c279]Linhan Ma, Xinfa Zhu, Yuanjun Lv, Zhichao Wang, Ziqian Wang, Wendi He, Hongbin Zhou, Lei Xie:
Vec-Tok-VC+: Residual-enhanced Robust Zero-shot Voice Conversion with Progressive Constraints in a Dual-mode Training Strategy. INTERSPEECH 2024
[c278]Ziqian Ning, Shuai Wang, Pengcheng Zhu, Zhichao Wang, Jixun Yao, Lei Xie, Mengxiao Bi:
DualVC 3: Leveraging Language Model Generated Pseudo Context for End-to-end Low Latency Streaming Voice Conversion. INTERSPEECH 2024
[c277]Zihan Zhang, Xianjun Xia, Chuanzeng Huang, Yijian Xiao, Lei Xie:
BS-PLCNet 2: Two-stage Band-split Packet Loss Concealment Network with Intra-model Knowledge Distillation. INTERSPEECH 2024
[c276]Alou Diakite, Cheng Li, Lei Xie, Yuanjing Feng, Hua Han, Pei Dong, Shanshan Wang:
Lesen: Label-Efficient Deep Learning for Multi-Parametric MRI-Based Visual Pathway Segmentation. ISBI 2024: 1-5
[c275]Lei Xie, Qingrun Zeng, Huajun Zhou, Guoqiang Xie, Mingchu Li, Jiahao Huang, Jianan Cui, Hao Chen, Yuanjing Feng:
Anatomy-Guided Fiber Trajectory Distribution Estimation for Cranial Nerves Tractography. ISBI 2024: 1-5
[c274]Xuelong Geng, Tianyi Xu, Kun Wei, Bingshen Mu, Hongfei Xue, He Wang, Yangze Li, Pengcheng Guo, Yuhang Dai, Longhao Li, Mingchen Shao, Lei

