I am Mohan Shi, a first-year Ph.D. student in Electrical and Computer Engineering at the University of California, Los Angeles (UCLA), where I am fortunate to be advised by Prof. Abeer Alwan (Fellow of IEEE/ISCA/ASA). I received my master's degree from the University of Science and Technology of China (USTC), where I had the pleasure of working under the guidance of Prof. Li-Rong Dai for three years. My research interests span a variety of domains in the world of speech processing:
- Automatic Speech Recognition
- Discrete Speech Units
- Large Language Models
- Cocktail Party Problem
I am eagerly seeking research internship opportunities for the Summer of 2025. Please reach out if you have any leads.
Publications
Advancing Multi-talker ASR Performance with Large Language Models, SLT 2024 [pdf]
Mohan Shi, Zengrui Jin, Yaoxun Xu, Yong Xu, Shi-Xiong Zhang, Kun Wei, Yiwen Shao, Chunlei Zhang, Dong Yu
LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization, Interspeech 2024 (Oral) [pdf]
Zengrui Jin*, Yifan Yang*, Mohan Shi*, Wei Kang, Xiaoyu Yang, Zengwei Yao, Fangjun Kuang, Liyong Guo, Lingwei Meng, Long Lin, Yong Xu, Shi-Xiong Zhang, Daniel Povey
CASA-ASR: Context-Aware Speaker-Attributed ASR, Interspeech 2023 [pdf]
Mohan Shi, Zhihao Du, Qian Chen, Fan Yu, Yangze Li, Shiliang Zhang, Jie Zhang, Li-Rong Dai
Semantic VAD: Low-Latency Voice Activity Detection for Speech Interaction, Interspeech 2023 (Oral) [pdf]
Mohan Shi, Yuchun Shu, Lingyun Zuo, Qian Chen, Shiliang Zhang, Jie Zhang, Li-Rong Dai
A Comparative Study on Multichannel Speaker-Attributed Automatic Speech Recognition in Multi-party Meetings, APSIPA ASC 2023 [pdf]
Mohan Shi, Jie Zhang, Zhihao Du, Fan Yu, Qian Chen, Shiliang Zhang, Li-Rong Dai
The Second Multi-Channel Multi-Party Meeting Transcription Challenge (M2MeT 2.0): A Benchmark for Speaker-Attributed ASR, ASRU 2023 [pdf]
Yuhao Liang, Mohan Shi, Fan Yu, Yangze Li, Shiliang Zhang, Zhihao Du, Lei Xie, Yanmin Qian, Jian Wu, Zhuo Chen, Kong Aik Lee, Zhijie Yan, Hui Bu
Non-autoregressive End-to-End Speaker-Attributed ASR, ASRU 2023 [pdf]
Yangze Li, Fan Yu, Yuhao Liang, Pengcheng Guo, Mohan Shi, Zhihao Du, Shiliang Zhang, Lei Xie
The USTC-NELSLIP offline speech translation systems for IWSLT 2022, IWSLT 2022 [pdf]
Weitai Zhang, Zhongyi Ye, Haitao Tang, Xiaoxi Li, Xinyuan Zhou, Jing Yang, Jianwei Cui, Pan Deng, Mohan Shi, Yifan Song, Dan Liu, Junhua Liu, Lirong Dai
Education
University of California, Los Angeles
- Ph.D. student, Electrical and Computer Engineering, Sep 2024 – Present
- Advisor: Dr. Abeer Alwan (Distinguished Prof. & Vice Chair, Fellow of IEEE/ISCA/ASA)
University of Science and Technology of China
- Master of Engineering, Electronic Engineering and Information Science, Sep 2021 – Jun 2024
- Advisor: Dr. Li-Rong Dai (Deputy Director of NERC-SLIP)
Dalian University of Technology
- Bachelor of Engineering, Electronic Information Engineering, Sep 2017 – Jun 2021
- Rank: 1 / 185
Experience
Tencent AI Lab, Bellevue, USA (remote)
- Research Intern, Dr. Dong Yu's Speech Lab, Sep 2023 – Aug 2024
- Mentor(s): Dr. Yong Xu, Dr. Shi-Xiong (Austin) Zhang
Alibaba Damo Academy, Hangzhou, China
- Research Intern, Speech Group, Jul 2022 – May 2023
- Mentor(s): Dr. Shiliang Zhang