I am a PhD student in Computer Science and Engineering at Shanghai Jiao Tong University, supervised by Prof. Yanmin Qian. I received my Master's and Bachelor's degrees from SJTU as well. I was also an exchange student at Télécom Paris (Institut polytechnique de Paris).

My research interests include multi-modality, text-to-speech, audio generation, speaker verification, and speech enhancement. I have published papers at top venues such as NeurIPS, ICLR, ICASSP, SLT, and Interspeech.

I have interned at Meta and Microsoft, where I work closely with Yao Qian and Bowen Shi. I expect to graduate in 2027 and am actively seeking full-time research positions in industry or academia. Please feel free to reach out at zhangleying@sjtu.edu.cn.

I speak three languages fluently: Chinese, English, and French. Outside of research, I enjoy traveling, music, and movies. I love meeting new people and am happy to grab a coffee and chat!

🔥 News

  • 2026.01: 🎉 One paper accepted by ICLR 2026
  • 2025.10: 🚀 Joined Meta Superintelligence Labs as a Research Intern
  • 2025.09: 🎉 One paper accepted by NeurIPS 2025
  • 2024.12: 🎉 Two papers accepted by ICASSP 2025, received ICASSP 2025 Travel Grant
  • 2024.10: 🚀 Joined Microsoft Core AI as a Research Intern
  • 2024.09: 🎉 One paper accepted by NeurIPS 2024, received NeurIPS 2024 Scholar Award

📝 Publications

🗣️ Dialogue Generation

NeurIPS 2024
CoVoMix

CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations
Leying Zhang, Yao Qian, Long Zhou, Shujie Liu, Dongmei Wang, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Lei He, Sheng Zhao, Michael Zeng

  • First zero-shot multi-talker conversational speech generation system

NeurIPS 2025
CoVoMix2

CoVoMix2: Advancing Zero-Shot Dialogue Generation with Fully Non-Autoregressive Flow Matching
Leying Zhang, Yao Qian, Xiaofei Wang, Manthan Thakker, Dongmei Wang, Jianwei Yu, Haibin Wu, Yuxuan Hu, Jinyu Li, Yanmin Qian, Sheng Zhao

  • Fully non-autoregressive dialogue generation with flow matching
  • Supports zero-shot multi-speaker, multi-turn dialogue generation with fine-grained temporal control
  • Incorporated into Azure TTS product

🎤 Speech Generation (TTS)

Preprint
DeepASMR

DeepASMR: LLM-Based Zero-Shot ASMR Speech Generation for Anyone of Any Voice
Leying Zhang, Tingxiao Zhou, Haiyang Sun, Mengxiao Bi, Yanmin Qian

  • LLM-based zero-shot ASMR speech generation

ICASSP 2025
Advanced TTS

Advanced Zero-Shot Text-to-Speech for Background Removal and Preservation with Controllable Masked Speech Prediction
Leying Zhang, Wangyou Zhang, Zhengyang Chen, Yanmin Qian

  • Controllable background removal and preservation in zero-shot TTS

🔊 Speech Extraction & Enhancement

SLT 2024
DDTSE

DDTSE: Discriminative Diffusion Model for Target Speech Extraction
Leying Zhang, Yao Qian, Linfeng Yu, Heming Wang, Hemin Yang, Shujie Liu, Long Zhou, Yanmin Qian

  • Combined diffusion and discriminative methods for target speech extraction
  • Handles multi- and single-speaker scenarios in both noisy and clean conditions

🔏 Speaker Verification

📦 Other

🎖 Honors and Awards

  • 2025 ICASSP 2025 Travel Grant
  • 2024 NeurIPS 2024 Scholar Award
  • 2022 National Scholarship
  • 2022 First place in CN-Celeb Speaker Recognition Challenge 2022
  • 2021 ISCA and Interspeech Travel Grant
  • 2021 Outstanding Graduates of Shanghai
  • 2021 Outstanding Student Leader of SJTU
  • 2020 Guanghua Scholarship
  • 2019 SJTU Class B Scholarship

📖 Education

  • 2023.09 - Present, PhD in Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai
  • 2021.09 - 2023.06, Master's in Electronic Information, Shanghai Jiao Tong University, Shanghai
  • 2021.09 - 2022.02, Exchange Student in Data Science and Image Processing, Télécom Paris (Institut polytechnique de Paris), France
  • 2017.09 - 2021.06, Bachelor of Information Engineering and French (double degree), Shanghai Jiao Tong University, Shanghai

💻 Industry Experience

  • 2025.10 - 2026.03, Research Intern, Meta Superintelligence Labs, New York, USA
  • 2024.10 - 2025.08, Research Intern, Microsoft Core AI (Remote)
  • 2023.03 - 2024.03, Research Intern, Microsoft Azure Research (Remote)
  • 2022.11 - 2023.03, Research Intern, Microsoft Research Asia, Beijing, China

📚 Teaching

  • Spring 2025, Teaching Assistant - Intelligent Speech Technology
  • Fall 2022, Teaching Assistant - Machine Learning