Leying Zhang

PhD Student in Computer Science and Engineering

Shanghai Jiao Tong University

📧 zhangleying@sjtu.edu.cn

📱 (+86) 18621098717

📍 3-520 SEIEE Building, Shanghai Jiao Tong University, Shanghai, China 200240

🔗 Google Scholar • LinkedIn

About Me

I am a third-year PhD student in Computer Science and Engineering at Shanghai Jiao Tong University, supervised by Prof. Yanmin Qian. My research focuses on audio and speech technologies, including zero-shot text-to-speech, dialogue generation, and speaker verification.

Research Interests: Text-to-Speech, Multi-modality, Audio Generation, Speaker Verification

Education

PhD, Computer Science and Engineering

Sep 2023 - Present

Shanghai Jiao Tong University

Supervisor: Prof. Yanmin Qian

Master, Electronic Information

Sep 2021 - Jun 2023

Shanghai Jiao Tong University

Supervisor: Prof. Yanmin Qian

Exchange Student, Data Science and Image Processing

Sep 2021 - Feb 2022

Télécom Paris (Institut polytechnique de Paris)

Bachelor of Information Engineering and French (Double Degree)

Sep 2017 - Jun 2021

Shanghai Jiao Tong University

Selected Publications

Leying Zhang, Yao Qian, Xiaofei Wang, Manthan Thakker, Dongmei Wang, Jianwei Yu, Haibin Wu, Yuxuan Hu, Jinyu Li, Yanmin Qian, Sheng Zhao

"CoVoMix2: Advancing Zero-Shot Dialogue Generation with Fully Non-Autoregressive Flow Matching"

NeurIPS, Dec. 2025 • PDF

Leying Zhang, Wangyou Zhang, Zhengyang Chen, Yanmin Qian

"Advanced Zero-Shot Text-to-Speech for Background Removal and Preservation with Controllable Masked Speech Prediction"

ICASSP, April 2025 • PDF

Leying Zhang, Yao Qian, Long Zhou, Shujie Liu, Dongmei Wang, Xiaofei Wang, Midia Yousefi, Yanmin Qian, Jinyu Li, Lei He, Sheng Zhao, Michael Zeng

"CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations"

NeurIPS, Dec. 2024 • PDF

Leying Zhang, Yao Qian, Linfeng Yu, Heming Wang, Hemin Yang, Shujie Liu, Long Zhou, Yanmin Qian

"DDTSE: Discriminative Diffusion Model for Target Speech Extraction"

SLT, Dec. 2024 • PDF

Leying Zhang, Zhengyang Chen, Yanmin Qian

"Adaptive Large Margin Fine-tuning for Speaker Verification"

ICASSP, June 2023

Leying Zhang*, Zhengyang Chen*, Yanmin Qian (* equal contribution)

"Enroll-Aware Attentive Statistics Pooling for Target Speaker Verification"

InterSpeech, Sep. 2022

Leying Zhang, Zhengyang Chen, Yanmin Qian

"Knowledge Distillation from Multi-Modality to Single-Modality for Person Verification"

InterSpeech, Sep. 2021

See the CV for a complete list of publications, including collaborative works.

Industry Experience

Research Intern - Meta Superintelligence Labs

Oct 2025 - Present

Location: New York, USA

Supervisor: Bowen Shi

Research Intern - Microsoft Core AI

Oct 2024 - Aug 2025

Location: Remote

Supervisor: Yao Qian

Project: Text-to-Dialogue Generation - Designed and implemented a purely non-autoregressive dialogue generation framework that supports zero-shot, multi-speaker, multi-turn generation with fine-grained temporal control. The system has been incorporated into the Azure TTS product.

Research Intern - Microsoft Azure Research

Mar 2023 - Mar 2024

Location: Remote

Supervisor: Yao Qian

Projects:

  • Target speech extraction: Investigated diffusion-based models for target speech extraction. Proposed an efficient approach that combines diffusion and discriminative methods to handle both multi-speaker and single-speaker scenarios in noisy and clean conditions.
  • Text-to-Dialogue Generation: Investigated Conversational Voice Mixture Generation (CoVoMix), a novel model for zero-shot, human-like, multi-speaker, multi-round dialogue speech generation.

Research Intern - Microsoft Research Asia

Nov 2022 - Mar 2023

Location: Beijing, China

Supervisor: Xu Tan

Projects:

  • Audio generation: Implemented a vector-quantized diffusion model with classifier-free guidance, achieving a 10% improvement over the baseline. Investigated the effects of latent diffusion models by fine-tuning Stable Diffusion.
  • Text-to-speech: Applied a vector-quantized diffusion model to text-to-speech on a large-scale dataset with different neural audio codecs. Generated high-quality speech and improved zero-shot text-to-speech performance.

Teaching Experience

Teaching Assistant - Intelligent Speech Technology

Spring 2025

Shanghai Jiao Tong University

Teaching Assistant - Machine Learning

Fall 2022

Shanghai Jiao Tong University

Honors and Awards

ICASSP 2025 Travel Grant • 2025
NeurIPS 2024 Scholar Award • 2024
National Scholarship • 2022
First Place - CN-Celeb Speaker Recognition Challenge 2022 • 2022
ISCA and Interspeech Travel Grant • 2021
Outstanding Graduates of Shanghai • 2021
Outstanding Student Leader of SJTU • 2021
Guanghua Scholarship • 2020
SJTU Class B Scholarship • 2019

Skills

Languages

  • Chinese - Native
  • English - Professional
  • French - Professional

Interests

  • Badminton
  • Yoga