Naoyuki Kanda

D-Index & Metrics

Computer Science

D-Index: 30
Citations: 4416
World Ranking: 14023
National Ranking: 233

Best Publications

  • WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing

    Sanyuan Chen;Chengyi Wang;Zhengyang Chen;Yu Wu

  • CHiME-6 Challenge: Tackling multispeaker speech recognition for unsegmented recordings

    Shinji Watanabe;Michael Mandel;Jon Barker;Emmanuel Vincent

  • End-to-end neural speaker diarization with permutation-free objectives

    Yusuke Fujita;Naoyuki Kanda;Shota Horiguchi;Kenji Nagamatsu

  • End-to-End Neural Speaker Diarization with Self-Attention

    Yusuke Fujita;Naoyuki Kanda;Shota Horiguchi;Yawen Xue

  • Elastic spectral distortion for low resource speech recognition with deep neural networks

    Unknown

  • CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings

    Shinji Watanabe;Michael Mandel;Jon Barker;Emmanuel Vincent

  • Serialized Output Training for End-to-End Overlapped Speech Recognition

    Naoyuki Kanda;Yashesh Gaur;Xiaofei Wang;Zhong Meng

  • Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition

    Zhong Meng;Sarangarajan Parthasarathy;Eric Sun;Yashesh Gaur

  • Integration of Speech Separation, Diarization, and Recognition for Multi-Speaker Meetings: System Description, Comparison, and Analysis

    Desh Raj;Pavel Denisov;Zhuo Chen;Hakan Erdogan

  • A two-layer model for behavior and dialogue planning in conversational service robots

    M. Nakano;Y. Hasegawa;K. Nakadai;T. Nakamura

  • Microsoft Speaker Diarization System for the Voxceleb Speaker Recognition Challenge 2020

    Xiong Xiao;Naoyuki Kanda;Zhuo Chen;Tianyan Zhou

  • Joint Speaker Counting, Speech Recognition, and Speaker Identification for Overlapped Speech of any Number of Speakers

    Naoyuki Kanda;Yashesh Gaur;Xiaofei Wang;Zhong Meng

  • Multi-Domain Spoken Dialogue System with Extensibility and Robustness against Speech Recognition Errors

    Kazunori Komatani;Naoyuki Kanda;Mikio Nakano;Kazuhiro Nakadai

  • Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR

    Naoyuki Kanda;Christoph Böddeker;Jens Heitkaemper;Yusuke Fujita

  • The Hitachi/JHU CHiME-5 system: Advances in speech recognition for everyday home environments using multiple microphone arrays

    Unknown

  • Serialized Output Training for End-to-End Overlapped Speech Recognition

    Naoyuki Kanda;Yashesh Gaur;Xiaofei Wang;Zhong Meng

  • A Review of Speaker Diarization: Recent Advances with Deep Learning

    Tae Jin Park;Naoyuki Kanda;Dimitrios Dimitriadis;Kyu J. Han

  • Acoustic Modeling for Distant Multi-talker Speech Recognition with Single- and Multi-channel Branches

    Naoyuki Kanda;Yusuke Fujita;Shota Horiguchi;Rintaro Ikeshita

  • Face-Voice Matching using Cross-modal Embeddings

    Unknown

  • Maximum a posteriori Based Decoding for CTC Acoustic Models

    Unknown
