D-Index & Metrics

Computer Science

D-Index: 37
Citations: 8756
World Ranking: 10518
National Ranking: 4404

Best Publications

  • ESPnet: End-to-end speech processing toolkit

    Shinji Watanabe;Takaaki Hori;Shigeki Karita;Tomoki Hayashi

  • Joint CTC-attention based end-to-end speech recognition using multi-task learning

    Suyoun Kim;Takaaki Hori;Shinji Watanabe

  • Hybrid CTC/Attention Architecture for End-to-End Speech Recognition

    Shinji Watanabe;Takaaki Hori;Suyoun Kim;John R. Hershey

  • A Comparative Study on Transformer vs RNN in Speech Applications

    Shigeki Karita;Nanxin Chen;Tomoki Hayashi;Takaaki Hori

  • Attention-Based Multimodal Fusion for Video Description

    Chiori Hori;Takaaki Hori;Teng-Yok Lee;Ziming Zhang

  • Advances in joint CTC-attention based end-to-end speech recognition with a deep CNN encoder and RNN-LM

    Takaaki Hori;Shinji Watanabe;Yu Zhang;William Chan

  • Efficient WFST-Based One-Pass Decoding With On-The-Fly Hypothesis Rescoring in Extremely Large Vocabulary Continuous Speech Recognition

    Unknown

  • End-to-End Speech Recognition: A Survey

    Unknown

  • Language independent end-to-end architecture for joint language identification and speech recognition

    Shinji Watanabe;Takaaki Hori;John R. Hershey

  • End-to-end Speech Recognition With Word-Based RNN Language Models

    Takaaki Hori;Jaejin Cho;Shinji Watanabe

  • Triggered Attention for End-to-end Speech Recognition

    Niko Moritz;Takaaki Hori;Jonathan Le Roux

  • Open-Vocabulary Spoken Utterance Retrieval using Confusion Networks

    T. Hori;I. L. Hetherington;T. J. Hazen;J. R. Glass

  • Joint CTC/attention decoding for end-to-end speech recognition

    Takaaki Hori;Shinji Watanabe;John R. Hershey

  • Multilingual Sequence-to-Sequence Speech Recognition: Architecture, Transfer Learning, and Language Modeling

    Jaejin Cho;Murali Karthick Baskar;Ruizhi Li;Matthew Wiesner

  • End-to-end Audio Visual Scene-aware Dialog Using Multimodal Attention-based Video Features

    Chiori Hori;Huda Alamri;Jue Wang;Gordon Wichern

  • Duration-Controlled LSTM for Polyphonic Sound Event Detection

    Tomoki Hayashi;Shinji Watanabe;Tomoki Toda;Takaaki Hori

  • Back-Translation-Style Data Augmentation for end-to-end ASR

    Tomoki Hayashi;Shinji Watanabe;Yu Zhang;Tomoki Toda

  • Student-teacher network learning with enhanced features

    Shinji Watanabe;Takaaki Hori;Jonathan Le Roux;John R. Hershey

  • ESPnet: End-to-End Speech Processing Toolkit

    Shinji Watanabe;Takaaki Hori;Shigeki Karita;Tomoki Hayashi

  • Unified Architecture for Multichannel End-to-End Speech Recognition With Neural Beamforming

    Tsubasa Ochiai;Shinji Watanabe;Takaaki Hori;John R. Hershey
