D-Index & Metrics

Computer Science

D-Index: 36
Citations: 4083
World Ranking: 11403
National Ranking: 1409

Best Publications

  • Language Is Not All You Need: Aligning Perception with Language Models

    Unknown

  • Kosmos-2: Grounding Multimodal Large Language Models to the World

    Unknown

  • DeepNet: Scaling Transformers to 1,000 Layers

    Unknown

  • Global Encoding for Abstractive Summarization

    Unknown

  • The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits

    Unknown

  • Retentive Network: A Successor to Transformer for Large Language Models

    Unknown

  • Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers

    Unknown

  • meProp: Sparsified Back Propagation for Accelerated Deep Learning with Reduced Overfitting

    Xu Sun;Xuancheng Ren;Shuming Ma;Houfeng Wang

  • Bag-of-Words as Target for Neural Machine Translation

    Unknown

  • Language Models are General-Purpose Interfaces

    Unknown

  • Alternating Language Modeling for Cross-Lingual Pre-Training

    Jian Yang;Shuming Ma;Dongdong Zhang;ShuangZhi Wu

  • XLM-E: Cross-lingual Language Model Pre-training via ELECTRA

    Zewen Chi;Shaohan Huang;Li Dong;Shuming Ma

  • Query and Output: Generating Words by Querying Distributed Word Representations for Paraphrase Generation

    Shuming Ma;Xu Sun;Wei Li;Sujian Li

  • Improving Semantic Relevance for Sequence-to-Sequence Learning of Chinese Social Media Text Summarization

    Shuming Ma;Xu Sun;Jingjing Xu;Houfeng Wang

  • LongNet: Scaling Transformers to 1,000,000,000 Tokens

    Unknown

  • A Simple and Effective Unified Encoder for Document-Level Machine Translation

    Shuming Ma;Dongdong Zhang;Ming Zhou

  • A Deep Reinforced Sequence-to-Set Model for Multi-Label Text Classification

    Unknown

  • Semantic-Unit-Based Dilated Convolution for Multi-Label Text Classification

    Unknown
