Michael Cafarella

D-Index & Metrics

Computer Science

D-Index: 43
Citations: 10852
World Ranking: 7830
National Ranking: 3388

Research.com Recognitions

  • 2016 - Fellow of the Alfred P. Sloan Foundation

Overview

Michael Cafarella is affiliated with MIT in the United States and works in computer science across several subfields. Most of their indexed publications appear on arXiv (Cornell University) and in the Proceedings of the VLDB Endowment.

The main field of study for Michael Cafarella is Computer Science, with 83 publications attributed to this area. Their work spans several subfields, including:

  • Artificial Intelligence (26 publications)
  • Computer Networks and Communications (25 publications)
  • Computer Vision and Pattern Recognition (15 publications)
  • Information Systems (10 publications)
  • Management Science and Operations Research (6 publications)

Michael Cafarella's research emphasizes topics related to Cloud Computing and Resource Management, Distributed Systems and Fault Tolerance, Data Quality and Management, Advanced Data Storage Technologies, Topic Modeling, Scientific Computing and Data Management, and Bayesian Modeling and Causal Inference.

Their recent notable papers include:

  • "Unnatural Language Processing: Bridging the Gap Between Synthetic and Natural Language Data" (2020, arXiv (Cornell University))
  • "Leveraging Noisy Lists for Social Feed Ranking" (2021, Proceedings of the International AAAI Conference on Web and Social Media)
  • "DBOS" (2021, Proceedings of the VLDB Endowment)
  • "Causal Data Integration" (2023, Proceedings of the VLDB Endowment)

Frequent co-authors include:

  • Samuel Madden (11 collaborations)
  • Michael Stonebraker (10 collaborations)
  • Brit Youngmann (8 collaborations)
  • Tim Kraska (7 collaborations)
  • Matei Zaharia (7 collaborations)

Michael Cafarella's work is most often published in the following venues:

  • arXiv (Cornell University) with 20 publications
  • Proceedings of the VLDB Endowment with 11 publications
  • Proceedings of the ACM on Management of Data with 3 publications
  • SSRN Electronic Journal with 2 publications
  • AI Magazine with 2 publications

Michael Cafarella was named a Fellow of the Alfred P. Sloan Foundation in 2016.

Best Publications

  • Unsupervised named-entity extraction from the Web: An experimental study

    Oren Etzioni;Michael Cafarella;Doug Downey;Ana-Maria Popescu

  • Open information extraction from the web

    Michele Banko;Michael J. Cafarella;Stephen Soderland;Matt Broadhead

  • Web-scale information extraction in KnowItAll: (preliminary results)

    Oren Etzioni;Michael Cafarella;Doug Downey;Stanley Kok

  • WebTables: exploring the power of tables on the web

    Michael J. Cafarella;Alon Halevy;Daisy Zhe Wang;Eugene Wu

  • TextRunner: Open Information Extraction on the Web

    Alexander Yates;Michele Banko;Matthew Broadhead;Michael Cafarella

  • Theoretical Limits of Hydrogen Storage in Metal–Organic Frameworks: Opportunities and Trade-Offs

    Jacob Goldsmith;Antek G. Wong-Foy;Michael J. Cafarella;Donald J. Siegel

  • Using Social Media to Measure Labor Market Flows

    Dolan Antenucci;Michael Cafarella;Margaret Levenstein;Christopher Ré

  • Data integration for the relational web

    Michael J. Cafarella;Alon Halevy;Nodira Khoussainova

  • Automatic optimization for MapReduce programs

    Eaman Jahani;Michael J. Cafarella;Christopher Ré

  • Ontology-driven information extraction with OntoSyphon

    Luke K. McDowell;Michael Cafarella

  • Uncovering the Relational Web

    Michael J. Cafarella;Alon Y. Halevy;Yang Zhang;Daisy Zhe Wang

  • Machine reading

    Oren Etzioni;Michele Banko;Michael J. Cafarella

  • Methods for domain-independent information extraction from the web: an experimental comparison

    Oren Etzioni;Michael Cafarella;Doug Downey;Ana-Maria Popescu

  • KnowItNow: Fast, Scalable Information Extraction from the Web

    Michael J. Cafarella;Doug Downey;Stephen Soderland;Oren Etzioni

  • Brainwash: A data system for feature engineering

    Michael R. Anderson;Dolan Antenucci;Victor Bittorf;Matthew Burgess

  • Building Nutch: Open Source Search: A case study in writing an open source search engine

    Mike Cafarella;Doug Cutting

  • A search engine for natural language applications

    Michael J. Cafarella;Oren Etzioni

  • Structured data on the web

    Michael J. Cafarella;Alon Halevy;Jayant Madhavan

  • Foofah: Transforming Data By Example

    Zhongjun Jin;Michael R. Anderson;Michael Cafarella;H. V. Jagadish

  • Structured querying of web text

    Michael J. Cafarella;Christopher Ré;Dan Suciu;Oren Etzioni

  • Methods for domain-independent information extraction from the web

    Oren Etzioni;Michael Cafarella;Doug Downey;Ana-Maria Popescu

Frequent Co-Authors

  • Christopher Ré (Stanford University)
  • Oren Etzioni (University of Washington)
  • Hosagrahar V. Jagadish (University of Michigan–Ann Arbor)
  • Alon Halevy (Facebook, United States)
  • Stephen Soderland (University of Washington)
  • Thomas F. Wenisch (University of Michigan–Ann Arbor)
  • Jayant Madhavan (Google, United States)
  • Daniel S. Weld (University of Washington)
