Luke Zettlemoyer
Nonparametric Language Models: Trading Data for Parameters (and Compute) in Large Language Models
Large language models (LLMs) such as ChatGPT have taken the world by storm, but they are incredibly expensive to train, requiring significant amounts of data and computational resources. They also hallucinate, e.g. by regularly introducing made-up facts, and are difficult to keep up to date over time as the world around them changes. In this talk, I will survey some of our recent work on nonparametric and retrieval-based language models, which are instead designed to be easily extensible and to provide much more careful provenance for their predictions. The key idea is to trade parameters for data: rather than attempting to memorize all the world's facts and knowledge in the learned parameters of a single monolithic LM, we instead provide the model with an explicit knowledge store (e.g. a collection of web pages from Wikipedia) that it can use to look up information in real time. This is a new area where best practices are still forming, but I will argue that retrieval augmentation is a very general idea that can lead to much more efficient training, can provide fundamentally new insights into how LLMs work, and is broadly applicable to a range of settings, including e.g. text-to-image generation. I will also provide, to the best of my ability, a guess about where things are going and what it would take to convince every major LLM to go nonparametric in the near future.
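The central mechanism in the abstract, looking up information from an explicit knowledge store at inference time rather than relying on facts memorized in model parameters, can be illustrated with a minimal sketch. The toy knowledge store, bag-of-words retriever, and prompt format below are illustrative assumptions only, not the systems discussed in the talk; real retrieval-augmented LMs use learned dense retrievers over large corpora such as Wikipedia.

# Minimal sketch of retrieval augmentation (illustrative assumptions only):
# retrieve passages from an explicit knowledge store and condition the LM on
# them, so facts come from the datastore rather than from model parameters.
from collections import Counter
import math

# Tiny stand-in for a real knowledge store (e.g. Wikipedia passages).
knowledge_store = [
    "Wikipedia is a free online encyclopedia maintained by volunteers.",
    "Retrieval-augmented language models query an external datastore at inference time.",
    "Nonparametric language models grow their effective capacity with the size of the datastore.",
]

def bow(text):
    # Bag-of-words vector represented as a token-count dictionary.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, k=2):
    # Score every passage against the query and keep the top k.
    ranked = sorted(knowledge_store, key=lambda p: cosine(bow(query), bow(p)), reverse=True)
    return ranked[:k]

def build_prompt(query):
    # Prepend retrieved passages so the model can ground its answer in them
    # (and cite them as provenance) instead of relying on memorized facts.
    passages = retrieve(query)
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

print(build_prompt("What is a nonparametric language model?"))

Updating such a model then amounts to editing or extending the knowledge store, rather than retraining the parameters of a monolithic LM.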