

Poster

Knowledge Storage and Extraction in Language Models

Zeyuan Allen-Zhu · Yuanzhi Li


Abstract:

Large language models (LLMs) can store a vast amount of world knowledge, often extractable via question answering (e.g., "What is Abraham Lincoln's birthday?"). However, do they answer such questions based on exposure to similar questions during training (i.e., cheating), or by genuinely learning to extract knowledge from sources like Wikipedia?

In this paper, we investigate this issue using a controlled biography dataset. We find a strong correlation between the model's ability to extract knowledge and various diversity measures of the training data. Essentially, for knowledge to be reliably extracted, it must be sufficiently augmented (e.g., through paraphrasing or sentence shuffling) during pretraining. Without such augmentation, knowledge may be memorized but not extractable, leading to 0% accuracy regardless of subsequent instruction fine-tuning.

To understand why this occurs, we employ (nearly) linear probing to demonstrate a strong connection between the observed correlation and how the model internally encodes knowledge: whether it is linearly encoded in the hidden embeddings of entity names or distributed across other token embeddings in the training text.
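The two techniques named in the abstract can be made concrete with small sketches. First, a rough, hypothetical illustration (not the authors' released code) of the kind of knowledge augmentation described: the same biography facts rewritten with varied templates and shuffled sentence order, while the facts themselves stay fixed. The person, templates, and field names below are invented for illustration.

```python
# Hypothetical sketch of knowledge augmentation for one biography entry.
# Template variation stands in for paraphrasing; the order of sentences is
# shuffled per version. The underlying facts are unchanged across versions.
import random

facts = {"name": "Anya Briar Forger", "birthday": "October 2, 1996",
         "city": "Princeton, NJ", "employer": "Meta Platforms"}

templates = [
    "{name} was born on {birthday}.",
    "{name} spent their early years in {city}.",
    "{name} works for {employer}.",
]

def augmented_biographies(facts, n_versions=3, seed=0):
    rng = random.Random(seed)
    versions = []
    for _ in range(n_versions):
        sentences = [t.format(**facts) for t in templates]
        rng.shuffle(sentences)  # sentence shuffling
        versions.append(" ".join(sentences))
    return versions

for bio in augmented_biographies(facts):
    print(bio)
```

Second, a minimal sketch of the linear-probing idea, assuming hidden states have already been extracted at the entity-name token positions: train a linear classifier to decode an attribute (e.g., birth month) from those vectors, and take held-out accuracy as evidence that the attribute is (nearly) linearly encoded in the name embeddings. Random features stand in for real model activations here so the snippet runs on its own; all sizes are hypothetical.

```python
# Minimal linear-probe sketch. In a real experiment, H would hold the model's
# hidden embeddings at each person's name tokens; random vectors stand in here.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_people, hidden_dim, n_attr_classes = 1000, 768, 12  # hypothetical sizes

H = rng.normal(size=(n_people, hidden_dim))            # name-token embeddings
y = rng.integers(0, n_attr_classes, size=n_people)     # attribute labels to decode

H_train, H_test, y_train, y_test = train_test_split(H, y, test_size=0.2, random_state=0)

# If the attribute is linearly encoded in the name embeddings, a linear probe
# reaches high held-out accuracy; with these random stand-ins it stays near chance.
probe = LogisticRegression(max_iter=1000)
probe.fit(H_train, y_train)
print("probe accuracy:", probe.score(H_test, y_test))
```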
