Meta unveils LLaMA, a family of foundational language models aimed at researchers

Meta says smaller, more performant models like LLaMA can help researchers without large infrastructure study these systems.

Meta today announced LLaMA (Large Language Model Meta AI), a set of foundational language models released in four sizes: 7B, 13B, 33B, and 65B parameters. The 65B and 33B models were trained on 1.4 trillion tokens, and the 7B model on one trillion tokens. Meta says the models were trained with an emphasis on efficiency, so that the smaller ones take less computing power to test and fine-tune.

Unlike many recent high-profile language models, LLaMA is being released under a noncommercial license exclusively for research use. Access is granted on a case-by-case basis to academic researchers, government and civil society affiliates, and industry research labs. Meta frames the gated release as a responsible approach to prevent misuse while still enabling the AI community to study and improve upon the models.

Meta is sharing the code and a model card so researchers can test ways to limit bias, toxic comments, and hallucinations. The company believes that the AI community should work together on responsible AI guidelines.

The record

One year later — open only if you can handle spoilers

About a week after this announcement, LLaMA's weights were leaked through a torrent link posted on 4chan, leading to widespread unauthorized downloads. That leak inadvertently helped ignite the open-weights movement, eventually enabling fine-tuned variants like Alpaca and Vicuna, whose authors reported quality approaching GPT-3.5 in informal evaluations and which could be run on consumer hardware.
