Preparing LLaMA 2 for launch required many changes to make the model safer and less likely to spread toxic falsehoods than its predecessor, says Al-Dahle.
Meta has many past gaffes to learn from. Its language model for science, Galactica, was taken offline after just three days, and its previous model, LLaMA, meant only for research purposes, was leaked online, prompting criticism from politicians who questioned whether Meta was adequately accounting for the risks associated with AI language models, such as misinformation and harassment.
To avoid repeating those mistakes, Meta applied a mix of machine learning techniques aimed at improving both utility and safety.
Meta’s approach to training LLaMA 2 involved more phases than usual for generative AI models, says Sasha Luccioni, a researcher at AI startup Hugging Face.
The model was trained on 40% more data than its predecessor. Al-Dahle says there were two sources of training data: data scraped from the web, and a dataset fine-tuned using feedback from human annotators to steer the model toward more desirable behavior. The company says it did not use Meta user data in LLaMA 2 and excluded data from sites it knew contained a lot of personal information.
Despite this, LLaMA 2 still spews offensive, harmful, and otherwise problematic language, just like rival models. Meta says it did not remove toxic data from the dataset, because leaving it in could help LLaMA 2 better detect hate speech, and removing it could risk inadvertently filtering out certain demographic groups.
Still, Meta’s commitment to openness is exciting, Luccioni says, because it allows researchers like her to properly study the biases, ethics and efficiency of AI models.
The fact that LLaMA 2 is an open-source model will also allow outside researchers and developers to probe it for safety flaws, which will make it safer than proprietary models, says Al-Dahle.
Liang agrees. “I’m very excited to try things out and I think it will be beneficial to the community,” he says.