But what really strikes me is the extent to which Meta is throwing its doors wide open. It will allow the wider AI community to download the model and modify it. This could help make it safer and more efficient. And especially, could demonstrate the benefits of transparency over secrecy when it comes to the inner workings of AI models. This couldn’t be timelier or more important.
Tech companies are rushing to release their AI models, and we’re seeing generative AI embedded in more and more products. But the most powerful models out there, like OpenAI’s GPT-4, are tightly protected by their creators. Developers and researchers pay for limited access to these models through a website and don’t know the details of their inner workings.
This opacity could lead to long-term problems, as highlighted in a new, document not peer-reviewed which caused some buzz last week. Researchers at Stanford University and UC Berkeley found that GPT-3.5 and GPT-4 performed worse at solving math problems, answering sensitive questions, generating code, and reasoning visually than they did a couple of months earlier.
The lack of transparency of these models makes it hard to say exactly why that might be, but regardless, the findings should be taken with a grain of salt, Princeton computer science professor Arvind Narayanan writes in his assessment. They’re more likely caused by « author evaluation quirks » than evidence that OpenAI made the models worse. He thinks the researchers didn’t take into account that OpenAI fine-tuned the models to work better and unintentionally caused some suggestion techniques to crash like they used to do in the past.
This has some serious implications. Companies that have built and optimized their products to work with a certain iteration of OpenAI models could « 100% » see them suddenly buggy and break, says Sasha Luccioni, an AI researcher at startup Hugging Face. When OpenAI refines its models in this way, products that were built using very specific prompts, for example, may stop working as before. Closed models lack accountability, she adds. « If you have a product and you change something in the product, you should tell your customers. »
An open model like LLaMA 2 will at least clarify how the company designed the model and what training techniques it used. Unlike OpenAI, Meta shared the entire recipe for LLaMA 2, including details on how it was trained, what hardware was used, how the data was annotated, and what techniques were used to mitigate the damage. The people researching and building products based on the model know exactly what they’re working on, says Luccioni.
« Once you have access to the model, you can do all kinds of experiments to make sure you’re getting better performance or less bias, or whatever you’re looking for, » he says.
Ultimately, the open versus closed debate around AI boils down to who calls the shots. With open templates, users have more power and control. With models closed, you are at the mercy of their creator.
For a large company like Meta to release such an open and transparent AI model seems like a potential game changer in the Generative AI gold rush.