The arms race between technology companies regarding AI models has heated up significantly in the past two weeks. OpenAI revealed GPT-4o shortly before Google announced a number of Gemini improvements. Microsoft then announced a group of Copilot+ PCs as well as AI improvements.
But while all this was going on, Meta was taking care of his AI business. The company quietly released a research paper on its efforts in the multimodal AI space. Spotted by VentureBeat, the paper shows Meta is working on a multi-modal large language model called Chameleon.
This is not to be confused with the generative AI model CM3leon (pronounced chameleon) that Meta AI revealed last summer. Meta AI said in its blog post that the CM3leon model will lead to improvements in future LLMs.
The research paper claims that the Chameleon is state-of-the-art and beats or equals other models like the Gemini, GPT-4 and Meta's own Llama-2.
Like Google's Gemini, Chameleon is built on an “early fusion token-based mixed-modal” architecture. This means that the model was built from scratch to learn from a combination of images, code, text, and other input, and it uses that content to generate sequences.
Another way to create a multimodal architecture is to stitch together multiple models that are trained on the same model. This is called “late fusion”. In essence, AI systems take individual models and combine them together to make predictions. Late fusion apparently works well but can limit the AI's ability to integrate information.
In the paper, the authors say that Chameleon is more similar to Google's Gemini, which was designed in a similar fashion. Unlike Gemini, though, researchers say Chameleon is an end-to-end model.
If the paper's claims and testing are correct and can be replicated, the Chameleon model appears to match or exceed many AI models available today.
An interesting twist to the claims made about Chameleon, is that Mark Zuckerberg and Meta have been pushing open source as the future for much of this year. The existing Llama 3 is open source and is expected to receive a major update in July. Meta also opened up the Quest headset operating system to hardware manufacturers.
The paper does not indicate when Meta might release this new model. Publicly, Meta has been hard at work on the latest iteration of the Lama Assistant. Llama 3 just went live on Facebook, Instagram and WhatsApp in late April.