Paris-based open-source generative artificial intelligence startup Mistral AI released another major language model today in an effort to keep pace with the industry’s big boys.
The new Mixtral 8x22B model is expected to outperform the company’s previous Mixtral 8x7B model, which is considered a very worthy competitor to well-known contenders such as OpenAI’s GPT-3.5 and Meta Platforms Inc.’s Llama 2.
According to the startup, which raised $415 million in December and is valued north of $2 billion, the new model is its most powerful yet. It has a 65,000-token context window, which refers to the amount of text it can process and recall at once. In addition, Mixtral 8x22B has a parameter count of up to 176 billion, referring to the number of internal variables the model uses to make decisions and predictions.
Mistral was founded by former AI researchers from Google LLC and Meta, and it’s one of several AI startups focused on creating open-source models that anyone can use. The company took a somewhat unusual approach to releasing the new model, first making it available via a torrent link posted on the social media platform X. It later made Mixtral 8x22B available on the Hugging Face and Together AI platforms, where users can retrain and fine-tune it to handle more specialized tasks.
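For readers who want to experiment, the snippet below is a minimal sketch of pulling the open weights through Hugging Face’s Transformers library. The repository ID follows Mistral’s naming convention for earlier releases and is an assumption, and the full model needs far more memory than consumer hardware offers, so treat it as illustrative rather than a recipe.

```python
# Minimal sketch: loading Mixtral 8x22B via Hugging Face Transformers.
# The repo ID below is an assumption based on Mistral's earlier naming;
# device_map="auto" requires the accelerate package and enough GPU memory
# to hold the full 176B-parameter checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mixtral-8x22B-v0.1"  # assumed Hugging Face repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Generate a short continuation from a prompt.
inputs = tokenizer("Mixtral 8x22B is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```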
The startup released Mixtral 8x22B just days after its competitors delivered their latest models. On Tuesday, OpenAI debuted GPT-4 Turbo with Vision, the latest in its series of GPT-4 Turbo models, whose vision capabilities allow it to work with images, drawings and other user-uploaded visuals. Later that day, Google made its latest LLM, Gemini Pro 1.5, publicly available, giving developers access to a free version that allows up to 50 requests per day.
Not to be outdone, Meta also said this week that it plans to launch Llama 3 later this month.
Mixtral 8x22B is widely expected to surpass Mistral AI’s previous Mixtral 8x7B model, which beat GPT-3.5 and Llama 2 in several key benchmarks.
The model leverages an advanced, sparse “mixture of experts” architecture that enables it to perform efficient computations and deliver strong performance across a wide range of tasks. The sparse MoE approach combines a collection of smaller models, each specializing in different types of tasks, as a way to optimize performance and cost.
“At every layer, for each token, a router network selects two of these groups (the ‘experts’) to process the token and combine their outputs,” Mistral AI says on its website. “This technique increases the number of model parameters while controlling cost and latency, as the model uses only a fraction of the total set of parameters per token.”
The unique architecture means that, although Mixtral 8x22B is very large, it requires only about 44 billion active parameters per forward pass, making it faster and more cost-effective to use than similarly sized dense models.
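To make the routing idea concrete, here’s a minimal sketch of a sparse MoE layer with top-2 routing, in the spirit of the mechanism Mistral describes. The class name, layer sizes and expert design are illustrative assumptions, not Mixtral’s actual implementation.

```python
# Minimal sketch of sparse mixture-of-experts routing with a top-2 router.
# Dimensions are toy-sized for illustration, not Mixtral 8x22B's real config.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseMoELayer(nn.Module):
    def __init__(self, d_model: int, d_ff: int, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        # Each "expert" is an independent feed-forward block.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        # The router scores every expert for every token.
        self.router = nn.Linear(d_model, n_experts)
        self.top_k = top_k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model). Pick the top-k experts for each token.
        scores = self.router(x)                             # (tokens, n_experts)
        weights, indices = scores.topk(self.top_k, dim=-1)  # (tokens, top_k)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        # Only the selected experts run, so most parameters stay inactive.
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = indices[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

layer = SparseMoELayer(d_model=64, d_ff=256)
tokens = torch.randn(10, 64)
print(layer(tokens).shape)  # torch.Size([10, 64])
```

Because only two of the eight experts run for each token, most of the layer’s parameters sit idle on any given forward pass, which is where the cost and latency savings come from.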
The launch of Mixtral 8x22B is therefore an important milestone for open-source generative AI, letting researchers, developers and other enthusiasts experiment with some of the most advanced models without barriers such as limited access and high costs. It’s available for use under the permissive Apache 2.0 license.
The response from the AI community on social media has been largely positive, with enthusiasts hoping it will provide important capabilities for tasks such as customer service, drug discovery and climate modeling.
Despite earning considerable praise for its open-source approach, Mistral AI has also drawn criticism. Skeptics argue that the company’s so-called frontier models could be subject to abuse, and because anyone can download and build on its AI models, the startup has no way to prevent its technology from being used for malicious purposes.
Photo: SiliconANGLE/Microsoft Designer