Inside the creation of DBRX, the world’s most powerful open source AI model


This past Monday, about a dozen engineers and executives from data science and AI company Databricks gathered in conference rooms connected via Zoom to learn whether they had succeeded in creating a top-tier artificial intelligence language model. The team had spent months, and about $10 million, training DBRX, a large language model similar in design to the one behind OpenAI’s ChatGPT. But they wouldn’t know how powerful their creation was until the final tests of its abilities came back.

“We’ve outdone everything,” Jonathan Frankel, chief neural network architect at Databricks and leader of the team building DBRX, finally told the group, which responded with sighs, cheers, and clapping emojis. Frankel normally avoids caffeine but was sipping an iced latte after pulling an all-nighter to write up the results.

Databricks will release DBRX under an open source license, allowing others to build on top of its work. Frankel shared data showing the AI model’s ability to answer general knowledge questions, perform reading comprehension, solve vexing logic puzzles, and produce high-quality code. On nearly a dozen benchmarks, DBRX was superior to every other open source model available.

AI decision makers: Jonathan Frankel, Naveen Rao, Ali Ghodsi, and Hanlin Tang. Photo: Gabriela Hasbin

It beats Meta’s Llama 2 and Mistral’s Mixtral, two of the most popular open source AI models available today. “Yes!” Databricks CEO Ali Ghodsi exclaimed as the scores came out. “Wait, did we beat Elon’s thing?” Frankel replied that they had indeed outperformed the Grok AI model recently open-sourced by Musk’s xAI, adding, “I’ll call it a success if we get even so much as a tweet out of him.”

To the team’s surprise, DBRX also came remarkably close on many scores to GPT-4, OpenAI’s closed model that powers ChatGPT and is widely considered the pinnacle of machine intelligence. “We set a new state of the art for open source LLMs,” Frankel said with a big smile.

Building blocks

By open sourcing DBRX, Databricks is adding to a movement that is challenging the secretive approach of the most prominent companies in the current generative AI boom. OpenAI and Google closely guard the code for their GPT-4 and Gemini large language models, but some competitors, notably Meta, have released their models for others to use, arguing that doing so will spur innovation by putting the technology in the hands of more researchers, entrepreneurs, startups, and established businesses.

Databricks says it’s also willing to open up about the work involved in creating its open source model, something Meta hasn’t done for some key details about the creation of its Llama 2 model. The company will release a blog post detailing the effort that went into building the model, and it also invited WIRED to spend time with Databricks engineers as they made important decisions during the final stages of the multimillion-dollar process of training DBRX. That access provided a glimpse of how complex and challenging it is to build a leading AI model, but also of how recent innovations in the field promise to lower costs. This, combined with the availability of open source models like DBRX, suggests that AI development isn’t going to slow down anytime soon.

Ali Farhadi, CEO of the Allen Institute for AI, says there is a dire need for more transparency around building and training AI models. The field has become increasingly secretive in recent years as companies have sought to gain an edge over competitors. Transparency is especially important when there are concerns about the risks that advanced AI models may pose, he says. “I’m very happy to see any effort in openness,” says Farhadi. “I believe a significant part of the market will move towards open models. We need more of that.”
