Alibaba's New Qwen2 AI Model Challenges Meta and OpenAI


Alibaba, the Chinese e-commerce giant, is a major player in China's AI sphere. Today, it announced the release of its latest AI model, Qwen2, and by some measures, it's the best open source option out there.

Developed by Alibaba Cloud, Qwen2 is the next generation of the firm's Tongyi Qianwen (Qwen) model series, which includes the Tongyi Qianwen LLM (also known simply as Qwen), the vision AI model Qwen-VL, and Qwen-Audio.

The Qwen model family is pre-trained on multilingual data covering various industries and domains, with Qwen-72B being the most powerful model in the series, trained on an impressive 3 trillion tokens of data. By comparison, Meta's most powerful Llama 2 variant was trained on 2 trillion tokens. Llama 3, however, is in the process of digesting 15 trillion tokens.

According to a recent blog post by the Qwen team, Qwen2 can handle up to 128K tokens of context, comparable to OpenAI's GPT-4o. Qwen2, meanwhile, has outperformed Meta's Llama 3 in essentially all of the most important synthetic benchmarks, with the team claiming it is the best open-source model currently available.

However, it is worth noting that in the independent Elo Arena rankings, Qwen2-72B-Instruct rates slightly better than GPT-4-0314 but below Llama 3 70B and GPT-4-0125-Preview, making it the second-highest-rated open-source LLM among human testers to date.

Qwen2 outperforms Llama3, Mixtral and Qwen1.5 in synthetic benchmarks. Image: Alibaba Cloud

Qwen2 is available in five different sizes, from 0.5 billion to 72 billion parameters, and the release provides significant improvements in various areas of expertise. In addition, the models were trained on data in 27 more languages than the previous release, including German, French, Spanish, Italian, and Russian, in addition to English and Chinese.

“Compared to the latest open-source language models, including the previously released Qwen1.5, Qwen2 has generally surpassed open-source models and demonstrated competitiveness against proprietary models across a series of benchmarks targeting language understanding, language generation, multilingual capability, coding, mathematics, and reasoning,” the Qwen team claimed on the model's official page on HuggingFace.

Qwen2 models also show an impressive understanding of long contexts. Qwen2-72B-Instruct can handle information extraction tasks without errors anywhere within its large context window, and it passes the “needle in a haystack” test almost perfectly. This is important because, traditionally, a model's performance starts to degrade the longer the interaction goes on.

Qwen2 performs outstandingly in the “needle in a haystack” test. Image: Alibaba Cloud
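For readers unfamiliar with the benchmark, a “needle in a haystack” test buries a single target fact inside a long filler document and checks whether the model can retrieve it from any position. The sketch below illustrates the general idea only; the filler text, needle sentence, and `query_model` callback are hypothetical stand-ins, not Alibaba's actual evaluation harness.

```python
# Minimal sketch of a "needle in a haystack" evaluation.
# The needle, filler text, and query_model() are illustrative stand-ins,
# not Alibaba's evaluation code.

NEEDLE = "The secret passphrase is 'blue-lantern-42'."
FILLER = "The quick brown fox jumps over the lazy dog. " * 50  # one block of filler text

def build_haystack(num_chunks: int, needle_depth: float) -> str:
    """Insert the needle at a relative depth (0.0 = start, 1.0 = end) of a long document."""
    chunks = [FILLER] * num_chunks
    chunks.insert(int(needle_depth * num_chunks), NEEDLE)
    return "\n".join(chunks)

def run_test(query_model, num_chunks: int = 200) -> float:
    """Query the model at several needle depths and report retrieval accuracy."""
    depths = [i / 10 for i in range(11)]
    hits = 0
    for depth in depths:
        prompt = build_haystack(num_chunks, depth) + "\n\nWhat is the secret passphrase?"
        answer = query_model(prompt)  # hypothetical call into the model under test
        hits += "blue-lantern-42" in answer
    return hits / len(depths)
```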

With this release, the Qwen team has also changed the licenses of its models. While Qwen2-72B and its instruction-tuned models continue to use the original Qianwen license, all other models have adopted Apache 2.0, a standard in the world of open-source software.

“In the near future, we will continue to open source new models to accelerate open source AI,” Alibaba Cloud said in an official blog post.

Decrypt tested the model and found it quite capable of understanding tasks in multiple languages. The model is also censored, especially on themes considered sensitive in China. This is consistent with Alibaba's claim that Qwen2 is the model least likely to deliver unsafe outputs, whether related to illegal activity, fraud, pornography, or privacy violations, no matter what language the prompt is phrased in.

Qwen2's answer to: “Is Taiwan a country?”

ChatGPT's answer to: “Is Taiwan a country?”

Also, it has a good understanding of system prompts, meaning that conditions applied in them have a stronger effect on its responses. For example, when asked to act as a helpful assistant with knowledge of the law versus acting as an expert lawyer who always answers based on the law, the responses changed significantly. It gave advice similar to that provided by GPT-4o, but more comprehensive.

Qwen2's answer to: “A neighbor insulted me”

ChatGPT's answer to: “A neighbor insulted me”
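In practice, the system-prompt behavior described above corresponds to prepending a message with the `system` role in the chat format that Qwen2's instruction-tuned variants expect. A minimal sketch, assuming the standard Hugging Face chat-message convention; the prompt wording is illustrative, not taken from Decrypt's test:

```python
# Two hypothetical conversations that differ only in their system prompt.
generic_assistant = [
    {"role": "system", "content": "You are a helpful assistant with knowledge of the law."},
    {"role": "user", "content": "A neighbor insulted me. What can I do?"},
]

expert_lawyer = [
    {"role": "system", "content": "You are an expert lawyer. Always answer strictly based on the law."},
    {"role": "user", "content": "A neighbor insulted me. What can I do?"},
]
```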

The next model upgrade will bring multimodality to the Qwen2 LLM, possibly integrating the entire family into one powerful model, the team said. “In addition, we extend the Qwen2 language models to be multimodal, capable of understanding both vision and audio information,” the team added.

Qwen2 is available for online testing through HuggingFace Spaces. Those with enough computing power to run it locally can also download the weights for free via HuggingFace.
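For local use, the instruction-tuned checkpoints can be loaded with the Hugging Face transformers library. A minimal sketch, assuming the `Qwen/Qwen2-7B-Instruct` repository ID and enough GPU memory for the 7B variant; the larger sizes follow the same pattern:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repository ID for the 7B instruction-tuned checkpoint.
model_id = "Qwen/Qwen2-7B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to fit consumer GPUs
    device_map="auto",           # spread layers across available devices
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the Qwen2 release in one sentence."},
]

# Build the chat-formatted prompt and generate a reply.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```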

The Qwen2 model can be a great alternative for those who want to bet on open-source AI. It has a larger token context window than most other models, making it arguably even more capable than Meta's Llama 3. Also, thanks to its license, fine-tuned versions shared by others can build on it, further improving its scores and mitigating bias.

Edited by Ryan Ozawa.

