Feb 24 (Reuters) – Meta Platforms Inc (META.O) on Friday released to researchers a new large language model, the core software of an artificial intelligence system, heating up an AI arms race as big tech companies rush to integrate the technology into their products and attract investors.
The public battle to dominate the AI technology space began late last year with the launch of Microsoft-backed OpenAI’s ChatGPT and has prompted tech heavyweights from Alphabet Inc (GOOGL.O) to China’s Baidu Inc (9888.HK) to unveil their own offerings.
Meta’s LLaMA, short for Large Language Model Meta AI, is available under a non-commercial license to researchers and organizations affiliated with government, civil society, and academia, the company said in a blog post.
Large language models mine vast amounts of text to summarize information and generate content. They can answer questions, for example, with sentences that read as if they were written by humans.
The model, which Meta said requires “much less” computing power than previous offerings, was trained in 20 languages with Latin and Cyrillic alphabets.
“Meta’s announcement today appears to be a step forward in testing their creative AI capabilities so they can implement them into their products in the future,” said Gil Luria, senior software analyst at DA Davidson.
“Generative AI is a new application of AI that Meta has little experience with, but which is clearly important to the future of their business.”
AI has emerged as a bright spot for investments in the technology sector, whose slowing growth has prompted widespread layoffs and a cutback in experimental races.
Meta said LLaMA could outperform competing models that examine more parameters, the variables the algorithm takes into account.
In particular, it said that a version of LLaMA with 13 billion parameters can outperform GPT-3, a recent predecessor of the model on which ChatGPT was built.
It described its 65-billion-parameter LLaMA model as “competitive” with DeepMind’s Chinchilla70B and Google’s PaLM-540B, which are larger than the model Google used to demonstrate its Bard chat-powered search.
A Meta spokesperson attributed the performance to a large amount of “cleaner” data and “architectural improvements” to the model that improved training consistency.
In May last year, Meta released OPT-175B, a large language model aimed at researchers, which formed the basis of a new iteration of its chatbot BlenderBot.
It later introduced a model called Galactica, which could write scientific papers and solve math problems, but quickly pulled the demo after it produced official-sounding but incorrect answers.
Reporting by Yuvraj Malik and Eva Mathews in Bangalore and Katie Paul in New York; Editing by Shailesh Kuber and Grant McCool