Last Updated on February 29, 2024 9:34 am by Laszlo Szabo / NowadAIs | Published on February 29, 2024 by Juhasz “the Mage” Gabor
Meta’s Llama 3 in July: All You Need to Know About Zuckerberg’s New AI Model – Key Notes:
- Meta, led by CEO Mark Zuckerberg, announced the training of Llama 3, a new large language model (LLM).
- Llama 3 follows the release of Llama 1 and Llama 2, with a focus on open sourcing the models.
- No release date yet, but based on previous timelines, Llama 3 could debut around July 2024.
Meta’s Llama 3’s Background
In January 2024, Mark Zuckerberg, the CEO of Meta, shared in an Instagram video that the company’s AI division had recently begun training Llama 3. This new generation of the Llama family of large language models (LLMs) follows the release of Llama 1 models (initially stylized as “LLaMA”) in February 2023 and Llama 2 models in July 2023 – according to The Information.
While details such as model sizes and multimodal capabilities have not yet been disclosed, Zuckerberg stated that Meta plans to continue open sourcing the Llama foundation models:
When will Llama 3 be available?
Although there is no official release date, it’s worth noting that it took three months to train Llama 1 and six months to train Llama 2.
If the next generation of models follows a similar timeline, Llama 3 could potentially be released around July 2024. However, Meta may allocate extra time for fine-tuning and ensuring proper model alignment.
Increasing access to generative AI models empowers not only enterprises but also startups, researchers, and hobbyists.
As open source models become more powerful, it’s crucial to reduce the risk of malicious use by bad actors.
In his announcement video, Zuckerberg reiterated Meta’s commitment to training models responsibly and safely.
Will Llama 3 be open source?
While Meta provided access to Llama 1 models to research institutions for noncommercial use on a case-by-case basis, Llama 2 code and model weights were released under an open license allowing commercial use by organizations with fewer than 700 million monthly active users.
Although there is debate over whether Llama 2’s license meets the strict technical definition of “open source,” it is generally referred to as such.
There is no evidence suggesting that Llama 3 will be released any differently.
Will Llama 3 be multimodal?
An emerging trend in AI is multimodal models, which can understand and operate across different data formats or modalities.
Rather than creating separate models for text, code, audio, images, or videos, new state-of-the-art models, such as Google’s Gemini and OpenAI’s GPT-4V, and open source models like LLaVa, or Qwen-VL, can seamlessly move between computer vision and natural language processing tasks.
While Zuckerberg has confirmed that Llama 3, like Llama 2, will have code-generating capabilities, he did not explicitly mention other multimodal capabilities.
However, in his Llama 3 announcement video, Zuckerberg discussed how he envisions AI intersecting with the Metaverse. This suggests that Meta’s plans for the Llama models include integrating visual and audio data alongside text and code, in line with its stated goal of achieving AGI.
How will Llama 3 compare to Llama 2?
Zuckerberg also announced significant investments in training infrastructure. By the end of 2024, Meta aims to have approximately 350,000 NVIDIA H100 GPUs, which, combined with the GPUs it already owns, would bring its total available compute to “600,000 H100 equivalents of compute.”
Currently, only Microsoft has a comparable stockpile of computing power. Therefore, it’s reasonable to expect that Llama 3 will offer significant performance enhancements compared to Llama 2 models, even if the sizes are similar.
As suggested by a March 2022 paper from DeepMind (the “Chinchilla” paper), training smaller models on more data yields better performance than training larger models on less data.
While Llama 2 was available in the same sizes as Llama 1, it was pre-trained on 40% more data.
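The DeepMind result is often summarized as a rule of thumb of roughly 20 training tokens per model parameter for compute-optimal training. A minimal sketch of that heuristic (the helper name is hypothetical, and the ratio is an approximation, not an exact prescription):

```python
def chinchilla_optimal_tokens(params: int, tokens_per_param: int = 20) -> int:
    """Rough compute-optimal training-token budget per the Chinchilla heuristic."""
    return params * tokens_per_param

# Llama-2-style model sizes: 7B and 70B parameters
for p in (7_000_000_000, 70_000_000_000):
    print(f"{p // 10**9}B params -> ~{chinchilla_optimal_tokens(p) // 10**9}B tokens")
# prints:
# 7B params -> ~140B tokens
# 70B params -> ~1400B tokens
```

Notably, Meta trained Llama 2 well past this budget (about 2 trillion tokens), spending extra training compute to get stronger models at a given size.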
Although the sizes of Llama 3 models have not yet been announced, it’s likely they will span a similar 7- to 70-billion-parameter range while delivering higher performance, as seen across previous generations.
Meta’s recent investments in infrastructure should enable more robust pre-training for models of any size. Additionally, Llama 2 doubled the context length of Llama 1 (from 2,048 to 4,096 tokens), meaning it can “remember” twice as many tokens during inference.
Definitions:
- Llama 3: The third iteration of Meta’s large language model family, focusing on advanced AI capabilities and potential multimodal applications.
- Meta: The parent company of Facebook, Instagram, and WhatsApp, focusing on bringing people together through technology and leading innovation in AI.
- GPU (Graphics Processing Unit): A specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate the creation of images and calculations in a frame buffer intended for output to a display device.
Frequently Asked Questions:
- What is Meta’s Llama 3?
Llama 3 is the latest large language model developed by Meta, expected to push the boundaries of AI with advanced capabilities.
- When will Llama 3 be released?
While there’s no set release date, Meta’s timeline suggests a potential debut around July 2024.
- Will Llama 3 be open source?
Based on Meta’s commitment to open licenses, Llama 3 is anticipated to follow an open-source model.
- What new features can we expect in Llama 3?
Llama 3 is expected to include multimodal capabilities, enhancing its integration with visual and audio data.
- How does Llama 3 compare to its predecessors?
With significant infrastructure investments, Llama 3 is expected to offer notable performance enhancements over Llama 2.