Bozeman, Mont. – Snowflake Inc. (NYSE: SNOW) has announced a major update to its Cortex AI platform, now hosting Meta’s Llama 3.1 collection of multilingual open-source large language models (LLMs). This new offering includes Meta’s most advanced model, Llama 3.1 405B, and is set to significantly improve AI application development for enterprises.
The integration brings Meta’s Llama 3.1 405B to Snowflake’s Cortex AI, allowing businesses to use these models for real-time, high-throughput inference and fine-tuning. Snowflake has optimized the model to support its full 128,000-token context window at launch, and claims up to three times lower latency and 1.4 times higher throughput than existing open-source solutions. Fine-tuning can also now be accomplished on a single GPU node, streamlining workflows and reducing costs.
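In practice, Snowflake exposes hosted models through its Cortex AI SQL functions. The sketch below shows how a completion call might be built for execution in a Snowflake session; the model identifier `llama3.1-405b` and the exact function shape are assumptions based on Snowflake's documented `SNOWFLAKE.CORTEX.COMPLETE` pattern, so consult the official docs before relying on them.

```python
# Hedged sketch: constructing a Cortex AI completion call as a SQL string.
# The model name 'llama3.1-405b' is an assumed identifier, not confirmed
# by this announcement.

def cortex_complete_sql(model: str, prompt: str) -> str:
    """Build the SQL statement for a SNOWFLAKE.CORTEX.COMPLETE call."""
    escaped = prompt.replace("'", "''")  # escape single quotes for SQL
    return f"SELECT SNOWFLAKE.CORTEX.COMPLETE('{model}', '{escaped}')"

sql = cortex_complete_sql("llama3.1-405b", "Summarize last quarter's sales.")
# The statement would then be run through a Snowflake session, e.g.
# session.sql(sql).collect() in Snowpark.
```

Because the model is hosted inside Snowflake's perimeter, the prompt and response stay within the account's governance boundary rather than leaving for a third-party endpoint.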
Vivek Raghunathan, Snowflake’s VP of AI Engineering, emphasized the significance of this development: “Our AI Research Team is setting new standards for how enterprises and the open-source community utilize advanced models like Llama 3.1 405B. We’re not just providing direct access to these cutting-edge models through Cortex AI; we’re also delivering new research and open-source tools to enhance AI capabilities across the board.”
Snowflake’s AI Research Team has also open-sourced its Massive LLM Inference and Fine-Tuning System Optimization Stack. This stack, developed in collaboration with DeepSpeed, Hugging Face, and vLLM, establishes a new benchmark for handling large-scale models. It addresses key challenges such as memory requirements and inference latency, enabling efficient processing on standard hardware and reducing the need for extensive infrastructure.
The optimization stack supports both next-generation and legacy hardware, allowing broader access to powerful AI tools. Data scientists can now fine-tune Llama 3.1 405B using fewer GPUs, making advanced AI applications more accessible and cost-effective for organizations.
In addition to these technical advancements, Snowflake is enhancing its commitment to AI safety with the general availability of Snowflake Cortex Guard. This new feature helps protect against harmful content in LLM applications, leveraging Meta’s Llama Guard 2 to ensure safe and reliable AI deployment.
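Snowflake's documentation describes enabling Cortex Guard through an options argument on the same `COMPLETE` function. The sketch below illustrates that pattern; the `guardrails` option name, the `PARSE_JSON` call shape, and the model identifier are assumptions to verify against the current Cortex docs.

```python
import json

# Hedged sketch: a completion call with Cortex Guard enabled via an
# options object. Option and model names here are assumptions.

def guarded_complete_sql(model: str, prompt: str) -> str:
    """Build a COMPLETE call passing messages and a guardrails option."""
    messages = json.dumps([{"role": "user", "content": prompt}])
    options = json.dumps({"guardrails": True})
    return (
        "SELECT SNOWFLAKE.CORTEX.COMPLETE("
        f"'{model}', PARSE_JSON('{messages}'), PARSE_JSON('{options}'))"
    )

sql = guarded_complete_sql("llama3.1-405b", "Draft a customer reply.")
```

With the option set, responses that Llama Guard 2 classifies as harmful are filtered before they reach the application, which is what lets teams adopt the 405B model without building their own moderation layer.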
Snowflake’s announcement has been met with enthusiasm from its partners and customers. Dave Lindley, Sr. Director of Data Products at E15 Group, remarked, “Access to Meta’s leading Llama models within Snowflake Cortex AI empowers us to extract valuable insights from our data, enhancing our Voice of the Customer platform.”
Ryan Klapper, an AI leader at Hakkoda, highlighted the importance of safety and trust: “Snowflake’s integration of Meta’s models ensures that we can innovate and utilize these powerful LLMs while maintaining high standards of safety and reliability.”
Matthew Scullion, CEO of Matillion, added, “The addition of Llama 3.1 within Snowflake Cortex AI provides our users with even more flexibility and choice, keeping them at the forefront of AI innovation.”
Kevin Niparko, VP of Product and Technology Strategy at Twilio Segment, emphasized the practical benefits: “Snowflake Cortex AI’s support for diverse models enables our customers to generate intelligent insights and activate them effectively, crucial for driving optimal business outcomes.”
By: Montana Newsroom staff