Magdalena Jackiewicz
Editorial Expert
Reviewed by a tech expert

Snowflake Arctic: a groundbreaking release with unprecedented advantages for enterprise AI

#Sales

Snowflake has recently announced the release of Arctic, its state-of-the-art Large Language Model (LLM). What is an LLM? What makes Snowflake Arctic truly groundbreaking? What does this specific release mean for enterprises?

What is a Large Language Model (LLM)?

Large language models are artificial intelligence systems that are trained on massive amounts of text data (typically in the range of billions of words). Such a training process allows the model to analyze the patterns, structure, and semantics of human language at a very deep level.

The core technical innovation behind LLMs is the use of deep learning, a type of machine learning that relies on artificial neural networks to learn from large amounts of data. LLMs typically use transformer-based architectures, which have proven to be highly effective at capturing the complex relationships and patterns within languages.

The key characteristics of LLMs include:

  • Scale: LLMs are trained on an unprecedented scale, with datasets that are orders of magnitude larger than those used for traditional language models. This scale allows them to capture a more comprehensive understanding of language.
  • Generalization: LLMs are designed to be able to generalize their language understanding beyond the specific data they were trained on. This allows them to be applied to a wide range of language-related tasks, from content generation to question answering.
  • Contextual understanding: capable of understanding language in context, LLMs take into account the surrounding words, phrases, and even broader context to interpret the meaning and intent behind the text.
  • Versatility: LLMs can be adapted to specific tasks and domains, allowing them to be used for a variety of applications, from customer service chatbots to code generation.
  • Continuous learning: some LLMs are designed to continue learning and improving their language understanding through interactions with users, increasing their capabilities over time.

Use cases of LLMs in enterprise AI

LLMs can play a significant role in large organizations, such as corporations, government agencies and non-profit institutions. They can be used to automate tasks, enhance decision-making, and drive business value across various functions and departments – something that’s commonly referred to as ‘enterprise AI’. 

As the size and complexity of LLMs have grown, they have become increasingly powerful and versatile, with the ability to perform a wide range of language-related tasks with human-like fluency. This has led to their rapid adoption across various industries, from customer service to scientific research. Take a look at these examples of applications of enterprise AI: 

Conversational AI

LLMs are the backbone of advanced conversational AI systems that can engage in more natural, contextual, and personalized dialogues. These AI-powered chatbots and virtual assistants can understand complex queries, provide relevant information, and even automate routine tasks like scheduling, order processing, and technical support. By integrating LLMs, enterprises can enhance the customer experience, improve employee productivity, and streamline a wide range of conversational interactions across various communication channels, from customer service to internal knowledge sharing.

Content generation

The language generation capabilities of LLMs enable enterprises to scale their content production significantly. LLMs can be used to create high-quality articles, reports, marketing materials, and even code, ensuring consistency and freeing up subject-matter experts to focus on more strategic work. Additionally, LLMs can summarize large volumes of text-based information, such as customer feedback, research papers, and business documents. By extracting key insights, trends, and actionable recommendations from these sources, enterprises can make more informed decisions, uncover hidden opportunities, and address emerging challenges.

Code generation

While LLMs are primarily known for their language-related capabilities, they can also be applied to software development tasks. LLMs can assist developers by generating boilerplate code, implementing common algorithms and data structures, and even providing suggestions for code refactoring and optimization. This helps accelerate the software development lifecycle and allows developers to focus on more complex, high-value tasks.

Knowledge retrieval

Integrating LLMs with enterprise knowledge management systems allows employees to access information more efficiently through natural language queries. Instead of navigating complex databases or searching through vast repositories, users can simply ask questions in their own words and receive concise, relevant answers. This knowledge retrieval and question-answering functionality powered by LLMs can be particularly valuable in large, complex organizations where employees need to quickly access and synthesize information from various sources to support their decision-making.

Multimodal AI

By combining LLMs with other AI technologies, such as computer vision and speech recognition, enterprises can create multimodal AI systems that can understand and generate content across multiple modalities, including text, images, and audio. These advanced systems enable new applications like visual question answering, where users can ask questions about images, and multimodal content generation, where text, visuals, and other media are produced in a cohesive manner. Multimodal AI can enhance user experiences, improve accessibility, and unlock new possibilities for collaboration and knowledge sharing within the enterprise.

Process automation

LLMs can be leveraged to automate various language-based business processes, such as document processing, contract review, and customer correspondence. By automating these repetitive, high-volume tasks, enterprises can improve efficiency, reduce errors, and free up employees to focus on more strategic and value-adding activities. This integration of LLMs into process automation can lead to significant productivity gains, cost savings, and improved compliance for enterprises.

Personalization

Enterprises can utilize LLMs to personalize content, products, and services based on individual preferences and behaviors, enhancing customer experience and loyalty. LLMs can also power recommendation systems that suggest relevant information, products, or actions to employees and customers, driving engagement, sales, and overall business performance. These personalization and recommendation capabilities enabled by LLMs can be particularly valuable in customer-facing applications, as well as in internal knowledge-sharing and decision-support systems.

R&D

LLMs can assist enterprises in conducting research, generating hypotheses, and exploring new ideas, accelerating innovation and problem-solving. Researchers and engineers can use LLMs to generate and evaluate new concepts, prototypes, and solutions, reducing development time and costs. This can unlock new opportunities for growth, differentiation, and competitive advantage, as enterprises can rapidly ideate, experiment, and validate innovative products, services, and business models.

What is Snowflake Arctic?

Arctic is Snowflake’s own LLM, designed specifically to handle enterprise workloads with higher reliability, lower latency and a lower cost of ownership than other LLMs. It leverages a unique Mixture-of-Experts (MoE) architecture to deliver top-tier intelligence and performance at scale. It promises strong results on industry benchmarks in areas like SQL code generation and instruction following.

Diagram comparing the architecture of dense transformer with MoE transformer model used in Snowflake Arctic.

Based on the Massive Text Embedding Benchmark (MTEB) Retrieval Leaderboard, the largest Arctic text embedding model, with only 334 million parameters, is the only one to surpass an average retrieval performance of 55.9 - a feat usually only achievable with models about 4x larger.

Chart comparing the MTEB retrieval scores of Snowflake Arctic and other enterprise AI models.

Snowflake Arctic is part of the broader Snowflake Arctic model family, which also includes high-performing text embedding models for retrieval use cases. The family of five models, ranging from x-small to large, is available on Hugging Face, in Snowflake's Cortex embed function (in private preview), and via NVIDIA API Catalog, Microsoft Azure, Replicate and Together AI. It will soon also be available on AWS, Perplexity and Lamini. These embedding models leverage the technical expertise, search knowledge, and R&D that Snowflake acquired from Neeva last year.

Snowflake released Arctic under an Apache 2.0 license, allowing free use in research, prototypes, and products.

Table listing the different Snowflake Arctic models and their properties.

What advantages can enterprises unlock with Snowflake Arctic?

Snowflake’s LLM is optimized for complex enterprise workloads and scores highly on industry benchmarks in areas like SQL code generation and instruction following, but that’s not where its benefits end.

True openness

Snowflake announced the release of Arctic under an Apache 2.0 license with the promise to be the most open enterprise-grade LLM on the market. Arctic was built on the collective experiences and open insights of the AI community. The company is committed to maintaining Arctic as a truly open ecosystem, going beyond just open-sourcing the model weights and code: it is also providing details on the research and training process behind Arctic, setting a new standard for openness in enterprise AI technology.

The company will be releasing comprehensive research insights in the form of a detailed Snowflake Arctic cookbook that shares their learnings from the development process. This covers topics like data sourcing, model architecture, and evaluation - aiming to help others build high-quality MoE models more efficiently.

Excellent value for money

Many enterprises need to build conversational SQL data copilots, code copilots, and retrieval-augmented generation (RAG) chatbots. These translate to requirements for LLMs that excel at SQL, code, complex instruction following, and producing grounded answers. However, according to an IBM survey, 21% of enterprises identify pricing as a barrier that prevents them from adopting enterprise AI.

Snowflake's Arctic may be an excellent answer to this issue. The LLM was trained on a compute budget of under $2 million (less than 3,000 GPU weeks). According to Snowflake, this makes Arctic more capable than other open-source models trained with a similar compute budget.
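To put the budget figures above into perspective, the short calculation below converts GPU weeks into GPU hours and derives an implied hourly rate. Only the two inputs come from the article, and both are upper bounds, so the derived rate is an indicative estimate rather than a published price.

```python
# Back-of-the-envelope check of the training budget quoted above.
# Inputs are the article's upper bounds; the derived rate is only indicative.
BUDGET_USD = 2_000_000   # "under $2 million"
GPU_WEEKS = 3_000        # "less than 3,000 GPU weeks"

HOURS_PER_WEEK = 7 * 24

gpu_hours = GPU_WEEKS * HOURS_PER_WEEK
implied_usd_per_gpu_hour = BUDGET_USD / gpu_hours

print(f"GPU hours: {gpu_hours:,}")
print(f"Implied cost per GPU hour: ${implied_usd_per_gpu_hour:.2f}")
```

Under these assumptions the budget corresponds to roughly half a million GPU hours at about $4 per GPU hour.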

Arctic matches or exceeds the performance of Llama 3 8B and Llama 2 70B on enterprise intelligence metrics, while using less than half the training compute budget. Similarly, despite using 17x less compute than Llama 3 70B, Arctic remains competitive on overall performance, including language understanding, reasoning, and math tasks.

The high training efficiency of Arctic means that Snowflake customers and the AI community can train custom models in a more affordable way.

Table with performance parameters of Snowflake Arctic and other enterprise AI models.

The most reliable enterprise search

Text embedding models are a crucial component of modern AI systems, enabling the retrieval of the most relevant content for a wide range of applications, from search to powering AI agents. Recognizing the importance of text embeddings, Snowflake has committed to delivering the best possible experience for customers leveraging Snowflake for their search needs.

Leveraging Snowflake's deep expertise in search and the latest research in this area, the company has set out to create the best open-source text embedding models from the ground up.

Snowflake's text embedding models are positioned as leaders in quality and total cost of ownership, making them a compelling choice for enterprises of any size seeking to power their embedding workflows. This set of text embedding models can serve as a foundation for a wide range of AI-powered applications.
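As a rough illustration of how text embeddings drive retrieval, the sketch below ranks documents by cosine similarity to a query vector. The three-dimensional vectors and document names are hypothetical stand-ins: a real system would obtain much longer vectors from an embedding model, and nothing here reflects Snowflake's actual implementation.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs, most similar document first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy vectors standing in for real embeddings (hypothetical values).
doc_vecs = {
    "refund_policy": [0.9, 0.1, 0.0],
    "shipping_times": [0.1, 0.9, 0.1],
    "api_reference": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]  # pretend embedding of "how do I get my money back?"

ranking = rank_documents(query_vec, doc_vecs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

The document whose vector points in nearly the same direction as the query ranks first, which is the basic mechanism behind embedding-based search and RAG pipelines.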

Unprecedented training efficiency

Snowflake's Arctic models use a unique Dense-MoE Hybrid transformer architecture, combining a 10B dense transformer with a 128 x 3.66B MoE MLP. This results in 480B total parameters, with 17B active parameters selected using top-2 gating.
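The parameter counts above can be checked with simple arithmetic. The sketch below assumes that every token passes through the dense part of the network plus exactly two experts (top-2 gating). The variable names are illustrative only.

```python
# Parameter arithmetic for the Dense-MoE Hybrid figures quoted above
# (all values in billions of parameters).
DENSE_PARAMS_B = 10.0    # dense transformer
NUM_EXPERTS = 128        # MoE experts
EXPERT_PARAMS_B = 3.66   # parameters per expert
TOP_K = 2                # experts selected per token (top-2 gating)

total_b = DENSE_PARAMS_B + NUM_EXPERTS * EXPERT_PARAMS_B
active_b = DENSE_PARAMS_B + TOP_K * EXPERT_PARAMS_B

print(f"Total parameters:  {total_b:.2f}B")
print(f"Active parameters: {active_b:.2f}B")
print(f"Active share:      {active_b / total_b:.1%}")
```

The results, about 478B total and 17.3B active, match the rounded 480B and 17B figures, and show that under 4% of the model's parameters are used for any single token.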

Several innovations underpin this architecture and training approach. Insights from DeepSpeed showed that MoE can significantly improve model quality without increasing compute. Arctic therefore uses a large number of fine-grained experts (128) and a large total parameter count (480B) to enhance model capacity, while judiciously limiting active parameters to 17B for efficient training and inference. This contrasts with recent MoE models that use fewer experts.

Diagram showing compute usage for training in Snowflake Arctic and other enterprise AI models.
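To make the idea of selecting a few experts per token more concrete, here is a minimal sketch of top-2 gating in plain Python. It is a generic illustration rather than Arctic's actual router: the toy experts, the router scores, and the choice to renormalize the two selected weights are all assumptions made for this example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top2_gate(router_logits):
    """Return [(expert_index, weight), ...] for the two highest-scoring experts."""
    probs = softmax(router_logits)
    top2 = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:2]
    norm = sum(probs[i] for i in top2)  # renormalize so the two weights sum to 1
    return [(i, probs[i] / norm) for i in top2]

def moe_layer(token, router_logits, experts):
    """Weighted sum of the outputs of the two selected experts only."""
    output = [0.0] * len(token)
    for idx, weight in top2_gate(router_logits):
        expert_out = experts[idx](token)
        output = [o + weight * e for o, e in zip(output, expert_out)]
    return output

# Four toy "experts" (hypothetical): each one just scales the token vector.
experts = [lambda t, s=s: [s * x for x in t] for s in (1.0, 2.0, 3.0, 4.0)]
router_logits = [0.1, 2.0, 0.3, 2.0]  # in a real model, produced by a learned router
token = [1.0, -1.0]

gates = top2_gate(router_logits)
print(gates)
print(moe_layer(token, router_logits, experts))
```

Only two of the four experts run for this token, which is why a model can hold a very large number of parameters while keeping the compute per token small.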

Training large MoE models has high communication overhead, but the Dense-MoE Hybrid architecture enables overlapping communication and computation to hide this overhead.

Excelling at enterprise tasks like code generation and SQL requires a different training curriculum than for generic metrics. Arctic was trained in a three-stage curriculum, starting with generic skills like common sense reasoning, then progressing to more complex enterprise-focused skills in later stages. This curriculum approach, inspired by human learning, allows the model to effectively learn a diverse set of capabilities.

The combination of architectural innovation, system-level optimizations, and a carefully crafted training curriculum enables the high training efficiency and performance of the Snowflake Arctic models.

Unprecedented inference efficiency

Snowflake's Arctic is efficient not just in training, but also in inference - which is critical for practical deployment at low cost. Arctic represents a major leap in the scale of MoE models, using more experts and total parameters than any other open-sourced autoregressive MoE model.

This scale requires several system-level innovations to enable efficient inference:

  • For interactive inference at small batch sizes (e.g. batch size of 1), Arctic's inference is memory bandwidth-bound. Compared to other large MoE models, Arctic requires up to 4x fewer memory reads, leading to lower latency. Snowflake has collaborated with NVIDIA to provide a preliminary implementation of Arctic for interactive inference, leveraging FP8 quantization to fit it within a single GPU node. At batch size 1, this early implementation achieves over 70 tokens/second.
  • As the batch size increases significantly (e.g. thousands of tokens per pass), Arctic transitions to being compute-bound, where it incurs 4x less compute than 70B models. To enable this high-throughput, compute-bound inference, Snowflake is working on system optimizations like FP8 weights, split-fuse, continuous batching, tensor parallelism, and pipeline parallelism across multiple nodes. Spreading the model across nodes in this way makes it possible to store the nearly 500B parameters required.
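The memory figures behind these two regimes can be approximated with a quick calculation. The sketch below uses the parameter counts and token rate quoted in this article, and assumes a node with 8 GPUs of 80 GB each. That node size is an assumption for illustration, not something the article states. The estimate also ignores activations, the KV cache and other runtime overhead.

```python
# Rough memory arithmetic for serving Arctic (illustrative estimate only).
TOTAL_PARAMS = 480e9      # total parameters (from the article)
ACTIVE_PARAMS = 17e9      # active parameters per token (from the article)
TOKENS_PER_SECOND = 70    # reported batch-size-1 throughput

BYTES_FP8 = 1             # FP8 stores one byte per parameter
BYTES_FP16 = 2            # FP16 stores two bytes per parameter

# Assumption (not from the article): one node with 8 GPUs of 80 GB each.
GPUS_PER_NODE = 8
GPU_MEMORY_GB = 80
node_memory_gb = GPUS_PER_NODE * GPU_MEMORY_GB

weights_fp8_gb = TOTAL_PARAMS * BYTES_FP8 / 1e9
weights_fp16_gb = TOTAL_PARAMS * BYTES_FP16 / 1e9

# At batch size 1, each generated token needs the active weights read once.
read_per_token_gb = ACTIVE_PARAMS * BYTES_FP8 / 1e9
implied_read_rate_gb_s = read_per_token_gb * TOKENS_PER_SECOND

print(f"FP8 weights:  {weights_fp8_gb:.0f} GB (fits in node: {weights_fp8_gb < node_memory_gb})")
print(f"FP16 weights: {weights_fp16_gb:.0f} GB (fits in node: {weights_fp16_gb < node_memory_gb})")
print(f"Weight reads at {TOKENS_PER_SECOND} tokens/s: {implied_read_rate_gb_s:.0f} GB/s")
```

Under these assumptions, FP8 weights take about 480 GB and fit into the 640 GB node, whereas FP16 weights would need about 960 GB. This is consistent with the article's point that FP8 quantization is what allows single-node interactive inference.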

Take the next step towards enterprise AI adoption

The launch of Snowflake Arctic marks a significant milestone in the evolution of large language models. By combining state-of-the-art performance, unparalleled efficiency, and unprecedented openness, Snowflake has set a new standard for enterprise-grade AI technology.

The innovations behind Arctic's unique architecture and inference capabilities demonstrate Snowflake's commitment to pushing the boundaries of what's possible with LLMs. But the true impact of Arctic will be seen in how it empowers organizations to tackle their most complex challenges - from automating mission-critical workflows to unlocking transformative insights from their data.

As Snowflake continues to collaborate with the broader AI community to further optimize and expand the Arctic model family, the future looks bright for organizations seeking to harness the power of LLMs. Arctic represents a quantum leap forward and seems to be ushering in a new era of efficient intelligence that will redefine what's achievable with enterprise AI.

The possibilities are limitless for those enterprises bold enough to embrace this transformative technology. If you’re ready to embark on this journey, write to us today through this contact form and we’ll schedule a free strategy call to discuss the details.
