Compound AI: Thinking in Systems, Not Models

Why composability and flexibility are the way forward in AI system design

By The deepset Team

Published on July 8, 2024 · 12 min read

TLDR

AI today isn't about models. Or rather, it's not just about models. Modern AI systems are often complex, modular systems made up of multiple interconnected components. Some components contain or connect to AI models, while others perform non-AI functions. This is known as Compound AI.

Platforms like deepset Cloud are uniquely equipped to meet the needs of AI teams looking to design, build, test, and deploy products built with Compound AI.

What is Compound AI?

Compound AI is a design principle championed by the Berkeley AI Research (BAIR) lab, as described in their blog post about “the shift from models to Compound AI systems.”

“We define a Compound AI System as a system that tackles AI tasks using multiple interacting components, including multiple calls to models, retrievers, or external tools.” – BAIR

This approach is not new. Products and tools that use AI have long been designed as modular systems, graphs, or pipelines. That’s why we’ve built our own generative AI orchestration tools as modular toolboxes that provide the building blocks to create and maintain modern AI tools. These building blocks or components are then combined into increasingly complex pipelines or graphs.

A classic retrieval augmented generation (RAG) system combines a retriever component, a prompt component, and an LLM API connector in a single pipeline or graph.
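
To make the idea concrete, here is a minimal sketch of such a pipeline in plain Python. It is not the API of any particular framework: the class names (`KeywordRetriever`, `PromptBuilder`, `Generator`, `RagPipeline`), the word-overlap scoring, and the `fake_llm` stand-in are all assumptions made for illustration. In a real system, the generator would wrap an actual LLM API call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class KeywordRetriever:
    """Toy retriever: ranks documents by the number of words shared with the query."""

    documents: list[str]

    def run(self, query: str, top_k: int = 2) -> list[str]:
        query_words = set(query.lower().split())
        scored = [
            (len(query_words & set(doc.lower().split())), doc)
            for doc in self.documents
        ]
        # sorted() is stable, so documents with equal scores keep their original order.
        ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
        return [doc for score, doc in ranked[:top_k] if score > 0]


class PromptBuilder:
    """Turns the query and the retrieved documents into a single prompt string."""

    def run(self, query: str, documents: list[str]) -> str:
        context = "\n".join(f"- {doc}" for doc in documents)
        return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


@dataclass
class Generator:
    """Wraps any callable that maps a prompt to text (a stand-in for an LLM API connector)."""

    llm: Callable[[str], str]

    def run(self, prompt: str) -> str:
        return self.llm(prompt)


@dataclass
class RagPipeline:
    """Chains the three components: retrieve, build the prompt, generate."""

    retriever: KeywordRetriever
    prompt_builder: PromptBuilder
    generator: Generator

    def run(self, query: str) -> str:
        documents = self.retriever.run(query)
        prompt = self.prompt_builder.run(query, documents)
        return self.generator.run(prompt)


def fake_llm(prompt: str) -> str:
    # Placeholder for a real model call: it only reports how long the prompt is.
    return f"(model reply to a prompt of {len(prompt.splitlines())} lines)"


pipeline = RagPipeline(
    retriever=KeywordRetriever(
        documents=[
            "Compound AI systems use multiple interacting components",
            "BM25 is a keyword-based retriever",
            "RAG combines a retriever with a generator",
        ]
    ),
    prompt_builder=PromptBuilder(),
    generator=Generator(llm=fake_llm),
)
print(pipeline.run("what is a keyword-based retriever"))
```

Each component exposes a single `run` method and knows nothing about its neighbors, which is what makes the system composable: the pipeline object only wires outputs to inputs.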

Components: Building blocks of Compound AI

Components typically have one well-defined task. This means that they can be evaluated and improved in isolation from the rest of the system. It also makes it easy to replace components when, for example, a newer, faster, or cheaper version becomes available. As a result, Compound AI systems are more flexible than static configurations.
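
As a rough illustration of why this matters, the sketch below defines a shared interface and two interchangeable retrievers. The names (`Retriever`, `ExactMatchRetriever`, `WordOverlapRetriever`, `build_context`) are hypothetical and chosen for this example only. The point is that the calling code does not change when one component is swapped for another.

```python
from typing import Protocol


class Retriever(Protocol):
    """Any object with a matching run() method can be used as a retriever."""

    def run(self, query: str) -> list[str]: ...


class ExactMatchRetriever:
    """Returns documents that contain the whole query as a substring."""

    def __init__(self, documents: list[str]) -> None:
        self.documents = documents

    def run(self, query: str) -> list[str]:
        return [doc for doc in self.documents if query.lower() in doc.lower()]


class WordOverlapRetriever:
    """Returns documents that share at least one word with the query."""

    def __init__(self, documents: list[str]) -> None:
        self.documents = documents

    def run(self, query: str) -> list[str]:
        words = set(query.lower().split())
        return [doc for doc in self.documents if words & set(doc.lower().split())]


def build_context(retriever: Retriever, query: str) -> str:
    # Depends only on the interface, not on a concrete retriever class.
    return " | ".join(retriever.run(query))


documents = [
    "Hybrid retrieval uses two retrievers",
    "Agents invoke tools as needed",
]

# Swapping the component is a one-line change at the call site.
strict_context = build_context(ExactMatchRetriever(documents), "retrieval tools")
loose_context = build_context(WordOverlapRetriever(documents), "retrieval tools")
print(repr(strict_context))
print(repr(loose_context))
```

Because both retrievers satisfy the same interface, each can also be tested on its own, without running the rest of the system.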

Composability: the de facto standard in AI today

The basic retrieval augmented generation (RAG) setup, described above, is a common application of Compound AI today. RAG combines a retrieval component with a generative component to create a system that is both creative and grounded in a controlled database. But this basic RAG setup (which can be extended indefinitely) is just one of many Compound system designs. A few examples:

Advanced RAG setups still use components for retrieval and generation, but they also contain other components, for instance to rerank the retriever’s results according to some product-specific criteria, before passing them on to the LLM.
Agents use multiple highly specialized components that are invoked by the controlling system as needed. For example, an agentic system might include a web browser and access to a proprietary database. Depending on the context required to perform a task, the agent will invoke the browser, the database, or both.
Hybrid retrieval is another example of a popular modular setup that has been in use for years. It uses two components to retrieve information, most commonly one that is keyword-based and one that understands semantics. In this setup, the two retrievers can complement each other when one falls short (for example, when the semantic model doesn't know the language of a query).
Multimodality allows us to process different types of data – such as text, images, video files, and spreadsheets – in a single system, increasing the insights that can be gained. Composability allows us to easily combine models optimized for different data types in a single system.
Finally, the concept of modularity can also be applied to indexing pipelines: these are the systems that preprocess incoming files and prepare them for semantic search with a language model. Thanks to Compound AI, these pipelines can be extended at will. For example, you could add a component that parses documents for names and locations and adds this information to the database as metadata during indexing (this is commonly known as “named entity recognition” or NER).
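
To illustrate the hybrid retrieval example from the list above, here is a minimal sketch of how the results of two retrievers could be merged. The merging method is our choice for this illustration: the sketch uses reciprocal rank fusion, a common technique that scores each document by summing 1 / (k + rank) over all result lists. The function name, the document IDs, and the default `k = 60` are assumptions, not a description of any specific product.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one ranked list."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Documents near the top of any list contribute more to the score.
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical outputs of a keyword-based and a semantic retriever.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
semantic_hits = ["doc_b", "doc_d"]

fused = reciprocal_rank_fusion([keyword_hits, semantic_hits])
print(fused)

# If the semantic retriever returns nothing (for example, because the model
# does not know the query language), the keyword results still come through.
fallback = reciprocal_rank_fusion([keyword_hits, []])
print(fallback)
```

A document found by both retrievers (`doc_b` here) rises to the top, while documents found by only one of them are still kept. This is the sense in which the two components complement each other.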

Advantages of using Compound AI

LLMs are language models that have ingested vast amounts of data, enabling them to perform a wide range of tasks. New, more powerful models – both commercial and open source – are emerging all the time. To differentiate themselves and truly harness the generative and computational power of these models, companies must learn how to use them in their products and tools. This means not just demonstrating the power of generative AI technology, but building it seamlessly into those products. This is only possible with Compound AI.

Composability allows you to plug a given component into the system you and your AI team have built, often based on a proprietary database. It lets you add models and other components as you go, depending on what your task requires. And it lets you remove the parts you no longer need, or swap them out for a better version. Finally, it lets you combine complex, billion-parameter language models with other, simpler components, for example a keyword-based retriever like BM25.

During evaluation, you can look at each of your components in isolation, breaking down the gargantuan task of evaluating a complex AI system into smaller, more manageable subtasks. The composability paradigm also allows you to reuse models in different parts of your organization. This will become increasingly important as more companies implement comprehensive strategies for building with AI.
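
As one example of evaluating a component in isolation, the sketch below computes recall@k for a retriever: the share of relevant documents that appear among its top k results. The metric is standard, but the function name, document IDs, and labels are made up for illustration. No LLM is involved, so this check can run independently of the generation step.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant documents found in the top-k retrieved results."""
    if not relevant:
        raise ValueError("recall is undefined without at least one relevant document")
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


# Hypothetical retriever output for one query, plus hand-labeled relevant documents.
retrieved_ids = ["d1", "d4", "d2", "d7"]
relevant_ids = {"d1", "d2", "d3"}

print(recall_at_k(retrieved_ids, relevant_ids, k=1))
print(recall_at_k(retrieved_ids, relevant_ids, k=3))
```

Averaged over a labeled set of queries, a number like this tells you whether the retriever is the weak link before you spend time on prompts or models.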

A new standard in AI

In a world obsessed with models, Compound AI sets a new standard. When you build a system with Compound AI, you need to start with the design, not just the models. This encourages strategic thinking at all stages of your project. Let's go back to our evaluation example. It is true that modularity allows you to evaluate each component in isolation. But you also need to think about how the system works as a whole.

Similarly, the fact that you can mix and match AI models in virtually unlimited combinations means that it becomes much harder to keep track of all the changes and iterations you make to the system. As with traditional software engineering, version control becomes an indispensable part of the process.
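
One lightweight way to support this, sketched below under our own assumptions, is to describe each pipeline as plain configuration and derive a short fingerprint from it. The configuration keys and model names are hypothetical. The fingerprint does not replace a version control system, but it can be stored next to evaluation results so that every number can be traced back to the exact setup that produced it.

```python
import hashlib
import json


def config_fingerprint(config: dict) -> str:
    """Return a short, deterministic hash of a pipeline configuration."""
    # Sorting keys makes the result independent of dictionary ordering.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


# Hypothetical pipeline descriptions; the keys and values are illustrative only.
pipeline_v1 = {"retriever": "bm25", "top_k": 5, "generator": "model-a"}
pipeline_v1_reordered = {"generator": "model-a", "top_k": 5, "retriever": "bm25"}
pipeline_v2 = {"retriever": "bm25", "top_k": 5, "generator": "model-b"}

print(config_fingerprint(pipeline_v1))
print(config_fingerprint(pipeline_v2))
```

Swapping a single model changes the fingerprint, while reordering the same settings does not.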

Finally, the advent of Compound AI may require some realignment among some – or perhaps all – members of your organization. The AI team is first and foremost a product team: it seeks to create products or tools that make life easier for its users. In this mindset, the AI model itself is just a technology – one that has its strengths and limitations and will most likely be replaced at some point. This mindset shift is often the biggest hurdle to designing AI systems efficiently – but it is one that, if taken seriously, can be easily overcome.

Compound AI in deepset

The deepset AI Platform (formerly known as deepset Cloud) helps AI teams build with LLMs and Gen AI. The platform was created with modularity as a guiding principle. After building countless Compound AI systems for our early customers, we realized that in an era of growing demand for AI-powered tools and increasing commoditization of the models themselves, what modern AI teams need is a platform that makes it easy to try different models in different Compound AI setups. Such a platform should also automatically handle versioning and logging in the background.

In deepset, you can choose from thousands of different language models to iteratively build the system that best serves your custom tool or product. Once your system is in production, deepset lets you continuously evaluate and monitor it, so that you can improve or replace models that no longer perform well. Additionally, the various workflows we offer in the platform are designed to follow best practices in AI product development, allowing your AI team to collaborate and learn together as they progress through their AI development journey.
