Implementing authorization in RAG-based AI systems with Cerbos

Published by Heidi Hokanson on November 12, 2024

Today Cerbos introduces an access control use case for Retrieval Augmented Generation (RAG) and Large Language Models (LLMs), providing a timely solution for software builders looking for secure and practical ways to install guardrails around their AI applications. The functionality is available natively as part of Cerbos PDP and Cerbos Hub, the complete authorization solution for modern architectures.


As companies implement AI applications, a RAG architecture is often used to give an LLM context from internal data. The challenge that consequently arises is how to provide the LLM with sufficient context without violating privacy and authorization policies. Companies need to ensure that AI agents can’t inappropriately access sensitive data or expose it to unauthorized users.

“The problem is that most architectures centralize all data in one place, making it difficult to segregate the specific data an AI model can access,” says Alex Olivier, Chief Product Officer at Cerbos. “The easy solution is to load your corporate data into a central vector store and use it alongside an LLM, but this essentially gives anyone interacting with the agent root access to the entire dataset. And that puts you at risk of privacy violations, compliance issues, and losing your customers’ trust.”

Cerbos users have long taken advantage of the Cerbos query plan, which facilitates row-level data filtering based on authorization policies. In a RAG architecture, the same functionality is applied as a filter on the vector store query, enforcing the authorization logic at retrieval time, before the retrieved data is passed to the LLM.
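
To make this concrete, here is a minimal sketch of requesting a query plan from the PDP. It assumes the Cerbos Python SDK, a PDP running locally, and a hypothetical `document` resource policy with `department` and `region` attributes; exact import paths and field names may differ between SDK versions.

```python
# Minimal sketch: ask the Cerbos PDP which documents this user may "read".
# Assumes the Cerbos Python SDK (pip install cerbos) and a PDP on localhost;
# the "document" resource kind and its attributes are hypothetical.
from cerbos.sdk.client import CerbosClient
from cerbos.sdk.model import Principal, ResourceDesc, PlanResourcesFilterKind

principal = Principal(
    "user-123",
    roles={"analyst"},
    attr={"department": "finance", "region": "emea"},
)

with CerbosClient("http://localhost:3592") as cerbos:
    # Returns a query plan rather than a yes/no decision: ALWAYS_ALLOWED,
    # ALWAYS_DENIED, or a CONDITIONAL plan carrying an expression to apply.
    plan = cerbos.plan_resources("read", principal, ResourceDesc("document"))

if plan.filter.kind == PlanResourcesFilterKind.ALWAYS_DENIED:
    metadata_filter = None
    documents = []          # nothing to retrieve for this user
elif plan.filter.kind == PlanResourcesFilterKind.ALWAYS_ALLOWED:
    metadata_filter = None  # no restriction needed
else:
    # CONDITIONAL: translate the plan's condition over request.resource.attr.*
    # into the vector store's metadata filter syntax (illustrative output).
    metadata_filter = {"department": {"$eq": "finance"}}
```

In practice, the CONDITIONAL branch would walk the expression returned by the PDP and emit whatever filter syntax your vector store expects, much as the existing Cerbos query plan adapters do for SQL databases.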

How the Cerbos query plan works with RAG

A RAG architecture can leverage large sets of internal knowledge (documents, meeting notes, and other resources) to provide business-specific context to the LLM. This business data first has to be extracted from the system of record (ERP, CRM, HRIS, etc.), put through an embedding process, and then loaded into a vector store, a specialized database that can find related documents based on their contents. Vector stores also support metadata, such as which source system a document came from, which department it belongs to, and which region it is associated with. All of this can be leveraged for authorization.
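
As an illustration, the sketch below loads a couple of hypothetical documents and their metadata into Chroma, used here purely as an example of a vector store that supports metadata filtering; the collection name, fields, and values are made up.

```python
# Illustrative only: embed business documents into a vector store together
# with the metadata that the authorization filter will later match on.
# Uses Chroma (pip install chromadb) as an example; any vector store that
# supports metadata filtering works the same way.
import chromadb

client = chromadb.Client()
collection = client.get_or_create_collection("company_docs")

collection.add(
    ids=["doc-001", "doc-002"],
    documents=[
        "Q3 financial summary for the EMEA region ...",
        "Engineering onboarding guide ...",
    ],
    metadatas=[
        {"source": "erp", "department": "finance", "region": "emea"},
        {"source": "wiki", "department": "engineering", "region": "global"},
    ],
)
```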

With the business data stored along with its associated metadata, a typical workflow is to put a chatbot-style interface in front which, using Cerbos, applies authorization logic to the retrieval (a code sketch of the full flow follows the list):

  1. A user interacts with the LLM by asking a question, which is vectorized via an embedding model.
  2. The filters applicable to the user are generated by the Cerbos Policy Decision Point (PDP).
  3. The vector store is then queried for relevant documents based on the input, with metadata filters derived from the authorization policies restricting which documents the LLM can retrieve.
  4. The retrieved documents are injected into the prompt.
  5. The LLM processes the prompt to generate the answer.
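
Putting the steps together, a retrieval call might look like the following sketch, which assumes the Chroma collection from the earlier example and a metadata filter already derived from the Cerbos query plan; the question, filter values, and LLM call are placeholders.

```python
# Sketch of steps 1-5: the question is embedded by the vector store's query
# call, the Cerbos-derived metadata filter restricts retrieval, and the
# surviving documents are injected into the prompt. Names are illustrative.
import chromadb

collection = chromadb.Client().get_or_create_collection("company_docs")

question = "What were our EMEA numbers last quarter?"

# Produced from the Cerbos query plan (see the earlier sketch); passing None
# would mean the user's policies allow unrestricted retrieval.
metadata_filter = {"department": {"$eq": "finance"}}

results = collection.query(
    query_texts=[question],   # embedded by the collection's embedding function
    n_results=5,
    where=metadata_filter,    # only documents this user is allowed to see
)

context = "\n\n".join(results["documents"][0])
prompt = (
    "Answer the question using only the context below.\n\n"
    f"Context:\n{context}\n\nQuestion: {question}"
)
# answer = llm.complete(prompt)  # hypothetical LLM call
```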

[Diagram: Cerbos AI agent]

As a result of this process, all responses generated by the LLM are tailored to the user’s access privileges. Authorization policies can be stored and managed centrally and then applied on the data layer, API layer, application layer, and even in your AI agents.
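
Because the policies live in the PDP rather than in application code, the same hypothetical `document` policy that drives retrieval filtering can also answer a plain yes/no check at the API layer; a minimal sketch, again assuming the Cerbos Python SDK:

```python
# Sketch: the same "document" policy answers a boolean check at the API layer.
# Resource attributes here are hypothetical and would come from your database.
from cerbos.sdk.client import CerbosClient
from cerbos.sdk.model import Principal, Resource

principal = Principal("user-123", roles={"analyst"}, attr={"department": "finance"})
resource = Resource("doc-001", "document", attr={"department": "finance", "region": "emea"})

with CerbosClient("http://localhost:3592") as cerbos:
    if cerbos.is_allowed("read", principal, resource):
        ...  # serve the document from the API endpoint
```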

This approach to access control for RAG also reconciles the conflict between an AI model's need for vast amounts of data and the Zero Trust principle of least privilege access. Applying an RBAC or ABAC model to an AI agent itself would constrain it to specific functions and limit its usefulness. So instead you can make sure that the AI agent honors the RBAC and ABAC policies applied to your users and infrastructure, and acts on behalf of your users according to the privileges allotted to each respective user.

Users of both the free open-source Cerbos PDP and Cerbos Hub can take advantage of this feature for their AI projects. Check out the documentation here.

Learn more and book a demo call here.

About Cerbos

Cerbos is an externalized authorization solution for enterprise software builders with fine-grained security requirements. Ensure least privilege access at every endpoint with policy decision points distributed throughout your architecture for local, runtime authorization. Manage, test, and audit your authorization layer from one Policy Administration Point to streamline policy creation, potential breach investigation, and compliance. Learn more at cerbos.dev.

Book a free Policy Workshop to discuss your requirements and get your first policy written by the Cerbos team.