Stateful Authorization Is an Anti-Pattern

This article is available first on CNCF - read it here.

When designing the core components of a software architecture, one of the early decisions is where to store the state of the application. The state is the persisted data about the resources inside a system, and it is typically kept in a database or some type of file system.

In the context of authorization, meaning defining and enforcing who can do what inside a system, the state involved is about the principal (usually a user, but it could be a machine token, service account, or any other identity) and the resource being accessed. Each of these has a unique ID and a set of attributes. For example, principal 123 is in the finance department and has the manager role, and the resource is an expense claim with ID XYZ for $1000 filed by user 678.
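As a rough illustration, the principal and resource from this example could be modeled as below. This is only a sketch: the class names and the attribute keys (`department`, `roles`, `amount`, `filed_by`) and the `expense_claim` kind are assumptions made for illustration, not a format prescribed by any particular authorization system.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class Principal:
    """The identity making the request (user, service account, token, ...)."""
    id: str
    attributes: Dict[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class Resource:
    """The thing being accessed, identified by its kind and ID."""
    id: str
    kind: str
    attributes: Dict[str, Any] = field(default_factory=dict)


# The example from the text: a finance manager and a $1000 expense claim.
principal = Principal(
    id="123",
    attributes={"department": "finance", "roles": ["manager"]},
)
resource = Resource(
    id="XYZ",
    kind="expense_claim",
    attributes={"amount": 1000, "filed_by": "678"},
)
```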

The component of the system that makes the authorization decision uses this information as input and, based on the business rules, produces a result: whether an action is allowed or not. The authorization layer therefore needs access to this state to do its job, which raises an interesting question about where to keep it.

Stateful authorization

One approach is to have the authorization layer store the state in its own storage layer, which it manages itself, so it knows precisely where the state is stored and can access it when needed. This simplifies authorization checks, as the application layer only has to provide the ID of the principal, the ID of the resource, and the action being checked to get a response. The complexity lies in getting the state into the storage layer efficiently and accurately and, most importantly, keeping it fresh.

A synchronization system needs to be implemented, not just to push new state into the authorization layer but also to keep it continually up to date. This has to be a real-time feed: when a user creates a new resource and then immediately tries to access it and perform some action, the state of this brand-new resource has to be available for the very next request in order to get the correct response. One cache miss, delayed update, or failed batch job will lead to a user being denied an action on a resource or, worse, wrongly allowed to perform it. With all this state in place, the authorization layer also has to be deployed so that both the decisioning and the data store scale in line with the application, which is not a simple task at larger scales.
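The freshness problem can be sketched with a toy stateful authorizer. Everything here is hypothetical: the `StatefulAuthorizer` class, its `sync` method, the "filer may view their own claim" rule, and the claim `ABC` are invented for illustration, and the pending list stands in for whatever real-time feed or batch job would carry updates.

```python
from typing import Any, Dict, List, Tuple


class StatefulAuthorizer:
    """Hypothetical authorization layer that keeps its own copy of the state."""

    def __init__(self) -> None:
        self._principals: Dict[str, Dict[str, Any]] = {}
        self._resources: Dict[str, Dict[str, Any]] = {}

    def sync(self, kind: str, key: str, attributes: Dict[str, Any]) -> None:
        """Receive a state update pushed by the application."""
        store = self._principals if kind == "principal" else self._resources
        store[key] = dict(attributes)

    def is_allowed(self, principal_id: str, action: str, resource_id: str) -> bool:
        """Check using IDs only; attributes come from the internal store."""
        principal = self._principals.get(principal_id)
        resource = self._resources.get(resource_id)
        if principal is None or resource is None:
            return False  # state not known here yet: deny by default
        # Illustrative rule: the filer may view their own claim.
        return action == "view" and resource.get("filed_by") == principal_id


authz = StatefulAuthorizer()
authz.sync("principal", "678", {"department": "finance"})

# User 678 creates a claim; the update is queued but not yet delivered.
pending: List[Tuple[str, str, Dict[str, Any]]] = [
    ("resource", "ABC", {"filed_by": "678", "amount": 250}),
]

# The very next request arrives before the sync has run.
denied_before_sync = authz.is_allowed("678", "view", "ABC")

for kind, key, attributes in pending:
    authz.sync(kind, key, attributes)

allowed_after_sync = authz.is_allowed("678", "view", "ABC")
```

The check itself is short, but its answer depends on whether the update has arrived: the same request is denied before the sync and allowed after it.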

An alternative is to have the authorization layer itself call out to wherever the state is stored when an authorization check is made. This requires a known data format between all systems, and the data stores must be able to scale up and down on demand as authorization checks are made. There is a real risk in this design that what may seem like a simple check results in an unbounded number of calls across the infrastructure just to query, gather, and pull in the required state, given that the authorization logic can be updated independently of the external services holding the state.
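A minimal sketch of how the number of calls can grow without the application noticing is shown below. The `CountingStore` class, the rule, and the department of user 678 are all assumptions made for illustration; in a real deployment each `get` would be a network call to another service.

```python
from typing import Any, Dict


class CountingStore:
    """Stand-in for an external service holding state; counts each lookup."""

    def __init__(self, records: Dict[str, Dict[str, Any]]) -> None:
        self._records = records
        self.calls = 0

    def get(self, key: str) -> Dict[str, Any]:
        self.calls += 1
        return self._records[key]


users = CountingStore({
    "123": {"department": "finance", "roles": ["manager"]},
    "678": {"department": "finance", "roles": ["employee"]},  # assumed
})
claims = CountingStore({
    "XYZ": {"amount": 1000, "filed_by": "678"},
})


def is_allowed(principal_id: str, action: str, resource_id: str) -> bool:
    """The authorization layer fetches whatever state the rule needs."""
    principal = users.get(principal_id)
    claim = claims.get(resource_id)
    # A later policy change adds another hop: the filer's department.
    filer = users.get(claim["filed_by"])
    return (
        action == "approve"
        and "manager" in principal["roles"]
        and principal["department"] == filer["department"]
    )


result = is_allowed("123", "approve", "XYZ")
total_calls = users.calls + claims.calls
```

The caller's request did not change, yet one check now costs three lookups, and every further rule that follows a reference adds more.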

Stateless authorization

Another approach is to keep the state at the source, in the application layer, and maintain that single source of truth. This makes checking permissions slightly more verbose, as all the attributes of the principal and resource need to be provided to the authorization layer at request time in order to make a decision. In return, the decision-making component is completely self-contained: it has no external dependencies, does not rely on synchronization working 100% of the time, and does not need an external system to be scaled and available to fetch state from.
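A stateless check can be sketched as a pure function of its inputs. This is an illustration only: the dictionary layout and the rule (a finance manager may approve an expense claim they did not file) are assumptions, not a real policy or the API of any specific product.

```python
from typing import Any, Dict


def is_allowed(principal: Dict[str, Any], action: str, resource: Dict[str, Any]) -> bool:
    """Decide from the supplied attributes alone: no lookups, no stored state."""
    if resource["kind"] != "expense_claim" or action != "approve":
        return False
    attrs = principal["attributes"]
    return (
        "manager" in attrs.get("roles", [])
        and attrs.get("department") == "finance"
        # Hypothetical rule: nobody approves their own claim.
        and resource["attributes"].get("filed_by") != principal["id"]
    )


# The caller supplies everything the decision needs at request time.
principal = {
    "id": "123",
    "attributes": {"department": "finance", "roles": ["manager"]},
}
resource = {
    "id": "XYZ",
    "kind": "expense_claim",
    "attributes": {"amount": 1000, "filed_by": "678"},
}

decision = is_allowed(principal, "approve", resource)

# Same claim, but the principal is the person who filed it.
filer = {"id": "678", "attributes": {"department": "finance", "roles": ["manager"]}}
own_claim_decision = is_allowed(filer, "approve", resource)
```

Because the function touches nothing outside its arguments, any number of copies can run side by side and always agree with the application's current view of the data.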

Whilst fetching all the state upfront in the application layer before checking permissions may be additional work, in reality the information is usually readily available in the request lifecycle anyway: the principal information typically comes from the authenticated session, and the resource is going to be loaded anyway in order to perform the action should the authorization be approved.
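In a request handler this might look like the sketch below. The handler, the session layout, the in-memory `claims_db`, the `status` attribute, and the `managers_only` check are hypothetical stand-ins; the point is only that both inputs are already in hand by the time the check runs.

```python
from typing import Any, Callable, Dict

Principal = Dict[str, Any]
Resource = Dict[str, Any]
Check = Callable[[Principal, str, Resource], bool]


def approve_claim(
    session: Dict[str, Any],
    claim_id: str,
    load_claim: Callable[[str], Resource],
    check: Check,
) -> int:
    """Handle an approval request and return an HTTP-style status code."""
    principal = session["principal"]  # already known from authentication
    claim = load_claim(claim_id)      # needed to perform the action anyway
    if not check(principal, "approve", claim):
        return 403
    claim["attributes"]["status"] = "approved"
    return 200


# Hypothetical in-memory stand-ins for the database and the session.
claims_db: Dict[str, Resource] = {
    "XYZ": {
        "id": "XYZ",
        "kind": "expense_claim",
        "attributes": {"amount": 1000, "filed_by": "678", "status": "pending"},
    },
}
session = {
    "principal": {
        "id": "123",
        "attributes": {"department": "finance", "roles": ["manager"]},
    },
}


def managers_only(principal: Principal, action: str, resource: Resource) -> bool:
    """Placeholder for the call to the authorization layer."""
    return "manager" in principal["attributes"].get("roles", [])


status = approve_claim(session, "XYZ", claims_db.__getitem__, managers_only)
```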

From the very start, Cerbos was built for stateless authorization, for three reasons. First, it enables fast decisions (fetching state is slow), which is important because authorization is in the critical path of every request. Second, an instance can be run alongside your application and scaled up and down without any extra dependencies. Finally, it brings predictability to application design, as it is the application layer doing the state fetching rather than an opaque box that could be putting an unbounded load on the state-storing services.
