API Security & Identity

The State of OAuth


You might be familiar with OAuth from its use in buttons like "Sign in with Facebook" and "Sign in with Google". But is that really what OAuth is? It turns out the answer is yes and no. Before OAuth existed, it was a very common pattern on the internet that, when an application first launched, it would ask to find your contacts in order to see if your friends were already using the service. To do that, it would ask you for the email address on your account and the password to your email. If we consider it now, we'd never give a random application our email address and password, for security reasons. Ideally, what we're looking for is a way to let the application access the contacts but not everything else in the email account. That idea of delegated access to a limited amount of data in an account was really the origin of OAuth. I like the analogy of checking into a hotel: you go to the person at the front desk, you show them your ID and credit card, they give you back a hotel key, and you take that key to the door, which opens up and lets you in. For this to work, the door does not actually care who you are; it doesn't need to know your name, or even a user identifier. The door just needs to know whether this key card grants access. That key card embodies the idea of delegated access: you don't get access to every room in the hotel, like you would with a skeleton key; you only get access to certain resources when you check in. This makes sense for the third-party app use case. But what about first parties, when the app you're logging into belongs to the same company as the API it's trying to access? It would be weird if, for example, the Twitter mobile app asked for permission to access your Twitter data when you logged in. However, there's a little bit more to it than that.

First-party and third-party app use cases

For the first-party app use case, you might expect to see a native login dialogue in a mobile app: you just type your password into the app to log in. What if you then wanted to add multi-factor auth to that application? You could build this into the app and have the app do whatever it takes to make that work. But what if you then had four different apps that talk to different parts of your system? You'd have to replicate this across all of the applications, duplicating all of the work of adding multi-factor to each one. There's yet another aspect of this on the user-expectation side, which is even harder to solve. Here's an example of a service that actually does this extremely poorly: Apple. In how many different-looking dialogues, in how many different places, are you asked to enter your Apple ID credentials or your Mac system password? It's nearly impossible to describe how to identify the real dialogues, which basically means you never really know if you're being phished and something is trying to steal your Apple password. It's even worse on an iPhone, because the dialogue that pops up can be replicated by an application. How does this relate back to OAuth? It was originally created to solve the third-party app use case of not wanting to give random applications our passwords while still letting them access data. It turned out that the same mechanism could also solve these problems for first-party applications. It's all based around the idea of starting at the application, leaving it for the OAuth server to handle authentication and multi-factor, and then returning to the application, where you're then logged in. For the third-party app use case, it gives us an opportunity to prompt the user and ask whether it's okay for the application to access certain data. In the first-party app use case, it gives us an opportunity to add in stronger multi-factor options, which don't require the app developers to do anything.
The explanation for this is that, as far as the app is concerned, it started the flow by sending the user away, and the user came back already logged in.
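To make that redirect pattern concrete, here is a minimal Python sketch of how an app might build the front-channel URL it sends the user to. The endpoint, client ID, and redirect URI are hypothetical placeholders, not part of any real service.

```python
from urllib.parse import urlencode

# Hypothetical authorization endpoint for illustration only
AUTH_ENDPOINT = "https://auth.example.com/authorize"

def build_authorization_url(client_id: str, redirect_uri: str,
                            scope: str, state: str) -> str:
    """Build the front-channel URL the app redirects the user to.

    The app never sees the user's password: it sends the user away to
    the OAuth server and later receives them back at redirect_uri,
    already logged in.
    """
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,  # random value the app checks when the user returns
    }
    return f"{AUTH_ENDPOINT}?{urlencode(params)}"

url = build_authorization_url(
    "my-app", "https://app.example.com/cb", "contacts.read", "xyz123")
print(url)
```

Note that everything the app wants travels in the URL itself here; later sections show why that front-channel exposure matters.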

How does OAuth work? 

OAuth 2.0 is actually made up of a bunch of different components, starting with the OAuth 2.0 core specification. This specification is the foundation of everything else. It starts out by describing several different grant types, which are different ways an application can get an access token to access data. One of the things the OAuth core spec does not describe is what access tokens actually are or how they work. At the origin of OAuth 2.0, there was a pretty large debate about how this should work. The idea that won out, essentially, was the bearer token: if you have the token, you can use it. The token can be any format; the thing that makes it a bearer token is that whoever is holding it can present it to an API and use it. If somebody steals it, they can use it just as well. This was a pretty controversial topic at the time, because OAuth 1 did not work that way, so this got split out into its own specification describing bearer tokens; we'll come back to that later. What else is in OAuth 2.0 core? One of the other design goals of OAuth 2.0 was to be usable on mobile devices, where OAuth 1 really wasn't. Mobile devices and JavaScript apps were relatively new at the time OAuth 2.0 was being created, and the right tools weren't quite available, so what we ended up with was the implicit flow: a sort of band-aid for using OAuth 2.0 from JavaScript, even if it wasn't necessarily the most secure option. The key problem with the implicit flow is that it sends the access token back to the application in the address bar, in the redirect. That's called the front channel. Any time two parties exchange data via the address bar, instead of making a direct HTTP request, that's the front channel. 
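Because a bearer token grants access to whoever presents it, using one is nothing more than attaching a header to a request. A minimal sketch in Python (the token value is the example from the bearer token spec, RFC 6750):

```python
def bearer_header(access_token: str) -> dict:
    """Build the Authorization header for a bearer token.

    Nothing here proves *who* is sending the request: possession of the
    token is the only requirement, which is exactly what makes it a
    bearer token -- a thief with the same token builds the same header.
    """
    return {"Authorization": f"Bearer {access_token}"}

headers = bearer_header("2YotnFZFEjr1zCsicMWpAA")  # example token from RFC 6750
print(headers)
```

This is also why the later specs care so much about where tokens travel: anyone who observes the token can replay it.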
I like to think of this as putting your message in an envelope and sending it in the mail, hoping it makes it to the destination, when there isn't really any assurance that the letter will be delivered. On the receiving end, you don't actually know whether the message that was delivered came from the real sender. Even with these problems, the implicit flow was used in JavaScript apps pretty commonly; however, it was realised very quickly that it was even worse in mobile apps than in JavaScript. A new solution called PKCE was developed, which addressed the need to use the authorization code flow without a client secret in mobile apps. There are several best-practice documents available on how to use OAuth on mobile devices, in single-page apps, and in browsers. One of the things that's happened since the beginning of OAuth is that older browsers are no longer relevant, and we have things like cross-origin resource sharing, which essentially lets us use the authorization code flow with PKCE from JavaScript, which means the implicit flow doesn't have a use anymore. The Security Best Current Practice is yet another spec being worked on and still in progress. It has recommendations for all kinds of apps using OAuth, and it goes so far as to say not to use the implicit flow, as it has no purpose anymore. It also recommends not using the password flow, because it's a terrible idea for third-party apps. Even for first-party apps, it's very limiting, doesn't give you the ability to add in multi-factor, and still carries a couple of inherent risks. It also recommends using PKCE even for confidential clients that have a client secret, because there's a particular attack it solves even when a client secret is present. This means that even though PKCE was originally created for mobile apps, it's applicable to every kind of application. 
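The PKCE mechanism described above can be sketched in a few lines of Python: the app generates a random code_verifier, sends only its SHA-256 hash (the code_challenge) through the front channel, and later proves possession by revealing the verifier over the back channel when redeeming the authorization code.

```python
import base64
import hashlib
import secrets

def make_pkce_pair() -> tuple[str, str]:
    """Generate a PKCE code_verifier and its S256 code_challenge."""
    # The verifier is a high-entropy secret the app keeps to itself
    verifier = base64.urlsafe_b64encode(
        secrets.token_bytes(32)).rstrip(b"=").decode("ascii")
    # Only the hash travels through the front channel, so an attacker
    # who observes the authorization request cannot redeem the code
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge

verifier, challenge = make_pkce_pair()
```

The app sends `challenge` (with `code_challenge_method=S256`) in the authorization request, and `verifier` in the back-channel token request; the server recomputes the hash and checks that they match.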
The Security BCP also has a bunch of other recommendations, including, for example: don't send access tokens in query strings. 

OAuth 2.1 is meant to be a consolidation, under a single name, of all of the current best practices of OAuth 2.0, removing deprecated features and adding references to extensions that didn't exist when OAuth 2.0 was published.

Along those lines, OAuth 2.1 does not actually define any new behaviour, which means that if you're doing OAuth 2.0 today in the way that is considered best practice, you're already doing OAuth 2.1. Upgrading doesn't require a lot of work, and 2.1 doesn't include anything experimental, in progress, or not widely implemented. 

Recently Published OAuth Extensions

These are recent RFCs that have gone through the whole process and now have an RFC number. We'll start with JSON Web Tokens for access tokens. In OAuth, access tokens can be any format, and in practice a lot of people have a lot of different ideas about how access tokens should work. This is fine, because most people don't need access tokens to be interoperable beyond their own API and their authorization server. However, if you are building a product that is a standalone OAuth server, like any of the open-source ones, or Okta or Auth0, then it is going to be used by a lot of different resource servers built by different people. Those people are using libraries, and that's where it helps to have a standardised format for what access tokens actually look like. The JSON Web Token profile for access tokens describes how to use a JSON Web Token as an access token, mainly for authorization servers to implement. It describes what to put into the claims, what kinds of algorithms to use to sign it, and how that structure should work, and it tells resource servers how to validate these tokens as well. Pushed Authorization Requests is one of my favourites; it has only recently been published, and it flips the OAuth flow on its head. In the traditional authorization code flow, the client starts out by building a URL to the authorization server and then sending that URL through the front channel so that the browser makes the request. Again, any time we're using the front channel, we have to think of it as putting the message in an envelope and sending it in the mail. The client has built a request, including things like what scopes it's trying to access and what redirect URL to use, and all of that is sent in the front channel, which means it can be observed but, more importantly, modified by the user. 
PAR instead starts the flow by sending all that info over the backchannel: the client sends it in a POST request to the OAuth server and gets back an opaque identifier with which to start the flow. That means it can send things that are not visible to the user, because if the app is running on a web server, that's server-to-server communication.
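The shape of a PAR exchange can be sketched as two pieces: the form body the client POSTs server-to-server, and the short front-channel URL built from the opaque `request_uri` the server returns. The endpoints below are hypothetical, and the actual HTTP calls are omitted; this only shows what travels on which channel.

```python
from urllib.parse import urlencode

# Hypothetical endpoints for illustration only
PAR_ENDPOINT = "https://auth.example.com/par"
AUTHORIZE_ENDPOINT = "https://auth.example.com/authorize"

def build_par_body(client_id: str, redirect_uri: str, scope: str) -> str:
    """Form body the client POSTs to the PAR endpoint over the back channel.

    The user's browser never sees these parameters, so they cannot be
    observed or tampered with in the address bar."""
    return urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
    })

def build_front_channel_url(client_id: str, request_uri: str) -> str:
    """After the server replies with an opaque request_uri, the browser
    is redirected carrying only that reference."""
    return f"{AUTHORIZE_ENDPOINT}?{urlencode({'client_id': client_id, 'request_uri': request_uri})}"

body = build_par_body("my-app", "https://app.example.com/cb", "contacts.read")
url = build_front_channel_url("my-app", "urn:ietf:params:oauth:request_uri:abc123")
```

The key property: `scope` and `redirect_uri` appear only in the back-channel body, never in the address bar.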

Along those lines, solving a similar problem, we have the idea of using a JSON Web Token to make that first authorization request. Again, instead of putting a bunch of query string parameters into the URL, the client first creates a JSON Web Token with all of that data in it, packs it up, signs it, and includes it in the request. There are a few different ways the client can send that to the OAuth server: in the URL itself, by reference to that record existing at some other URL, or by combining it with Pushed Authorization Requests and POSTing it to that endpoint. These two things combined are the most secure option. 
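A signed request object is just a JWT whose claims are the authorization parameters. The sketch below builds one by hand with the standard library, using HS256 only to keep it self-contained; real deployments of JWT-secured authorization requests typically sign with an asymmetric key, and the claims and secret here are made-up examples.

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")

def make_request_object(claims: dict, secret: bytes) -> str:
    """Pack authorization parameters into a signed JWT (HS256 here for
    illustration; production usually uses an asymmetric algorithm so the
    server can verify with a public key)."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    signature = b64url(
        hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest())
    return f"{header}.{payload}.{signature}"

request_jwt = make_request_object(
    {"client_id": "my-app", "scope": "contacts.read",
     "redirect_uri": "https://app.example.com/cb"},
    b"demo-shared-secret",  # hypothetical client secret
)
```

Because the parameters are inside a signed token, the server can detect any tampering, even if the token itself travels through the front channel.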

In-Progress Experimental Work

I mentioned at the beginning that bearer tokens are the most common way access tokens exist right now. That just means that anybody who has a token can use it, even if it was stolen. That's where the idea of sender-constrained access tokens comes in: we are looking for some sort of authentication of the client instance in order to use the access token. How that actually ends up working in practice is currently up for debate; there have been a lot of different options and a lot of different attempts at solving this over the years. Here's a handful of specs that have attempted to solve this problem:

New OAuth Extensions in Progress


What about the future stuff? 

Is there an OAuth 3.0? You may have heard some rumblings about it, but it doesn't exist. However, there is a group working on something that used to be called TXAuth, XYZ, or XAuth, and is now called GNAP. It rethinks all the different flows and removes the assumption that clients need to be pre-registered. It is backchannel-first, it reduces reliance on the front channel, treats the client as a first-class citizen, and has proof of possession by default and asymmetric cryptography baked in. It is very much in progress and nowhere near done, and therefore unlikely to be usable in commercial products anytime soon, but there is a lot to learn there. 

Aaron Parecki

Maintainer of OAuth Spec at Okta
Aaron Parecki is the co-founder of IndieWebCamp, a yearly conference on data ownership and online identity. He is the editor of the W3C Webmention and Micropub specifications, and maintains oauth.net. He has spoken at conferences around the world about OAuth, data ownership, quantified self, and even explained why R is a vowel. Aaron has tracked his location at 5 second intervals since 2008, and was the co-founder and CTO of Geoloqi, a location-based software company acquired by Esri in 2012. His work has been featured in Wired, Fast Company and more. He made Inc. Magazine’s 30 Under 30 for his work on Geoloqi. Aaron holds a BS in computer science from University of Oregon and lives in Portland, Oregon.
