You might be familiar with OAuth from its use in buttons like these: sign in with Facebook, sign in with Google, and that kind of thing. But is that really what OAuth is? It turns out the answer is yes and no.
Before OAuth existed, it was a very common pattern on the internet that, when an application first launched, it would ask to find your contacts in order to see if your friends were already using the service. To do that, it would ask you for your email address and the password to your email account. Looking at it now, we'd never give a random application our email address and password, for security reasons. Ideally, what we're looking for is a way to let the application access the contacts and not everything else in the email account. This idea of delegated access to a limited amount of data in an account was really the origin of OAuth.
I like to think of this with the analogy of checking into a hotel. You go to the person at the front desk, you show them your ID and credit card, they give you back a hotel key, and you take that key to the door, which opens and lets you in. For this to work, the door does not care who you are, and it doesn't need to know your name or even a user identifier. The door just needs to know whether this key card grants access. That key card is central to the idea of delegated access: you don't get access to every room in the hotel, as you would with a skeleton key. You only get access to certain resources in the hotel when you check in.
This makes sense for the third-party app use case. But what about first parties, where the app you're logging into belongs to the same service as the API it's trying to access? It would be weird if, for example, the Twitter mobile app asked for permission to access your Twitter data when you logged in. However, there's a little bit more to it than that.
First-party and third-party app use cases
For the first-party app use case, you might expect to see a native login dialogue in a mobile app: you just type your password into the app and you're logged in. What if you then wanted to add multi-factor authentication to that application? You could build it into the app and have the app do whatever it takes to make that work. But what if you then had four different apps that talk to different parts of your system? You'd have to duplicate all of that work, adding multi-factor authentication to each of the apps.
There's another aspect of this on the user expectation side, which is even harder to solve. Here's an example of a service that does this extremely poorly: Apple. In how many different-looking dialogues, in how many different places, are you asked to enter your Apple ID as well as your Mac system password? It's nearly impossible to describe how to identify the real dialogues, which basically means you never really know whether you're being phished and something is trying to steal your Apple password. It's even worse on an iPhone, because the dialogue that pops up can be replicated by an application.
How does this relate back to OAuth? It was originally created to solve the third-party app use case: not wanting to give random applications our passwords but still wanting to let them access data. It turned out that the same mechanism could also solve these problems with first-party applications. It's all based around the idea of starting at the application, leaving it for the OAuth server, which handles the authentication and multi-factor steps, and then returning to the application, where you're now logged in. For the third-party app use case, this gives us an opportunity to prompt the user and ask whether it's okay for the application to access certain data. For the first-party app use case, it gives us an opportunity to add stronger multi-factor options, which don't require the app developers to do anything.
This works because, as far as the app is concerned, it simply started the flow by sending the user away from the app, and the user then came back already logged in.
How does OAuth work?
OAuth 2.1 is meant to be a consolidation, under a single name, of all of the current best practices of OAuth 2.0, removing deprecated features and adding references to extensions that didn't exist when OAuth 2.0 was published.
Along those lines, OAuth 2.1 does not define any new behaviour, which means that if you're doing OAuth 2.0 today in the way that is considered best practice, you're already doing OAuth 2.1. Upgrading doesn't require a lot of work and doesn't include anything experimental, in progress, or not widely implemented.
Recently Published OAuth Extensions
These are recent RFCs that have gone through the whole process and now have an RFC number.
We'll start with JSON Web Tokens for access tokens. In OAuth, access tokens can be in any format, and in practice people have a lot of different ideas about how access tokens should work. This is fine because most people don't need access tokens to be interoperable, other than between their own API and their authorization server. However, if you are building a product that is a standalone OAuth server, like any of the open-source ones, or Okta or Auth0, then it is going to be used by a lot of different resource servers built by different people. Those people are using libraries, and that's where it helps to have a standardised format for what access tokens actually look like. The JSON Web Token profile for access tokens describes how to use a JSON Web Token as an access token, and is mainly for authorization servers to implement. It describes which claims to include, which algorithms to use to sign the token, and how the structure should work, and it tells resource servers how to validate these tokens as well.
Pushed Authorization Requests (PAR) is one of my favourites. It has only recently been published, and it flips the OAuth flow on its head. In the traditional authorization code flow, the client starts out by building a URL to the authorization server and then sending that URL through the front channel so that the browser makes the request. Again, any time we're using the front channel, we have to think of it as putting the message in an envelope and sending it in the mail. The client has built a request that includes things like which scopes it's trying to access and which redirect URL to use. All of that is sent in the front channel, which means it can be observed and, more importantly, modified by the user.
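To make the front-channel concern concrete, here is a minimal sketch of how a client might build a traditional authorization code request. The endpoint, client ID, redirect URL, and scope name are all made-up placeholders, not values from any real service, and the sketch leaves out anything beyond the basic parameters.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlsplit


def build_authorization_url(authorize_endpoint, client_id, redirect_uri, scopes, state):
    """Build the URL that the browser is sent to in the front channel."""
    params = {
        "response_type": "code",      # ask for an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),    # scopes are space-separated
        "state": state,               # random value the client checks on return
    }
    return f"{authorize_endpoint}?{urlencode(params)}"


# All values below are hypothetical placeholders.
state = secrets.token_urlsafe(16)
url = build_authorization_url(
    "https://auth.example.com/authorize",
    "example-client",
    "https://app.example.com/callback",
    ["contacts.read"],
    state,
)

# Every parameter is plain text in the URL, so whoever controls the browser
# can read it and can also change it before the request is made.
tampered_url = url.replace("contacts.read", "contacts.write")

original_scope = parse_qs(urlsplit(url).query)["scope"]
tampered_scope = parse_qs(urlsplit(tampered_url).query)["scope"]
```

Nothing in the URL itself lets the authorization server tell the original request from the tampered one, which is the problem the extensions in this section try to address.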
PAR instead starts the flow by sending all of that information in the back channel: the client sends a POST request to the OAuth server and gets back an opaque identifier, which it then uses to start the flow. That means the client can send things that are not visible to the user because, if the app is running on a web server, that's server-to-server communication.
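The two steps of a pushed authorization request can be sketched roughly as follows. This only constructs the requests and does not send anything over the network. The endpoints, client credentials, and the sample response body are hypothetical, and authenticating with a client secret in the POST body is just one option assumed here for brevity.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_par_request(par_endpoint, client_id, client_secret, redirect_uri, scopes, state):
    """Step 1: a back-channel POST carrying the full authorization request."""
    body = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "client_secret": client_secret,  # assumed client authentication method
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),
        "state": state,
    }).encode("ascii")
    return Request(
        par_endpoint,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def authorization_url_from_par_response(authorize_endpoint, client_id, response_body):
    """Step 2: the front-channel URL carries only the opaque identifier."""
    request_uri = json.loads(response_body)["request_uri"]
    query = urlencode({"client_id": client_id, "request_uri": request_uri})
    return f"{authorize_endpoint}?{query}"


# Hypothetical values; a real client would send par_request and read the reply.
par_request = build_par_request(
    "https://auth.example.com/par",
    "example-client",
    "example-secret",
    "https://app.example.com/callback",
    ["contacts.read"],
    "example-state",
)

# A made-up response body in the shape the server is expected to return.
sample_response = '{"request_uri": "urn:ietf:params:oauth:request_uri:example123", "expires_in": 60}'
browser_url = authorization_url_from_par_response(
    "https://auth.example.com/authorize", "example-client", sample_response
)
```

The scopes and redirect URL never appear in `browser_url`, so the user has nothing to observe or modify there except the opaque reference.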
Along those lines, and solving a similar problem, we have the idea of using a JSON Web Token to make that first authorization request. Instead of putting a bunch of query string parameters into the URL, the client first creates a JSON Web Token with all of that data in it, signs it, and includes that in the URL. There are a few different ways the client can send it to the OAuth server: by value in the URL, by reference to a copy hosted at some other URL, or by combining it with Pushed Authorization Requests and posting it to that endpoint. Combining the two is the most secure option.
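As a rough illustration of a signed request object, the sketch below packs the authorization parameters into a JSON Web Token and passes it by value in the URL. To stay within the standard library it signs with HS256 and a shared secret, which is an assumption made for this example. A real deployment would more likely use a JWT library and an asymmetric key. All names and values are placeholders.

```python
import base64
import hashlib
import hmac
import json
from urllib.parse import urlencode


def b64url_encode(data: bytes) -> str:
    """Base64url without padding, as used in JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_request_object(claims: dict, key: bytes) -> str:
    """Serialise header and claims, then append an HMAC-SHA256 signature."""
    header = {"alg": "HS256", "typ": "oauth-authz-req+jwt"}
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url_encode(signature)}"


def verify_request_object(token: str, key: bytes) -> bool:
    """Recompute the signature over header.payload and compare."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(key, signing_input.encode("ascii"), hashlib.sha256).digest()
    return hmac.compare_digest(expected, b64url_decode(signature))


# Hypothetical key and parameters.
shared_key = b"example-shared-secret"
claims = {
    "iss": "example-client",
    "aud": "https://auth.example.com",
    "response_type": "code",
    "client_id": "example-client",
    "redirect_uri": "https://app.example.com/callback",
    "scope": "contacts.read",
    "state": "example-state",
}
request_object = sign_request_object(claims, shared_key)

# Passed by value: the signed token travels in the URL.
browser_url = "https://auth.example.com/authorize?" + urlencode(
    {"client_id": "example-client", "request": request_object}
)

# Swapping in a different payload while keeping the old signature is detectable.
header_part, _, signature_part = request_object.split(".")
forged_claims = dict(claims, scope="contacts.write")
forged_payload = b64url_encode(json.dumps(forged_claims, separators=(",", ":")).encode("utf-8"))
forged_object = f"{header_part}.{forged_payload}.{signature_part}"
```

The request is still visible in the front channel, but a server holding the key can detect a change to the scopes or redirect URL because the signature no longer matches.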
In-Progress experimental work
I mentioned at the beginning that bearer tokens are the most common way access tokens exist right now. That just means that anybody who has a token can use it, even if it was stolen. That's where the idea of sender-constrained access tokens comes from: we want some sort of authentication of the client instance before the access token can be used. How that ends up working in practice is currently up for debate. There are a lot of different options, and there have been a lot of different attempts at solving this over the years. Here's a handful of specs that have attempted to solve this problem:
New OAuth extensions in progress
What about the future stuff?
Is there an OAuth 3.0? You may have heard some rumblings about it, but it doesn't exist. However, there is a group working on something that used to be called TXAuth/XYZ or XAuth and is now called GNAP. That's an emulation of all the different flows of clients needing to be pre-registered. It is back-channel first, reduces reliance on the front channel, treats the client as a first-class citizen, and has proof of possession by default, with asymmetric cryptography baked in. It is very much in progress and nowhere near done, so it is unlikely to be usable in commercial products anytime soon, but there is a lot to learn there.