I don't have a tremendous amount of experience building RESTful APIs, so it's not always clear to me which HTTP status code in the 4xx block I should use when refusing to fulfill an incoming resource request. One tricky scenario that I've had to code against recently is a request for a properly formed, valid resource that the authenticated user doesn't have permission to view.
Imagine that we have two users in our system: Sarah, with ID 4, and Tricia, with ID 37. Now, imagine that Sarah makes an authenticated request to view Tricia's profile resource:
GET /users/37/profile HTTP/1.1
Authorization: Basic YmVuK2F206dGVzdA==
Here, Sarah is using Basic authentication to identify herself as Sarah; however, she's making a request for another user's profile (Tricia's). For the sake of argument, let's say that in this API, a user can only view his or her own profile. What HTTP status code should I return?
The three status codes that felt the most appropriate are:
- 401 - Unauthorized
- 403 - Forbidden
- 404 - Not Found
In my mind, the use of each of these three HTTP status codes could be justified. Sarah is not authorized to view Tricia's profile (401); Sarah is forbidden from viewing someone else's profile (403); and, Sarah simply cannot see resources that she's not allowed to view (404).
The initial problem that I had with using either of the HTTP status codes, 401 or 403, was that I felt like it was exposing secure information. Both of those responses sort of say, "Yeah, that resource exists, but you can't see it." My problem with this is that it confirms that those resources exist.
When you ask a Doctor if he treats a particular patient (at least in Law & Order - wicked awesome show!), he will often say something to the effect of, "Officer, you know I can neither confirm nor deny having a patient as it would be a breach of doctor-patient confidentiality." This is how I feel about 401 and 403 in this particular type of resource request - I don't want to confirm or deny its existence.
Then, one day, when I was reading over the description of the 403 Forbidden HTTP status code, something clicked. At the end of the description, it states:
The server understood the request, but is refusing to fulfill it. Authorization will not help and the request SHOULD NOT be repeated. If the request method was not HEAD and the server wishes to make public why the request has not been fulfilled, it SHOULD describe the reason for the refusal in the entity. If the server does not wish to make this information available to the client, the status code 404 (Not Found) can be used instead.
That last line seemed to solve my problem - if I don't want to expose the information, I should return a 404 Not Found instead. This definitely makes me feel the most comfortable with the response.
Now, clearly, I am not advocating the use of 404 Not Found instead of 401 Unauthorized or 403 Forbidden in all cases; these other HTTP status codes make much more sense in other contexts. I am simply saying that in the specific use case in which an authenticated user is making a request for a valid resource that they don't have permission to view, 404 Not Found feels like the most secure solution.
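To make that rule concrete, here is a minimal Python sketch. The function name `profile_status`, the `KNOWN_USER_IDS` set, and the idea of passing the authenticated user's ID as an optional integer are all assumptions made for illustration; they are not part of any particular framework.

```python
from http import HTTPStatus
from typing import Optional

# Hypothetical user store: Sarah is 4 and Tricia is 37, as in the example above.
KNOWN_USER_IDS = {4, 37}


def profile_status(authenticated_user_id: Optional[int], requested_user_id: int) -> HTTPStatus:
    """Pick a status code for GET /users/{requested_user_id}/profile."""
    if authenticated_user_id is None:
        # No valid credentials were supplied, so ask the client to authenticate.
        return HTTPStatus.UNAUTHORIZED
    if requested_user_id not in KNOWN_USER_IDS:
        # The profile really does not exist.
        return HTTPStatus.NOT_FOUND
    if requested_user_id != authenticated_user_id:
        # The profile exists, but we neither confirm nor deny that.
        return HTTPStatus.NOT_FOUND
    return HTTPStatus.OK
```

With this sketch, Sarah (ID 4) asking for Tricia's profile (ID 37) gets the same 404 she would get for an ID that has never been registered, while her own profile returns 200.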
First off, that's the third time in three days I have heard or read the phrase "wicked awesome", lol. I would agree that taking advantage of different status codes to update users about the transfer of information from the server to the browser is key to a successful application. Are you not a REST fan?
No no - I'm definitely a REST fan. I'm currently working on building a RESTful API, which is why I've been doing a lot of thinking about this. I just want to make sure I am on the right track :)
The phrase "wicked awesome" is hella sweet!
I would definitely handle each error code differently: for a forbidden, a redirect may be called for; for a 404, display a search form; and for an unauthorized, maybe a login form. When I read, "The server understood the request, but is refusing to fulfill it.", I had a flood of inspiration to have a fake popup with famous In Living Color character quotes come up for the user. Just the idea of a server having a personality is hilarious. For 401 have "If you do that again, I will rock your world", for a 404 "Let me show you something...".
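As a rough sketch of that idea, a client could map each status code to a different UI response. The action names below are hypothetical placeholders, not real functions from any library.

```python
# Hypothetical UI actions, keyed by the status code the API returned.
ACTIONS = {
    401: "show_login_form",
    403: "redirect",
    404: "show_search_form",
}


def action_for(status_code: int) -> str:
    """Return the name of the UI action for an error status code."""
    # Anything not listed falls back to a generic error page.
    return ACTIONS.get(status_code, "show_generic_error")
```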
I recently implemented a RESTful version of our API (which uses MAC Access Authentication). I found I used RFC 2616 quite a bit in helping me decide what status codes should be returned. :)
As for whether or not 403 or 404 is the correct response, it does really boil down to how much you want to reveal to the user.
If the URI is a known resource (such as published in documentation) then I'd say the 403 is a more appropriate response, since it provides more information as to why it's not working. If you've already published the possible URLs, people already know what should work, so you're not trying to hide anything.
However, if the resources are basically unpublished, then a 404 certainly can make sense.
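One way to picture this policy is a small lookup that picks the refusal code based on whether the URI is already published. This is only a sketch: the `DOCUMENTED_TEMPLATES` set and the `refusal_status` name are made up for illustration.

```python
# Hypothetical set of URI templates that appear in the public API documentation.
DOCUMENTED_TEMPLATES = {"/users/{id}/profile"}


def refusal_status(uri_template: str) -> int:
    """Choose the status code for a request that fails the permission check."""
    if uri_template in DOCUMENTED_TEMPLATES:
        # The URI is already public knowledge, so be explicit about the refusal.
        return 403
    # The URI is unpublished, so do not confirm that anything lives there.
    return 404
```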
How do you handle things in your applications when you're securing data? I'd say the policy should be pretty similar. If you display a message to a user when they try to access information they don't have access to, then I'd say the RESTful API should respond accordingly.
I've just been doing a lot of support for our RESTful services lately, and I can tell you that the more your status codes and text reveal to developers using an API, the easier it is to help them troubleshoot.
Just my take...
Wow, what a great thought, Ben. I never thought of it that way, but it makes total sense. It's a good approach to take for a public website for security reasons. I DO think that 401 or 403 should be used traditionally on internal applications where the user may or may not know their access rights. Returning a 401 or 403 could help them figure out that they need to contact the administrator to get certain rights applied to their user account.
I think the difference in response depends on the usage. If it's an API and your users have been pre-qualified and already authenticated, then the 403 or 401 is definitely justified. If it's a free-for-all then a blanket 404 is perfectly acceptable for the reasons already cited.
Programmatic access should be handled differently than manual / open access. Or maybe it's that error handling is different at different security layers.
Just my two cents.
I just had a good laugh thinking about In Living Color :) That's a classic show!
I had never thought about it from a debugging standpoint before. I could definitely see that a 403 may be easier to debug than a 404 since it does lend a bit more insight. I'll have to do some more thinking on that.
The one wrinkle that I have been coming up against with this is that a user may know "part" of a resource, but not have access to all commands on that resource. So, for example, they may be able to execute:
... but, NOT be able to execute:
... due to permissions (perhaps they have to *own* an item in order to update it). It does seem a little silly to return a 200 OK for the former and a 404 Not Found for the latter, as clearly the resource exists - you just can't perform that action on it.
That use case just doesn't sit right.
Thanks! I'm glad you found this interesting. RESTful architecture is really fascinating stuff (to me). I'm slowly trying to get the hang of it.
I think it gets interesting when a user can see part of a resource, like:
... but can't see a command on that resource like:
If a user doesn't have permission to "rename" the resource (for example), then I would return a 404 (given my explanation above). However, part of me wants to return a 401 since it's more of a security issue than an existence issue.
HOWEVER! What I have to remember is that the above two examples truly are *two* different resources. It's so easy for me to get lost in the idea that I am - behind the scenes - translating the Resource URI into an Event and a set of variable bindings; however, I have to remember that these are actually *two* different URIs - the translation behind the scenes is just coincidental.
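Treating each URI as its own resource can be sketched as a table of permission rules keyed by method and path. Everything here is assumed for illustration: the paths, the choice of POST for the rename command, and the "must own it" rule.

```python
from typing import Callable, Dict, Tuple

# (requesting_user, owner) -> is the request allowed?
Rule = Callable[[str, str], bool]

# Each (method, URI) pair is its own resource with its own rule.
# Paths, methods, and rules are hypothetical.
RULES: Dict[Tuple[str, str], Rule] = {
    ("GET", "/path/to/some/resource"): lambda user, owner: True,
    ("POST", "/path/to/some/resource/rename"): lambda user, owner: user == owner,
}


def status_for(method: str, uri: str, user: str, owner: str) -> int:
    """Return 200 if the rule for this exact URI allows the user, else 404."""
    rule = RULES.get((method, uri))
    if rule is None or not rule(user, owner):
        # Unknown URIs and URIs the user may not use look identical.
        return 404
    return 200
```

Under these assumptions, a non-owner gets 200 for the first URI and 404 for the second, and the two answers don't contradict each other because they describe two different resources.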
Hi Ben! Why not return 401 in both cases, whether the user exists or not? That way no clue would be leaked to the hacker, either. 401 simply says you don't have the right to do what you want (i.e. because you're not the person to whom this resource belongs). Whereas 404 says we just don't have what you want, as it doesn't exist yet. So, whether Sarah tries /users/37/profile or the truly nonexistent /users/007/profile, she's equally refused since she's neither Tricia nor James Bond, who's probably yet to register ;)
I see what you're saying there. But, if you always return a 401, it seems that the outcome is similar to always returning a 404. In either case, you're being consistent in returning a status code that doesn't reveal too much information. I think it's six of one, half-a-dozen of the other.
Sure, so why did you choose 404 over 401? Is there any particular reason? I just think that 401 makes more sense and is more appropriate in this situation, isn't it? As an example, when you log in to a web site and accidentally enter the wrong credentials, most sites will notify you that you provided either a wrong login name or a wrong password, which is just like the 401 error.
I don't feel particularly strongly one way or the other. But, using your credentials example, if you enter an incorrect username and password, *most* systems will not tell you:
"Your password is incorrect."
Rather, they will tell you:
"Your username and/or password is incorrect."
The reasoning is that the former gives away too much information. That is, it says, "Yes, that username exists, but that password is incorrect." So, rather than revealing that, they simply tell you that the combination doesn't exist.
It's not 100% applicable to what I'm saying; but, my gut feeling is just that I would rather err on a 404 than a 401 since it reveals less about the underlying data.
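The login analogy can be sketched in a few lines of Python. The `USERS` dictionary, the plain-text password, and the success message are assumptions made for brevity; a real system would store salted password hashes.

```python
import hmac

# Hypothetical in-memory credential store (plain text only for illustration).
USERS = {"sarah": "s3cret"}

GENERIC_ERROR = "Your username and/or password is incorrect."


def login_message(username: str, password: str) -> str:
    """Return the same message whether the username or the password was wrong."""
    stored = USERS.get(username)
    # Use an empty placeholder for unknown users so the comparison runs either way.
    expected = stored if stored is not None else ""
    matches = hmac.compare_digest(expected.encode(), password.encode())
    if stored is not None and matches:
        return "Welcome back."
    return GENERIC_ERROR
```

A wrong password for a real user and any password for an unknown user produce the identical message, so the response doesn't confirm which usernames exist.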
I probably was not very clear, but "Your username and/or password is incorrect" is what I meant. What does it tell one about the underlying data? No more than that the credentials one has provided are incorrect. Isn't that the case when Sarah is trying to access Tricia's profile? Sarah's own credentials are incorrect for Tricia's account. So the system tells the truth. The truth is not that the credentials are incorrect for Tricia's account. The truth is that the credentials are just incorrect. That's a big difference. Of course, the system may play dumb and just say "I have none!" (404). That would work too, but to me it looks more like security through obscurity.
Sorry if I make too much noise about this topic. It may not be a big deal at all. The more important thing is to be consistent in what the system returns, as you said.
"Sorry if I make too much noise about this topic. It may not be a big deal at all."
No worries :D I think it's a good conversation - after all, this is stuff that I am relatively new at, so I am definitely open to thinking about it more deeply.
Hi Ben. The URL you used in one of your comments contains a verb (/path/to/some/resource/rename). I thought we were supposed to leave verbs out of URLs?
I authenticate when I give the system a valid username and password. But I become authorized once the system has verified that I have access to a particular resource. The problem is that HTTP is ambiguous about this by calling 401 "Unauthorized". I think the accepted meaning is "not authenticated" instead.
Even if you wanted to honor HTTP to the letter, you are left with a problem. If you become inactive, then unless you get a different answer for each case, your SAP has no way of knowing whether it needs to authenticate the user again or whether there is indeed no access to that resource. And I do believe that is a common use case.
I suppose you could reply with a 404 instead of a 403. By the time you reply with a 404, the user has already logged on, so they have a valid user account. Of course, that is not a guarantee.
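The distinction drawn above, between needing to log in again and simply lacking access, might look like this on the server side. The function name, the `allowed_users` set, and the `hide_existence` flag are hypothetical.

```python
from typing import Optional, Set


def access_status(session_user: Optional[str], allowed_users: Set[str],
                  hide_existence: bool = False) -> int:
    """Separate 'please log in again' from 'you may not have this'."""
    if session_user is None:
        # Session expired or never existed: the client should re-authenticate.
        return 401
    if session_user not in allowed_users:
        # Logged in, but logging in again will not help.
        # Optionally answer 404 instead of 403 to avoid confirming the resource.
        return 404 if hide_existence else 403
    return 200
```

Because 401 is reserved for the missing-session case, a client can tell a timed-out session apart from a refusal, whichever of 403 or 404 the server uses for the latter.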