The Inevitability of Bugs and the Need for Error Codes
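Bugs, errors and unexpected features seem to be an inevitable part of modern software. They have been there from the beginning. Engineers were already using the word “bug” for mechanical faults in the 19th century, and it entered computing folklore in 1947, in the pioneering days of computer programming, when the team working on the Harvard Mark II, which included Grace Hopper, logged a moth found in a relay as the “first actual case of bug being found”.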
Today there are many ways to handle errors, log them and help users find solutions. An important part of this is the use of error codes. I overlooked them when I started out, but quickly learnt that they are essential for diagnosing and correlating user-reported issues.
When an error occurs, the user is ideally presented with a message. For an expected error, the message can set out the next steps. For an unexpected error, the user needs to be shown something else. It is not good practice to surface the underlying error: at best it contains confusing technical detail that means little to the end user, and at worst it can expose sensitive information. A generic message is therefore often used, such as “An unexpected error has occurred, please try again”. Faced with this, users may turn to the internet, send a support email or share screenshots in search of answers. None of this helps much when the generic message is the same no matter which error has happened.
Adding error codes to your messages can resolve this. Error codes are numeric or alphanumeric strings added to the error message to help the user find solutions or report issues. They can be used for both expected errors and unexpected exceptions. They allow developers to uniquely identify and correlate issues, because the same error should always produce the same error code.
Errors can be broken down into expected and unexpected errors. There are a number of situations that may be unexpected to the end user but can actually be predicted in advance, such as running out of disk space or losing internet connectivity. In some ways, these are the best type of errors, as you can provide potential solutions and static error codes.
In these cases, an error code is useful when the message and its proposed solution do not resolve the problem. Error codes allow people to communicate about and search for a specific problem without having to quote the full message. A knowledge base of error codes can be built up before the software is published, and it can detail alternative or longer solutions that are updated independently of the software. Error codes are particularly useful when the error message and interface are localised, as the error code should remain the same in every language.
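As a rough illustration of static codes for expected errors, the sketch below keeps one fixed code per known error and looks up the message text by locale. The names KnownError, DISK_FULL, OFFLINE and render, the code format and the message wording are all assumptions made for this example, not part of any particular library.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class KnownError:
    """An expected error with a fixed code and per-locale message text."""
    code: str                    # stable identifier, never translated
    messages: Mapping[str, str]  # locale -> user-facing message


# Hypothetical catalogue entries; codes and wording are illustrative only.
DISK_FULL = KnownError(
    "DSK-001",
    {
        "en": "Your disk is full. Free up some space and try again.",
        "fr": "Votre disque est plein. Libérez de l'espace et réessayez.",
    },
)
OFFLINE = KnownError(
    "NET-001",
    {"en": "You appear to be offline. Check your connection and try again."},
)


def render(error: KnownError, locale: str) -> str:
    """Return the localised message followed by the locale-independent code."""
    # Fall back to English when no translation exists for the locale.
    text = error.messages.get(locale, error.messages["en"])
    return f"{text} ({error.code})"
```

A French user and an English-speaking support engineer would see different text from render(DISK_FULL, "fr") and render(DISK_FULL, "en"), but both can quote or search for the same DSK-001.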
Unexpected errors arise from mistakes in our own code or in a third-party library. We can try to reduce them through testing, guard statements and anticipating scenarios that may happen, but in a complex system things may still slip through. Even when we are confident that the code is error-free, external changes can cause new errors to occur. For desktop or mobile apps, these can include OS upgrades or other software running on the device. A web app runs in an ever-changing environment of evergreen browsers and plugins that can interfere with your application.
Unlike expected errors, unexpected errors need their codes generated at run-time inside a global error handler. There are a number of techniques for this, but whichever you choose, you need to ensure that the codes are short enough to write down or quote, and that they are unique to the issue but not tied to the user.
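One possible technique, sketched here under stated assumptions, is to hash the exception type together with the file and function names from the stack trace and keep the first few characters of the digest. The function name error_code_for, the E- prefix and the six-character length are choices made for this example. The exception message is deliberately left out of the hash because it may contain user-specific data, and line numbers are left out so that unrelated edits to a file do not change the code.

```python
import hashlib
import os
import traceback


def error_code_for(exc: BaseException, length: int = 6) -> str:
    """Derive a short code from the kind of error and where it was raised."""
    frames = traceback.extract_tb(exc.__traceback__)
    # Only the exception type and the file/function names are hashed.
    # The message and local values are excluded: they may differ per user.
    parts = [type(exc).__name__]
    parts += [f"{os.path.basename(f.filename)}:{f.name}" for f in frames]
    digest = hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
    return "E-" + digest[:length].upper()


def parse_age(text: str) -> int:
    """Toy function that raises ValueError for non-numeric input."""
    return int(text)


def code_for_input(text: str) -> str:
    """Return the error code for a failing input, or "" if parsing succeeds."""
    try:
        parse_age(text)
    except ValueError as exc:
        return error_code_for(exc)
    return ""
```

Two users who hit the same failure with different input, for example code_for_input("abc") and code_for_input("xyz"), get the same code. A truncated hash can collide in principle, so a longer length trades some brevity for a lower chance of two issues sharing a code.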
When logging the exception and stack trace, it is useful to tag the log entry with the same error code shown to the user. If a user sends a screenshot of the error code, you can then find the full stack trace and identify the error quickly.
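A minimal sketch of this tagging, using Python's standard logging module, might look like the following. The function report_unexpected, the logger name and the sample code E-3FA2C1 are hypothetical, and the code is assumed to have been produced already by whatever scheme you use.

```python
import io
import logging

GENERIC_MESSAGE = "An unexpected error has occurred, please try again"


def report_unexpected(exc: BaseException, code: str,
                      logger: logging.Logger) -> str:
    """Log the stack trace tagged with `code`; return the user-facing text."""
    # exc_info attaches the traceback to the log record, so the technical
    # detail stays in the log and is never shown to the user.
    logger.error("Unhandled error [%s]", code, exc_info=exc)
    return f"{GENERIC_MESSAGE} (code {code})"


# Usage: log to an in-memory stream so the example is self-contained.
log_output = io.StringIO()
logger = logging.getLogger("error_code_demo")
logger.addHandler(logging.StreamHandler(log_output))
logger.propagate = False

try:
    {}["missing"]  # stand-in for a real bug
except KeyError as exc:
    user_message = report_unexpected(exc, "E-3FA2C1", logger)
```

The user sees only the generic message and the code, while searching the log for the same code leads straight to the full traceback.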
Once you have resolved an issue, you can report back that the bug with the provided error code was fixed in a specific version of the software.
No matter how many edge cases we guard against, a bug is highly likely to make it into our code. Changes to the environment where our code runs, often outside our control, can also cause unexpected behaviour. Through effective error messages and error code generation, we can talk about these errors, correlate them and communicate clearly with our end users to resolve faults.