Top 10 error types by number of sites throwing the error at least once
It turns out that the distribution of errors thrown on the web is highly Zipfian: A few error types make up most of the errors thrown. ReferenceError, TypeError and SyntaxError together account for 85% of all unhandled errors. As the web developer Tolstoy put it: Working websites are all different; all broken websites break in the same way.
Of course, there are many ways to produce these error types. The specific string in the error message tells us more about what actually happened. Looking at the most common error messages gives a certain sense of familiarity. As a web developer, you've likely encountered some of these before.
Top 10 error messages by number of sites throwing the error at least once
We used these statistics on the specific error messages, and then went ahead and debugged random samples of these errors to get a qualitative understanding of what went wrong in each case. This yielded some surprising findings. It turns out that for both ReferenceError and SyntaxError, there is a single common root cause that produces most of them: Failures in resource loading. For TypeErrors, there's a similar finding that most of them essentially come from the same kind of problem. Our deep dive resulted in the following articles describing our findings for each error:
12% of sites in our sample had one or more unhandled errors. This is really a stunning number. Each of these errors indicates that some line of execution was aborted due to an unexpected situation, and likely that some functionality is broken as a result. The number is also a testament to the error resiliency of the web: Whatever problems these errors indicate, they must be small enough that no one has bothered fixing them.
The data shows that most of the errors come from missing code, data or document elements at runtime. In some sense these errors are made possible by the late-binding nature of the web: Types are determined at runtime (late), as opposed to at compile time (early). Determining types at runtime makes loading libraries at runtime easy and natural. It also makes an entire class of errors possible: Errors coming from missing libraries and changing API surfaces. Of course, late binding is not the only choice: Many languages require types to be known at compile time. Had we collectively decided to build the web using Java Applets, the error landscape would have been quite different. How would it have looked?
In a language where the type system gives strict guarantees about the shape of types, loading libraries dynamically at runtime becomes harder to do, especially if these libraries are allowed to evolve their API surface. This applies not only to code linked from the network, but also to the browser runtime. Looking back to the time of Java Applets, if you didn't have the right Java runtime installed, the applet would refuse to run until you had downloaded and installed the appropriate JRE. On the web, you can view a page with an old browser, and perhaps expect a progressive breakdown correlated with the age difference between the browser and the site. It is certainly possible to write a web page that works correctly in both current and ancient browsers. In this view, late binding may be a crucial building block of an evolving web.
Alan Kay miming an ecological, distributed system without tightly interlocked interfaces (source).
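To make the late-binding argument concrete, here is a minimal sketch of how a page can probe its runtime before relying on it. The helper names (`resolveGlobal`, `withGlobalFunction`, `cloneData`) are hypothetical and not taken from the data set; `structuredClone` is used only as an example of a global that newer runtimes provide and older ones lack.

```typescript
// Hypothetical helper: resolve a late-bound global by name.
// Reading a missing property yields undefined, whereas referring to a
// missing global by bare identifier would throw a ReferenceError.
function resolveGlobal(name: string): unknown {
  return Reflect.get(globalThis, name);
}

// Use the named global if it exists and is a function;
// otherwise take a degraded fallback path.
function withGlobalFunction<T>(
  name: string,
  withFeature: (fn: (...args: unknown[]) => unknown) => T,
  fallback: () => T,
): T {
  const candidate = resolveGlobal(name);
  if (typeof candidate === "function") {
    return withFeature(candidate as (...args: unknown[]) => unknown);
  }
  return fallback();
}

// Deep-copy a JSON-compatible value. Newer runtimes use structuredClone;
// older ones fall back to a JSON round trip (assumed good enough here).
function cloneData<T>(value: T): T {
  return withGlobalFunction(
    "structuredClone",
    (clone) => clone(value) as T,
    () => JSON.parse(JSON.stringify(value)) as T,
  );
}
```

The same pattern applies to a library that may have failed to load over the network: the check happens at the point of use, and the page degrades instead of aborting with an unhandled error.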
Clearly, the last word has not been said on this. The guarantees given by static typing allow the compiler to prove the absence of a certain class of errors, and this is something many programmers will not give up happily. TypeScript is in an interesting place here, straddling the worlds of dynamic and static typing. This does come at a cost: What the compiler believes the types to be at compile time may not be what they actually are at runtime, but that may be a fine trade-off.
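The gap between compile-time belief and runtime reality can be illustrated with a small sketch. The `User` interface and the function names are made up for this example; the point is only that a type assertion on parsed JSON is accepted by the compiler without any runtime check.

```typescript
interface User {
  name: string;
}

// The assertion tells the compiler what to believe.
// Nothing verifies the shape of the parsed data at runtime.
function parseUserUnchecked(json: string): User {
  return JSON.parse(json) as User;
}

// Type-checks cleanly, because `user.name` is declared as a string.
function shout(user: User): string {
  return user.name.toUpperCase();
}

// Returns the shouted name, or the name of the error that was thrown.
function tryShout(json: string): string {
  try {
    return shout(parseUserUnchecked(json));
  } catch (err) {
    return err instanceof Error ? err.name : "unknown";
  }
}
```

With well-formed input such as `'{"name": "ada"}'`, `tryShout` returns the upper-cased name. With `'{"id": 7}'`, the code still compiles, but `user.name` is `undefined` at runtime and the call to `toUpperCase` throws a TypeError.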
Allowing for runtime dynamicity while retaining some of the safety provided by statically typed languages may be the key to a less error-prone web. As the data shows, when the web breaks, it breaks because of untrue assumptions in the code, causing errors at runtime: The document is not as expected, types are not as expected, or libraries and data have failed to load over the network. A fruitful avenue of programming language research may be type systems that bake in these assumptions, allowing the compiler to prove that the assumptions have been checked before they are relied on. If this can be achieved in an ergonomic way, it may allow for code that is capable of operating in a dynamic execution environment, while eliminating most of the errors that plague the web today.
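Today's TypeScript can approximate part of this idea, as the following sketch suggests. It assumes strict null checks are enabled, and the `Config` shape and function names are hypothetical. Untrusted data enters as `unknown`, and the compiler rejects any property access until a check has narrowed the type. Note that the compiler trusts the type guard rather than proving it correct, so this falls short of the guarantee described above.

```typescript
interface Config {
  endpoint: string;
}

// User-defined type guard: the runtime check that backs the static type.
// The compiler trusts this function; it does not verify its body.
function isConfig(value: unknown): value is Config {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as { endpoint?: unknown }).endpoint === "string"
  );
}

// Data from the network is treated as `unknown` until it has been checked.
function loadConfig(json: string): Config | undefined {
  let parsed: unknown;
  try {
    parsed = JSON.parse(json);
  } catch {
    return undefined; // malformed JSON: the data did not arrive as expected
  }
  return isConfig(parsed) ? parsed : undefined;
}

function endpointOrDefault(json: string): string {
  const config = loadConfig(json);
  // Accessing `config.endpoint` without this check is a compile error
  // under strict null checks, so the assumption must be tested first.
  return config === undefined ? "/default" : config.endpoint;
}
```

Each of the three failure modes (malformed data, missing field, wrong field type) ends up on the `"/default"` path instead of surfacing as an unhandled error.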