
design history


Lessons learned trying to put eval back in the bottle

by @mikesamuel (Google Security Engineering)

Trusted Types protects web applications from remote code execution by keeping attacker-controlled strings away from powerful web APIs like innerHTML and eval. It's the result of almost 8 years of experimentation within Google on technical measures that help security specialists assist developers in producing secure code. It's now Google's primary defense against XSS for most products, including the most sensitive ones like Gmail.

Others have written about how to use Trusted Types on the client to protect against DOM XSS, and “Securing the Tangled Web” by C. Kern explains how Google uses Trusted Types (aka SafeHTML types) on the server.

Here I'd like to talk about other approaches that we in Google Security tried which didn't pan out, and share my thoughts on why Trusted Types delivered where those other approaches didn't.

Unaided Human Code Review

When I started on Google Calendar around 2005, there was little in the way of JavaScript infrastructure; jQuery wouldn't be created until 2006. The only way we knew of to prevent XSS was careful code review. We were meticulous about checking any code that produced strings for .innerHTML, and we tried to use safer DOM APIs where possible despite IE6 performance woes. Then two team members went on vacation at the same time, stressing the team. Our first externally discovered XSS dates from that period.

Code review has many benefits, but humans are not good at consistently finding low frequency problems over long periods of time, or quickly and reliably taking into account transitive effects on code not edited.

Static Analysis

Code scanners help identify problems early in the development process. For example, Google built JSConformance to guide JavaScript developers away from code patterns (smells) that have proven problematic, and towards safer abstractions. Code scanners work well when the code evolves alongside the scanner; running a scanner over a codebase that didn't often yields a shedload of false positives, along with true positives that show how the code could have been written a bit better but which individually aren't worth the effort to fix.

Code scanners either have to be very strict, increasing the rate of false and trivial-true positives, or they have to occasionally conclude false things. One of the best understood static checkers for client-side code is TypeScript's type system. It makes necessary, pragmatic assumptions so that it's possible to migrate JavaScript to TypeScript, and does a great job identifying common mistakes before they reach production. But it's insufficient for security.

TypeScript: A Note on Soundness

TypeScript's type system allows certain operations whose safety can't be established at compile time.

Consider the capitalOf program. TypeScript assumes, reasonably but incorrectly, that the square bracket operator only accesses members of the countries table, but it actually allows access to the Function constructor. This allows an attacker who controls the inputs to run arbitrary code.
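
A minimal sketch of a capitalOf-style program (the names and table contents here are illustrative, not the original example):

// A lookup table typed with a string index signature.
const countries: { [name: string]: string } = {
  France: 'Paris',
  Japan: 'Tokyo',
};

function capitalOf(country: string): string {
  // TypeScript types this lookup as string, but the bracket operator can also
  // reach inherited members such as 'constructor'.
  return countries[country];
}

// capitalOf('Japan')        -> 'Tokyo'
// capitalOf('constructor')  -> the Object constructor function, not a string.
// A second ['constructor'] lookup from there reaches the Function constructor,
// so attacker-controlled keys plus a single call suffice to run arbitrary code.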

One limitation is that static analyses rarely extend beyond network or memory boundaries. If code trusts a signed input from a micro-service written in a different language, it's hard to make use of information about how that input was constructed, even when that information was available to the static analyzers that ran over the producing service's code. As we re-architect monolithic services into micro-services, we're perversely limiting the insights we can glean from static analysis.

This is why we at Google want to augment static analysis with runtime enforcement: before a program does something it can't undo, check that critical values were explicitly trusted.

Human-guided Static Analysis

One intuition about XSS and other injection attacks is that the root cause is a type error.

element.innerHTML

is vulnerable because it incorrectly assumes that HTML can be derived by concatenating strings without regard to the kind of content they contain.

[Figure: the same code snippet annotated with types. Arrows identify innerHTML; the quoted string literals have type string<HTML>, while the variable str is a string of unknown content. An error message shows that concatenating HTML and unknown content does not produce HTML.]
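
As a minimal sketch of the pattern the figure annotates (element and str are assumed to be in scope):

// The string literals are HTML fragments, but str has unknown content.
// Treating the concatenation as HTML is the type error: if str contains
// markup such as <img src=x onerror=...>, the browser will execute it.
element.innerHTML = '<b>Hello, ' + str + '!</b>';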

Java's annotation system enabled pluggable static type systems, which we used to let developers add hints about string types so that automated tools could identify where strings of different “types” are unsafely concatenated.

@HTML String linkifyHtml(@HTML String html) { … }

Code that adopted this was easier to reason about, but it did not solve the underlying problems with static analysis. In Java, you can put an @HTML String into a List<Object>, but when you take that String out you lose information about whether it's an @HTML String or some other variant. This is a problem for tools like HTML template languages that use runtime type information, not static types. Pluggable type systems help when they're almost as widely used as the language's built-in type system, but they don't help with whole-program conclusions if you use third-party code that doesn't plug in the types, or if the type system isn't tightly integrated with security-relevant tools.

Automatically Make Code Better

If you know that one way of expressing an idea in code is safer than another, there's a temptation to just rewrite code. For example, you might repeatedly see

element.innerHTML = someText;

and realize that 9 times out of 10, that could be replaced with

element.textContent = someText;

The cases where you can rewrite code without consulting its authors are few and far between though. Usually, there's some reason why there are two APIs, and legitimate use-cases for the more powerful, less safe API. Also, subtle bugs can arise when replacing a use of a dangerous API like eval with one that's widely considered slightly less dangerous like new Function.
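
As a rough illustration of one such difference: directly-called eval sees the enclosing scope, while a function built with new Function sees only the global scope.

function demo() {
  const secret = 42;
  eval('secret + 1');                   // 43: direct eval can read the local variable
  new Function('return secret + 1')();  // throws ReferenceError: secret is not defined
}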

Instead of automatically rewriting code, Google Technical Infrastructure has invested heavily in tools that make it easy for code janitors to write a program that modifies many projects' source files, bundle up the changes into a small number of patches that can each be sensibly reviewed by one person, and quickly merge the uncontroversial ones. This makes it feasible to aggressively deprecate APIs in favor of better replacements, and it was critical in migrating our C++ codebase, which had evolved over many years, to consistently use modern, safer idioms.

Better Tooling

Concurrently with these efforts, we were working on building secure abstractions: tools and libraries that are safe regardless of how they're used. For example:

  • Auto-escaping template libraries that understand the structure of HTML and preserve template authors' intent regardless of attacker-controlled inputs,
  • HTML sanitizers that take untrusted HTML and produce trustworthy HTML,
  • Escapers that convert plain text to another language (see the sketch after this list),
  • etc.
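
As a minimal sketch of the escaper idea, an HTML escaper converts plain text to markup by replacing the characters that have special meaning in HTML:

function escapeHTML(text: string): string {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}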

These efforts, by and large, delivered on their design goals. They've had a low rate of security-relevant bugs, and where they replaced existing systems, they were roughly as easy to use and were straightforward to migrate to.

Where the security team engaged other teams, those teams were receptive and happy to adopt. Unfortunately, there are many product teams, so without some measures to guide developers to these tools we had to do a lot of outreach. “Build it and they will come” doesn't apply to solutions to problems that developers don't know they have.

Another problem is that these tools still occasionally need to know the content type of a string. Consider this code:

let userName = account.getUserName();
if (account instanceof SparklyAccount) {
  // When we acquired Sparkly, we copied their users' names
  // into our DB but their users could enter HTML when creating
  // accounts.  Don't over-escape.
  userName = sanitizeHTML(userName);
} else {
  userName = escapeHTML(userName);
}
response.write(myHTMLTemplate.apply({ userName }));

This code is brittle because the logic embeds assumptions about an external database. The decision about whether userName is HTML is made not based on anything about the value of userName, but based on something about its source. This code may be correct, but it's not likely to remain correct, since over-escaping bugs are more visible than security bugs. If escapeHTML simply returned its input when given a value already known to be HTML, then the author could always just escape and then sanitize, without the branch.

Worse, it requires the HTML template to embed knowledge about whether userName is HTML or plain text. If it didn't, then a sparkly user-name "I &lt;3 ponies!" might get re-escaped by the auto-escaper, causing the end-user to see “I &lt;3 ponies!” instead of “I <3 ponies!” If the template gets that assumption wrong, then it is vulnerable to XSS; without trustedness being part of a value, developers get pulled in both directions by tradeoffs between visible cosmetic bugs and mostly invisible security bugs.

We saw developers getting into the habit of sprinkling |doNotEscape directives throughout code. Besides being ambiguous, this made us again reliant on a combination of human code review and static analysis to avoid vulnerabilities. It took a lot of work to rid our codebase of those directives, but we did it.

Trusted Types turned out to be the single abstraction that solves these myriad problems. Trusted values have a runtime type that is accessible to tools and libraries, they mesh well with dynamic languages, and they obviate the need for tricky code like the above; if SparklyAccount.getUserName returns a TrustedHTML value, there's no need for the JavaScript to escape or sanitize anything. The template system knows what to do based on runtime type information.
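
As a sketch, assuming getUserName returns a value that carries its trustedness as a runtime type (the exact type and behavior are illustrative), the branching code above collapses to:

// getUserName returns a TrustedHTML-like value for Sparkly accounts and a
// plain string otherwise.  The template system inspects the runtime type,
// embedding trusted values as-is and escaping plain strings.
const userName = account.getUserName();
response.write(myHTMLTemplate.apply({ userName }));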

Google still invests heavily in tools and tool integration, and now that we've got trusted types to organically guide developers towards tools that just do what they need, our investment is amortized over a large developer base and tools teams don't need to spend as much time advocating solutions to problems that developers may not know they have.

Why Client-Side Trusted Types?

We've been using trusted types within Google across half a dozen languages for 8 years now. They integrate well with tools, work in both deeply static and deeply dynamic languages, and enable an effective partnership between security specialists and application developers.

But when trying to figure out how to solve DOM XSS in the browser, it's worth explaining why we're excited about trusted types instead of restricting ourselves to new types of Content-Security-Policy (CSP) source-expressions.

Why not host-source?

To solve DOM XSS with CSP we'd need to answer questions like: “How do I allow one critical, third-party dependency, whose code I don't control but which I've concluded is safe, to use eval without allowing other third-party dependencies to use eval?” Many projects (ex.) that would like to use CSP to block eval cannot because 'unsafe-eval' is all or nothing, and 80% of CSP policies end up not mitigating eval at all.

One obvious solution seems to be: just list the JavaScript files that you want to be able to use eval, innerHTML, and other kinds of sinks. Host-source provides a clear way to represent a list of JavaScript source files. The CSP specification could require sink setters to look at the JavaScript call stack to decide whether to run or not. This idea has clear precedents, for example the Java security manager, which has a mixed track record. Unfortunately, host-source based CSP policies have a poor track record, and Google recommends using nonces instead.

To answer the question above, host-source would need to list individual files, but widely used code bundlers like Webpack combine many JavaScript files into one to reduce load time. Any definition of “privileged code” that doesn't survive web bundling is untenable.

Another problem is that this is assigning a different meaning to host-source than its previous uses in CSP. Existing uses of host-source match against the list when deciding whether to load the content, not when deciding whether to grant privileges to loaded content.

The amount of spec work required would be large. Existing JavaScript engine documentation treats source metadata as entirely diagnostic, and Node.js's vm module lets JavaScript specify the filename used in stack traces, so we can't just build security barriers on top of metadata like V8's source_url. Threading reliable source metadata through JavaScript would take a lot of spec work, and would irreparably break existing dynamic code loading mechanisms that fetch code and load it using new Function.

Worse, allowing some files to use a sink is coming at the problem from the wrong direction. We want to grant privilege to some subset of code that we have carefully checked. Let's consider what we have to do to answer questions of the form “is this safe?”:

[Figure: three JavaScript functions, f, g, and h. h calls g, which calls f, which assigns to innerHTML. A comment in f asks whether that assignment is safe.]

To know whether it's safe to assign x to .innerHTML, we need to figure out where x comes from.

[Figure: arrows between f, g, and h show the flow of arguments to innerHTML. A comment notes that this analysis is flow-sensitive and unbounded.]

Without trusted types, concluding that f uses x safely requires us to reason about all of the present and future callers of f. And if x can come from code that isn't on our list of host-sources, are we really listing the right things? With trusted types, we can ignore f and g entirely, and focus our attention on checking that trusted values are properly constructed.
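
On the client, that construction is concentrated in a small number of policies. A minimal sketch of the Trusted Types API (the policy name and the sanitize function are illustrative assumptions):

// Only a policy can mint TrustedHTML.  With enforcement on, sinks like
// innerHTML reject plain strings, so reviewers audit the policy's rules
// rather than every function like f, g, and h that can reach a sink.
const policy = trustedTypes.createPolicy('sanitizing', {
  createHTML: (input: string) => sanitize(input),  // sanitize() is an assumed sanitizer
});
element.innerHTML = policy.createHTML(untrustedMarkup);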

At Google, we do use lists of source files. For example, when we have unsafe APIs, we require calling code to be on a list of code that has been checked by someone who understands the caveats for those error-prone APIs. That works when we can guard import but just doesn't work for global APIs like eval and the DOM APIs.

Why not use nonce-source?

At Google, we like nonce-source for deciding whether to load scripts. It may be possible to extend nonce-source to guard APIs like eval, but there are problems. Instead of writing

eval(code)

you might write

eval(code, nonce)

and the JavaScript engine could use the extra argument to decide whether to parse and run code.

This has problems though:

  • This requires changes to code that uses sinks.
  • You need a way to get nonce to some code while denying it to other code. Embedding nonces in JavaScript breaks caching.
  • This won't work for other sinks like .innerHTML without new JavaScript syntax.
  • As with host-source, this is backwards; it grants privilege to sinks instead of value creators. The decision to trust code should be based on something about code.
  • Nonces in HTML require a nonce that lasts for a single request/response. Embedding the same nonce in JavaScript loaded via <script src="..."></script>, which is cached across responses, not only encourages nonce-reuse, but requires it.

Why not hash-source?

CSP allows listing hashes. This may make sense for commonly used libraries loaded via a CDN, like jQuery. Unlike the other CSP source-expression variants, it is based on something about the value. So that's nice.

It only allows a closed set of possible inputs, so it won't extend to widely used meta-programming code like that in Promise.denodify (32M downloads/month).

Expression whitelisting schemes in other contexts have had trouble scaling; serving long hash lists with every HTML response could cause latency problems.

Conclusion

Google's Security Engineering group tried many things over the last eight years to make it easy to produce complex systems without code-injection vulnerabilities. Often, bits and pieces of those efforts proved useful, but none was sufficient alone. Trusted Types is the unifying theme that ties these individual efforts into a whole that is greater than the sum of its parts.

Trusted Types works because developers are already in the habit of looking at the types of an API and working backwards to figure out how to get what they have as that type. Trusted Types are just types that have additional security properties. This enables a sensible division of duties: security specialists provide safe abstractions for creating trusted values from unsafe inputs, and developers get to use powerful APIs safely. In the rare case where the safe abstractions aren't sufficient, security specialists work with developers to craft new abstractions, or produce reliable, application-specific code.