Ephemeral Fingerprinting On The Web

TL;DR:

Background

[Image: diagram of site isolation boundaries with overlap]

Figure 1: Two sites observe a sequence of device orientation changes at times 𝒕₀, 𝒕₁, 𝒕₂.

All sites on the same UA instance share a clock and can therefore agree on the timestamps within a small margin of error. The triplet 𝒕₀, 𝒕₁, 𝒕₂ has a high probability of uniquely identifying the user. The two sites can thus conclude that their observations originate from the same user.

As illustrated above, one or more low entropy signals observed concurrently can be used to identify a user with a high degree of confidence. Let’s call these ephemeral fingerprints. This document discusses two types:

  1. The sequence of timestamps corresponding to observed changes of a volatile surface can be used for identification. Let’s call these correlated events.

  2. A stream of observations of a volatile surface can be identifying. Let’s call these unique event streams.

Signals considered for ephemeral fingerprinting don’t need to be highly identifying by themselves. The privacy budget proposal does not adequately account for fingerprinting based on concurrent observations of low entropy signals.

Device orientation, from our earlier example, can take one of two values (portrait or landscape) and is unstable. Thus a single sample of device orientation carries almost no information; that is, a recorded observation of device orientation doesn’t help at all with identifying the user at a later time. However, the timestamps corresponding to orientation changes could have identifying levels of entropy.

These are not new. For example, this is discussed by Van Goethem et al. 1, who call these “Cross-Session Events” (§ 5 of linked paper). Potential ephemeral fingerprinting surfaces also get flagged during standardization discussions (Example: Polling enumerateDevices, Example: Ambient light events).

Modelling Correlated Events

A correlatable event can be thought of as the tuple <surface-sample, timestamp>. The addition of the timestamp strictly increases the amount of information carried by the surface sample.
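As a minimal sketch of how two sites might compare such tuples, assume each site records its observations against the shared UA clock. The names `CorrelatableEvent`, `sequencesMatch`, and `toleranceMs`, and all sample values, are hypothetical and chosen only for illustration.

```typescript
// A correlatable event: a surface sample plus the time it was observed.
// The type and field names are illustrative, not taken from any specification.
interface CorrelatableEvent {
  sample: string;      // e.g. "portrait" or "landscape"
  timestampMs: number; // time on the shared UA clock, in milliseconds
}

// Returns true when both sequences have the same length and every pair of
// corresponding events carries the same sample with timestamps that differ
// by at most `toleranceMs`.
function sequencesMatch(
  a: CorrelatableEvent[],
  b: CorrelatableEvent[],
  toleranceMs: number
): boolean {
  if (a.length !== b.length) return false;
  return a.every(
    (event, i) =>
      event.sample === b[i].sample &&
      Math.abs(event.timestampMs - b[i].timestampMs) <= toleranceMs
  );
}

// Made-up observations of the same three orientation changes by two sites.
const siteA: CorrelatableEvent[] = [
  { sample: "landscape", timestampMs: 1000 },
  { sample: "portrait", timestampMs: 4200 },
  { sample: "landscape", timestampMs: 9750 },
];
const siteB: CorrelatableEvent[] = [
  { sample: "landscape", timestampMs: 1003 },
  { sample: "portrait", timestampMs: 4198 },
  { sample: "landscape", timestampMs: 9752 },
];

const likelySameUser = sequencesMatch(siteA, siteB, 10);
```

The samples alone (landscape, portrait, landscape) would match for many users; it is the agreement on timestamps that makes the match meaningful.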

Modelling Event Streams

An event stream is simply a list of observed samples sample₀, sample₁, ....

Each additional observation strictly increases the amount of information.
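One way to make this precise is the chain rule for Shannon entropy, treating each sample as a random variable $X_i$. This framing is an assumption added here for illustration; the symbols are not part of the model above.

```latex
H(X_0, \ldots, X_n)
  = H(X_0, \ldots, X_{n-1}) + H(X_n \mid X_0, \ldots, X_{n-1})
  \ge H(X_0, \ldots, X_{n-1})
```

The inequality is strict whenever the new sample is not fully determined by the earlier ones, that is, whenever $H(X_n \mid X_0, \ldots, X_{n-1}) > 0$.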

Other Examples:

Mitigation

Permissions

Goal: Require informed consent from users.

There’s precedent for considering permissions 2 to be sufficient mitigation for similar issues. For example, the Media Capture API specification includes the following:

For origins to which permission has been granted, the devicechange event will be emitted across browsing contexts and origins each time a new media device is added or removed; user agents can mitigate the risk of correlation of browsing activity across origins by fuzzing the timing of these events.

From §15 of Media Capture and Streams API specification.

Pros

Cons

Fuzzing Timing of Events

Goal: Deter correlation of events by injecting timing skew.

This mitigation is mentioned in the snippet above from the Media Capture and Streams API. Jeffrey Yasskin also calls it out as a potential general mitigation, under “desynchronize whole-browser events”, in this issue filed against the WHATWG HTML specification.
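A minimal sketch of the idea, assuming the user agent delays delivery of one browser-wide event by an independent random amount per origin. The function `fuzzDeliveryTimes`, its parameters, and the uniform distribution are hypothetical choices, not taken from any specification.

```typescript
// Computes a per-origin delivery time for one browser-wide event. Each origin
// gets an independent random delay in [0, maxSkewMs), so the timestamps seen
// by different origins no longer line up exactly.
// `rng` must return values in [0, 1); it is injectable so the sketch can be
// exercised deterministically.
function fuzzDeliveryTimes(
  eventTimeMs: number,
  origins: string[],
  maxSkewMs: number,
  rng: () => number = Math.random
): Record<string, number> {
  const delivery: Record<string, number> = {};
  origins.forEach((origin) => {
    delivery[origin] = eventTimeMs + rng() * maxSkewMs;
  });
  return delivery;
}

// Usage: two origins observe the same underlying event at different times.
const deliveryTimes = fuzzDeliveryTimes(
  1000,
  ["https://a.example", "https://b.example"],
  500
);
```

For the skew to deter correlation, `maxSkewMs` would need to be large compared with the tolerance that colluding sites are willing to accept when matching timestamps.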

Pros

Cons

First-Party Restriction for APIs

Goal: Deter identity correlation by third-party sites.

Restrict APIs to the origin of the top-level browsing context.

The top-level origin may choose to explicitly delegate access to the APIs via feature policies. But third-party contexts can’t “reach across” browsing contexts by correlating cross-context events or attributes that the API may make available.
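The access rule described above can be sketched as a single predicate. This is a simplified model, assuming the user agent already knows both origins and whether the top-level document delegated the feature; `ContextInfo` and `mayUseApi` are hypothetical names.

```typescript
// Simplified view of a browsing context requesting a restricted API.
// All field names are illustrative.
interface ContextInfo {
  origin: string;         // origin of the document requesting the API
  topLevelOrigin: string; // origin of the top-level browsing context
  delegated: boolean;     // true if the top-level document delegated the feature
}

// First-party documents always qualify; third-party documents qualify only
// through explicit delegation by the top-level document.
function mayUseApi(ctx: ContextInfo): boolean {
  return ctx.origin === ctx.topLevelOrigin || ctx.delegated;
}

const firstParty: ContextInfo = {
  origin: "https://news.example",
  topLevelOrigin: "https://news.example",
  delegated: false,
};
const undelegatedFrame: ContextInfo = {
  origin: "https://tracker.example",
  topLevelOrigin: "https://news.example",
  delegated: false,
};
const delegatedFrame: ContextInfo = {
  origin: "https://widget.example",
  topLevelOrigin: "https://news.example",
  delegated: true,
};
```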

Pros

Cons

Limit API Access To Visible Browsing Contexts

Goal: Prevent background browsing contexts from skimming identifiable events.

The Page Visibility API defines the visibility state of a document as visible if the document is “at least partially visible on at least one screen”.

Restrict APIs to the active document of a visible browsing context, possibly only a top-level one.

Pros

Cons

Limit Events To Focused Top-Level Browsing Context

Goal: Limit firing correlatable events to a single top-level browsing context.

The HTML spec defines a concept of a currently focused area of a top-level browsing context. As defined, every top-level browsing context has one regardless of visibility. A similar but narrower concept could be introduced that recognizes the top-level browsing context that has system input focus. There should be only one of these on a single device.

Let’s call the top-level browsing context that has system input focus the focused top-level browsing context.

New specifications could restrict browser-wide events to the focused top-level browsing context.
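A minimal sketch of such a restriction, assuming the user agent tracks which top-level browsing context has system input focus. `TopLevelContext`, `hasSystemFocus`, and `eventRecipients` are hypothetical names used only for illustration.

```typescript
// Simplified view of a top-level browsing context. Field names are illustrative.
interface TopLevelContext {
  id: string;
  hasSystemFocus: boolean; // at most one context per device should be true
}

// Returns the ids of the contexts that should receive a browser-wide event.
// Only the single focused top-level browsing context qualifies; if there is
// no focused context, or unexpectedly more than one, nothing is fired.
function eventRecipients(contexts: TopLevelContext[]): string[] {
  const focused = contexts.filter((c) => c.hasSystemFocus);
  return focused.length === 1 ? [focused[0].id] : [];
}

const openTabs: TopLevelContext[] = [
  { id: "tab-1", hasSystemFocus: false },
  { id: "tab-2", hasSystemFocus: true },
  { id: "tab-3", hasSystemFocus: false },
];
const recipients = eventRecipients(openTabs);
```

Because at most one context receives each event, two contexts cannot both observe the same timestamp.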

Pros

Cons

Limit API Access To Focused Top-Level Browsing Context

Goal: Limit access to sensitive APIs to a single top-level browsing context.

Similar to the above, but restricts the entire API, or its sensitive attributes, instead of just events. This also addresses issues around polling.

Pros

Cons

Secure-Context Restriction and Control via Feature-Policy

Restricting new APIs to secure contexts and making them controllable via Feature-Policy should be standard practice at this point.

Pros

Cons

Spotting Ephemeral Fingerprinting Surfaces In Web Specs

Ephemeral fingerprints:

What to look for:

Example

Consider onfocus and onblur events.

The focus update steps involve firing up to three distinct events: change (if the node losing focus is an input element), focus, and blur.

When focus traverses a browsing context boundary, these events may be fired simultaneously in two different browsing contexts. Browsers mitigate this by not firing blur for cross-site tab switches, but they still fire blur when the browser itself goes out of focus. Thus identity can be correlated when switching browser windows.

Possible Mitigation

When the new chain and the old chain 3 are in different top-level browsing contexts whose active documents are not same-origin, queue but don’t fire change and blur events until focus returns to the old top-level browsing context.
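The queueing behaviour in this mitigation can be sketched as follows. This is a simplified model rather than a change to the focus update steps: `DeferredFocusEvents`, `defer`, `release`, and the string context identifiers are all hypothetical.

```typescript
type FocusEventName = "change" | "blur";

interface PendingEvent {
  name: FocusEventName;
  contextId: string; // hypothetical id of the old top-level browsing context
}

// Holds change and blur events that must not fire while focus is in a
// different, non-same-origin top-level browsing context.
class DeferredFocusEvents {
  private pending: PendingEvent[] = [];

  // Called when focus leaves `oldContextId` for a context whose active
  // document is not same-origin: remember the event instead of firing it.
  defer(name: FocusEventName, oldContextId: string): void {
    this.pending.push({ name, contextId: oldContextId });
  }

  // Called when focus returns to `contextId`: returns the events to fire now,
  // in the order they were queued, and removes them from the queue.
  release(contextId: string): FocusEventName[] {
    const ready = this.pending.filter((e) => e.contextId === contextId);
    this.pending = this.pending.filter((e) => e.contextId !== contextId);
    return ready.map((e) => e.name);
  }
}

// Usage: focus leaves tab-1, then later returns to it.
const deferred = new DeferredFocusEvents();
deferred.defer("change", "tab-1");
deferred.defer("blur", "tab-1");
const firedOnReturn = deferred.release("tab-1");
```

Since the old context only learns about the focus change once focus returns to it, the times at which its events fire no longer coincide with the focus event in the new context.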

Notes