Identifying users by their browsing history

Mozilla employees have published the results of a study on whether users can be identified from their browsing profile, that is, the pattern of site visits that third parties and websites are able to observe. An analysis of 52000 browsing profiles contributed by Firefox users who took part in the experiment showed that site-visiting preferences are distinctive to each user and stable over time. Of the browsing-history profiles collected, 99% were unique. A high degree of uniqueness persists even when the sample is restricted to just one hundred popular sites.
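
As a rough illustration of what "profile uniqueness" means here, the sketch below counts the share of users whose set of visited domains is shared with no other user. It assumes a profile is simply a set of domain names; the article does not describe the study's exact metric, and the function name, the allowed_domains parameter, and the sample data are all hypothetical.

```python
from collections import Counter


def uniqueness(profiles, allowed_domains=None):
    """Return the fraction of users whose visited-domain set is unique.

    profiles: dict mapping a user id to an iterable of visited domains.
    allowed_domains: optional set; if given, each profile is first
    restricted to these domains (e.g. a list of popular sites).
    Assumption: a "profile" is just the set of visited domains.
    """
    signatures = []
    for domains in profiles.values():
        visited = set(domains)
        if allowed_domains is not None:
            visited &= allowed_domains
        signatures.append(frozenset(visited))

    if not signatures:
        return 0.0

    # A profile is unique if exactly one user has that exact domain set.
    counts = Counter(signatures)
    unique = sum(1 for sig in signatures if counts[sig] == 1)
    return unique / len(signatures)


# Hypothetical sample data, for illustration only.
profiles = {
    "user_a": {"news.example", "mail.example", "shop.example"},
    "user_b": {"news.example", "mail.example"},
    "user_c": {"news.example", "mail.example"},
    "user_d": {"video.example", "news.example"},
}

full = uniqueness(profiles)
restricted = uniqueness(profiles, allowed_domains={"news.example"})
```

With this toy data, two of the four profiles are unique, so full is 0.5; restricting every profile to a single shared domain makes them all identical, so restricted is 0.0. The study's finding is that real profiles stay largely unique even under such a restriction to a hundred popular sites.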

Re-identification was tested in a two-week experiment in which visit data from the first week was matched against data from the second week. It proved possible to re-identify 50% of users who had visited 50 or more distinct domains; for users who had visited 150 or more distinct domains, the re-identification rate rose to 80%. The test was run on a sample of 10000 sites to simulate the data available to large content providers (for example, Google can observe visits to 9823 of these 10000 sites, Facebook to 7348, and Verizon to 5500).
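
The week-to-week matching can be sketched as a nearest-neighbour search over domain sets. This is a minimal sketch under stated assumptions: the article does not say which similarity measure the study used, so the Jaccard index is chosen here purely for illustration, and the function names, the min_domains parameter, and the sample data are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index of two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def reidentify(week1, week2):
    """For each week-2 profile, guess the most similar week-1 user."""
    guesses = {}
    for user, domains in week2.items():
        guesses[user] = max(
            week1, key=lambda candidate: jaccard(week1[candidate], domains)
        )
    return guesses


def reidentification_rate(week1, week2, min_domains=0):
    """Share of eligible users whose week-2 profile maps back to them.

    Only users with at least min_domains distinct domains in week 1
    are scored; every week-1 user remains a matching candidate.
    """
    eligible = [
        u for u in week2 if u in week1 and len(week1[u]) >= min_domains
    ]
    if not eligible:
        return 0.0
    guesses = reidentify(week1, {u: week2[u] for u in eligible})
    return sum(1 for u in eligible if guesses[u] == u) / len(eligible)


# Hypothetical sample data, for illustration only.
week1 = {
    "alice": {"news.example", "mail.example", "shop.example"},
    "bob": {"news.example", "video.example"},
    "carol": {"mail.example", "forum.example"},
}
week2 = {
    "alice": {"news.example", "mail.example", "shop.example", "wiki.example"},
    "bob": {"video.example", "news.example", "mail.example"},
    "carol": {"forum.example", "news.example", "video.example"},
}

rate_all = reidentification_rate(week1, week2)
rate_large = reidentification_rate(week1, week2, min_domains=3)
```

In this toy data, carol's small and drifting profile is mistaken for bob's, so rate_all is 2/3, while scoring only users with three or more week-1 domains gives rate_large of 1.0. That mirrors, on a tiny scale, why the reported rate rises with the number of distinct domains visited.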

This allows the large operators of popular services to identify users with fairly high probability. For example, Google, Facebook, and Twitter, whose widgets are embedded on third-party sites, could in theory re-identify roughly 80% of users.

Previously opened sites can also be detected by indirect methods, for example by having JavaScript code iterate over popular domains and measure differences in resource load times: if the user opened a site recently, its resources are served from the browser cache almost instantly. Earlier techniques for detecting visited pages included probing cached HSTS settings (for a site previously opened with HSTS, an HTTP request is redirected to HTTPS immediately, without any attempt to connect over HTTP) and inspecting the state of the CSS ":visited" pseudo-class.
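
The timing idea can be illustrated by its classification step alone. In a real attack the measurements would be taken by JavaScript in the page; the sketch below only takes already-collected load times and flags domains that look cached. The function name, the 10 ms threshold, and the sample timings are assumptions made for illustration, not values from the study.

```python
from statistics import median


def likely_cached(samples_ms, threshold_ms=10.0):
    """Return the domains whose median load time is below the threshold.

    samples_ms: dict mapping a domain to a list of load times in ms.
    threshold_ms: hypothetical cut-off separating cache hits from
    network fetches; a real value would have to be calibrated.
    """
    return {
        domain
        for domain, times in samples_ms.items()
        if times and median(times) < threshold_ms
    }


# Hypothetical measurements, for illustration only.
samples = {
    "news.example": [2.1, 1.8, 2.5],     # consistently fast
    "shop.example": [85.0, 120.3, 97.4],  # consistently slow
    "mail.example": [3.0, 140.0, 2.7],    # fast, with one outlier
}

visited_guess = likely_cached(samples)
```

Using the median rather than the mean keeps a single slow outlier, as in the mail.example samples, from hiding an otherwise clear cache hit.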

Similar CSS-based methods of detecting browsing history were used in an earlier study conducted from 2009 to 2011. Its authors showed that 42% of users could be identified when checking 50 pages, and 70% when checking 500 pages. Mozilla's research confirmed and refined the conclusions of the earlier publication: the accuracy of the browsing-history data was significantly improved, and the number of domains checked was increased from 6000 to 10000 (data on 660000 domains was collected in total, but a sample of the 10000 most popular domains was used to evaluate identification).

Source: opennet.ru
