Recently, Google proposed the Topics API framework as a privacy-friendly alternative for behavioural advertising as a possible solution to balance user’s privacy and advertisement effectiveness. Using the Topics API, the browser builds a user profile based on navigation history, which advertisers can access. The Topics API aim at becoming the new standard for behavioural advertising, thus it is necessary to fully understand its operation and find possible limitations. In this paper, we evaluate the robustness of the Topics API to a re-identification attack. To build a user profile, we suppose an attacker accumulates over time the topics a user exposes to different websites. The attacker later re-identifies the same user matching the profiles of their audience. We leverage real traffic traces and realistic population models, and we present increasingly powerful attack threats. We find that the Topics API mitigates but cannot prevent re-identification from taking place, as there is a sizeable chance that a user’s profile remains unique within a website’s audience and the attacker successfully matches it with the profile of the same user on a second website. Depending on environmental factors, the probability of correct re-identification can reach , considering a pool of 1 000 users. We offer the code and data we use in this work to stimulate further studies and the tuning of the Topic API parameters.

Re-Identification Attacks against the Topics API

Trevisan, Martino;
2024-01-01

Abstract

Recently, Google proposed the Topics API framework as a privacy-friendly alternative for behavioural advertising as a possible solution to balance user’s privacy and advertisement effectiveness. Using the Topics API, the browser builds a user profile based on navigation history, which advertisers can access. The Topics API aim at becoming the new standard for behavioural advertising, thus it is necessary to fully understand its operation and find possible limitations. In this paper, we evaluate the robustness of the Topics API to a re-identification attack. To build a user profile, we suppose an attacker accumulates over time the topics a user exposes to different websites. The attacker later re-identifies the same user matching the profiles of their audience. We leverage real traffic traces and realistic population models, and we present increasingly powerful attack threats. We find that the Topics API mitigates but cannot prevent re-identification from taking place, as there is a sizeable chance that a user’s profile remains unique within a website’s audience and the attacker successfully matches it with the profile of the same user on a second website. Depending on environmental factors, the probability of correct re-identification can reach , considering a pool of 1 000 users. We offer the code and data we use in this work to stimulate further studies and the tuning of the Topic API parameters.
File in questo prodotto:
File Dimensione Formato  
3675400.pdf

Accesso chiuso

Descrizione: Articolo scaricabile dal sito dell editore:https://dl.acm.org/doi/10.1145/3675400
Tipologia: Documento in Versione Editoriale
Licenza: Copyright Editore
Dimensione 1.5 MB
Formato Adobe PDF
1.5 MB Adobe PDF   Visualizza/Apri   Richiedi una copia
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11368/3079198
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? ND
social impact