The Impact of Synthetic Data Generation on Data Utility with Application to the 1991 UK Samples of Anonymised Records

Taub, Jennifer; Elliot, Mark and Sakshaug, Joseph W. (2020): The Impact of Synthetic Data Generation on Data Utility with Application to the 1991 UK Samples of Anonymised Records. In: Transactions on Data Privacy, Vol. 13, No. 1: pp. 1-23

Full text not available on 'Open Access LMU'.

Abstract

Synthetic data generation has been proposed as a flexible alternative to more traditional statistical disclosure control (SDC) methods for minimising disclosure risk. However, a barrier to the use of synthetic data is the uncertainty about the reliability and validity of the results that are derived from these data. Surprisingly, there has been a relative dearth of research on how to measure the utility of synthetic data. Utility measures developed to date have been either information theoretic abstractions or somewhat arbitrary collations of statistics, and replication of previously published results has been rare. In this paper, we adopt a methodology previously used by Purdam and Elliot (2007), in which they replicated published analyses using disclosure-controlled versions of the same microdata used in said analyses and then evaluated the impact of disclosure control on the analytic outcomes. We utilise the same studies as Purdam and Elliot, based on the 1991 UK Samples of Anonymised Records, to facilitate comparisons of synthetic data utility between different utility metrics.

Document type:	Journal article
Faculty:	Mathematics, Computer Science and Statistics > Statistics
Subjects:	500 Natural sciences and mathematics > 510 Mathematics
ISSN:	1888-5063
Language:	English
Document ID:	88855
Date published on Open Access LMU:	25 Jan 2022 09:28
Last modified:	25 Jan 2022 09:28
