Utilising identifier error variation in linkage of large administrative data sources
Linkage of administrative data sources often relies on probabilistic methods using a set of common identifiers (e.g. sex, date of birth, postcode). Variation in data quality at the individual or organisational level (e.g. by hospital) can result in clustering of identifier errors, violating the assumption of independence between identifiers required …
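To make the independence assumption concrete, the following is a minimal sketch of how a probabilistic match weight is commonly computed in the Fellegi-Sunter style, where each identifier contributes its own log-likelihood ratio and the contributions are simply summed. The m- and u-probabilities and the names `M_U` and `match_weight` are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import math

# Hypothetical probabilities, for illustration only:
# m = P(identifier agrees | records belong to the same person)
# u = P(identifier agrees | records belong to different people)
M_U = {
    "sex": (0.98, 0.50),
    "date_of_birth": (0.95, 0.01),
    "postcode": (0.90, 0.05),
}


def match_weight(agreement):
    """Sum per-identifier log2 likelihood ratios.

    Adding the terms is only valid if agreement on one identifier is
    independent of agreement on the others, given true match status.
    """
    total = 0.0
    for field, agrees in agreement.items():
        m, u = M_U[field]
        if agrees:
            total += math.log2(m / u)              # agreement weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight
    return total


# A record pair agreeing on every identifier, and one agreeing on sex only.
full = match_weight({"sex": True, "date_of_birth": True, "postcode": True})
partial = match_weight({"sex": True, "date_of_birth": False, "postcode": False})
print(f"all agree: {full:.2f}, sex only: {partial:.2f}")
```

Because the weight is a plain sum, errors that cluster within the same records (for example, several identifiers recorded poorly by one hospital) are treated as separate pieces of evidence, which is the situation in which the independence assumption no longer holds.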