An Introduction to Data Science for Administrative Data Research course (IDS-ADR)

Our course gives a full introduction to working with administrative data. By the end of this online course, participants will have the skills to understand, access and prepare linked administrative data for analysis.

If you are a social or health researcher with experience of analysing survey data, or if you already work with administrative data but want to analyse multiple administrative datasets linked together, then this course is for you. Each session is taught by an expert in that topic, and you will have the opportunity to manipulate data, create variables and produce initial outputs.

Our course (organised jointly by SCADR and the SLS–Development Support Unit) has been run very successfully for several years as a face-to-face 4 to 5 day course. Given the current restrictions, we took the decision to run the 2021 course virtually, with a mix of live practical lab sessions using synthetic data, recorded lectures with virtual handouts (each lecturer will also provide an agreed time slot for a live Q&A session) and a host of online materials.

The course will:

  • describe what administrative data is
  • discuss some of the particular problems raised in working with this type of data
  • explain how to deal with these problems

Theoretical sessions will be backed up by live practical lab sessions, using R to write syntax to tidy, clean and recode data; link datasets; manipulate data; conduct data visualisation; identify data quality issues; and fit regression models. SLS synthetic data will be used in live practical lab sessions.

For further information on the course content and delivery, please click here.


IDS-ADR course - typical programme

The full programme is currently being finalised but a draft document of this year's programme is included here.  

Sessions include:

  • Data tidying
  • Data cleaning & tidying
  • Indexing, linking and joining datasets
  • Working with dates and times
  • Data visualisation
  • Introduction to programming & data manipulation verbs
  • A Scottish & UK data showcase session will give a flavour of the type of data that is available
  • Understanding Data Provenance - From Origins to Output
  • Information on how to apply for access to linked data, and secure data access within a safe setting
  • Ethical, confidentiality and disclosure issues around using this type of data
  • Current SCADR researchers will highlight their research using linked administrative data and describe the advantages of this approach, as well as the problems they might have encountered and the lessons learned



The course will run online from 3rd to 28th May 2021.

Participants will be given access to our online learning platform, Learn, where teaching materials will be uploaded at the start of each week, recordings of lectures will be archived, and participants can practise the techniques learned in the course.

Participants can work through the material and information in their own time. However, there will be a live practical lab session each week, where participants can work during the session and speak directly to the tutor if they have any questions or need support.

Course materials will remain accessible to participants on the Learn platform until 1st July.


Time Commitment:

Online learning offers greater flexibility for participants, who can study at a time that suits them.

However, as everyone works at different speeds, it is very difficult to say exactly how many hours per week are required to complete this course.

As a rough estimate, we expect that participants will require between 4 and 6 hours per week to read teaching materials, and a further 3 to 4 hours per week if they wish to join the live practical lab sessions.

Whilst we expect everyone to work at times to suit them, participants should be aware that the drop-in to ask questions on the practical examples is ONLY available during the live practical lab sessions (usually Thursday afternoons).



The number of participants on this course is capped. Applications will be reviewed by the end of April, and successful applicants will be notified shortly thereafter and invited to register for the course and complete payment.

Applicants should have prior experience of quantitative analysis and of using packages within R, and should set out why they are interested in attending.

Costs: £120 per participant – payable upon confirmation of a place.

Application: Please note that the closing date for the May 2021 course is 22 April 2021.