BLOG - What do Administrative Data Researchers need to know?

The Scottish Centre for Administrative Data Research (SCADR) have been exploring what people using administrative data think they need to know, and what’s already available regarding training, support and access in this area.

 

What do people want?

We ran a Twitter poll for researchers and also asked data providers, the electronic Data Research and Innovation Service (eDRIS) team and the Early Career Researchers Using Scottish Administrative Data (eCRUSADers) network what they wished researchers knew, or had known more about, when accessing and analysing administrative data.

In our Twitter poll, nearly twenty responses revealed the areas of key interest:

 

These results and our wider discussions highlighted 7 themes:

  1. The data access process – including how to write documents, timescales, and how things are different in Scotland.
  2. Data protection – legal and ethical issues specific to admin data use, including the risks of using admin data, the legislation that underpins data linkage, the differences between anonymisation and pseudonymisation, and when a Data Protection Impact Assessment (DPIA) is required.
  3. How to work within a safe haven – including processes (e.g. how to request software, disclosure control rules), and data management within a secure environment (versioning, following published standards such as RECORD).
  4. Understanding data/admin data in general – including understanding the provenance and uses of data, data management techniques for large messy datasets and understanding linkage.
  5. Specific analysis/data science techniques – including causal inference methods, survival analysis and longitudinal modelling.
  6. Information on specific datasets – including information on variable codes, variable quality (including how variables are derived), and overall what datasets can and can’t be used for.
  7. Other – including data visualisation, creating innovative policy-relevant projects and how to fit this process into wider career development.

Interestingly, data protection was not mentioned specifically by any researchers, but was seen as a gap by data controllers and teams supporting researchers in Scotland and elsewhere. Perhaps researchers feel this area is covered by the training required for data access or by conversations with the Data Protection Officer (DPO), and therefore believe they know all that is required? Or perhaps specific data protection information on linking datasets needs more of a focus?

A few groups asked for training for all parties involved, as support teams and data controllers could sometimes benefit from training and development too. This could be on the legality and safety of data sharing and the system, or overviews of statistical methods to make output checking easier. A potential opportunity might be to deliver training to data controllers, support teams and researchers together, so they can share knowledge and points of view and discuss how best to co-create research projects that are both feasible and impactful.

 

What’s already out there?

Looking at the themes, you might be thinking “there must be guidance published on a lot of this already?” You’d be right. However, this exercise indicates that researchers may not be aware of the huge amount of information available on the process and datasets from sources such as the ISD National Data Catalogue, eDRIS website, eCRUSADers website or Scottish Government website.

It’s also possible sources such as these are not covering the information that researchers need, or that the information isn't easily discoverable.

A look through some major training providers found that the majority of live courses and events are focused on data management, statistical methods or information on datasets. Gaps were noticed around specific information on Scottish datasets and the use of a safe haven. Health datasets, for example, are one of the biggest data assets in Scotland, and often have to be used within a secure environment or safe haven. Yet there are no courses that train researchers in how to understand features such as hospital episodes, how to work with the data in practice to recognise irregularities, or how to work with the data within a secure environment.

The timing of events could also be an issue. Data management and statistical methods courses tend to be run annually. This may mean that people starting just after a course have nearly a full year to wait. Similar problems may arise for researchers who aren’t aware of which methods they need training in until they have access to their data.

Through this exercise we identified four comprehensive courses that covered most of the above themes. More details on course content can be found on our training page.

These and other events are available for support staff as well as researchers, and there are courses in development from ONS / University of the West of England (UWE) on output checking specifically for support staff.

Another important aspect we found is that the most useful courses use admin data as the training dataset and provide interactive sessions for hands-on practice. The materials also remain available for a long time afterwards, allowing researchers to come back to them when they reach different stages of their data access/analysis journey.

 

What can we do about it?

With this in mind, SCADR is now looking into how we can best support filling the gaps identified. The development of Research Data Scotland is likely to be instrumental in signposting to all the available resources out there, to help first-time researchers navigate the joy of Administrative Data Research.

 

We are very interested to hear your thoughts:
  • What are your views on the issues we’ve raised? Do you agree with the gaps identified?
  • Do you have suggestions on what other training and support can be given to new researchers in this area?

Any answers or feedback on this blog would be gratefully received; please email scadr@ed.ac.uk.

This article was published on 21 Jan 2021

Author

Amy Tilbrook