Creating a training course that meets the needs of both the researcher and the data provider

Prior to delivering the Scottish Centre for Administrative Data Research (SCADR) 5-day training course, we carried out market research to ensure that the training we developed covered the skills viewed as essential by those interested in accessing and analysing administrative data.

Market Research

To investigate what training was needed and valuable, we asked the electronic Data Research and Innovation Service (eDRIS) team, who are part of Public Health Scotland (PHS), what common mistakes new researchers make when requesting access to administrative datasets. Our intention was then to discuss what training would help researchers avoid these mistakes.

We also sought the opinions of members of Early Career Researchers Using Scottish Administrative Data (eCRUSADers) on what people using administrative data think they need to know, and on what training is already available in the marketplace.

This allowed us to create a Twitter poll that asked people to identify the most important training component out of four options. The 17 responses broke down as follows:

- the application process (chosen by 29.4%)

- data protection (0%)

- data science methods (47.1%)

- my specific data set (23.5%)
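With only 17 responses, each vote is worth almost six percentage points, so it can help to read the percentages as vote counts. The sketch below works out the counts implied by the reported percentages. The counts themselves were not published with the poll; they are derived here on the assumption that each percentage was rounded to one decimal place.

```python
# Reported poll shares (percent), copied from the list above.
# The vote counts are NOT given in the article; they are inferred here.
TOTAL_RESPONSES = 17

reported_percent = {
    "the application process": 29.4,
    "data protection": 0.0,
    "data science methods": 47.1,
    "my specific data set": 23.5,
}


def implied_counts(percentages, total):
    """Convert rounded percentages into the nearest whole number of votes."""
    return {
        option: round(share / 100 * total)
        for option, share in percentages.items()
    }


counts = implied_counts(reported_percent, TOTAL_RESPONSES)

for option, votes in counts.items():
    print(f"{option}: {votes} of {TOTAL_RESPONSES}")
```

Read this way, the gap between the most and least popular non-zero options is four votes, which is worth bearing in mind when weighing the poll against the discussions.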

Drawing on the discussions and the Twitter poll results, we identified seven areas of training (themes) that were needed:

  1. The data access process – including how to write documents, timescales, and how things are different in Scotland.
  2. Data protection – legal and ethical issues specific to admin data use – including the difference between anonymisation and pseudonymisation, the risks of using admin data, the legislation that underpins data linkage, and when a DPIA is required.
  3. How to work within a safe haven – including processes (e.g. how to request software, disclosure control rules), and data management (versioning, following published standards such as FAIR and RECORD within a secure environment with restricted data).
  4. Understanding data/admin data in general – including understanding the provenance of data, data management techniques for big data, understanding linkage.
  5. Specific analysis/data science techniques – including causal inference methods, survival analysis, longitudinal modelling.
  6. Information on specific datasets – including information on variable codes, variable quality and how variables are derived, and overall what the data can and can’t be used for.
  7. Other – including data visualisation, creating innovative policy relevant projects, how to fit this process into wider career development.

Most themes were agreed on by all the members and groups we asked. Interestingly, data protection was not mentioned specifically by any researchers, but was seen as a training need by data controllers and by teams supporting researchers in Scotland and elsewhere. Perhaps researchers don’t see data protection as their responsibility, or feel that it’s covered by their conversations with the Data Protection Officer (DPO) and therefore doesn’t require additional training.

A smaller theme was the request for training on both sides of the fence: eDRIS and data controllers could sometimes benefit from training and development too, potentially on the legal gateways that allow data sharing and the safety of the system, or on overviews of statistical methods to make output checking easier. Collaborating to co-create research questions that are both feasible and impactful is another area where joint training could be beneficial and should be considered.

What’s already out there?

Looking at the above seven areas, you might be thinking “there must be guidance published on a lot of this already?” You would be correct. However, the fact that researchers said they needed it suggests that they aren’t aware of the large amount of information available. Communication therefore needs to improve so that people know about sources such as the ISD National Data Catalogue, the eDRIS website and the Scottish Government website. It’s also possible that these sources don’t cover the information researchers need, or that the information isn’t easily discoverable.

A brief look through some major training providers found that the majority of courses and events focus on data management, statistical methods and information on datasets, but not necessarily on Scottish data or on using a safe haven. Data management and stats courses tend to be run annually, so perhaps people are missing the boat, or aren’t aware of what methods they need until they get access to the data?

The training courses that we have identified as comprehensive introductions that cover most of the above 7 themes are:

- SCADR “Introduction to Admin Data” course

- Centric’s online Admin Data intro

- Health Foundation’s Safe Data Awareness Training

- Swansea/Sydney’s Linked Health Data courses

We’d recommend the Centric course as a great first resource – and it’s free!

We also found that the most useful courses use admin data as the training dataset, provide interactive sessions for hands-on practice, and keep their materials available for a long time, so that researchers can come back to them at different stages of their data access/analysis journey.

What changes are SCADR going to make?

With this in mind, SCADR is now looking into how we can help fill the gaps identified. Hopefully the development of Research Data Scotland will be instrumental in signposting all the available resources out there, helping first-time researchers navigate the joy of Administrative Data Research without reinventing the wheel.

We are always keen to hear other people’s opinions, so please let us know what you think by emailing us at scadr@ed.ac.uk:

Do you agree that more training is needed in these areas?

Have you been on any of the training mentioned, and are willing to give us feedback?

What else could support researchers who are new to the area?

We look forward to hearing from you!

This article was published on 14 Dec 2020

Author

Elizabeth Lemmon, Amy Tilbrook