BLOG - Exploring the potential of synthetic data
This week we explore how synthetic data can be used to enhance administrative data research and practice.
What is synthetic data?
Synthetic data is information generated from models of the real world and used in place of the actual data. Because it is based on a model, synthetic data retains the structure and some of the patterns of the original dataset whilst containing none of the original data.
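To make the idea of "data generated from a model" concrete, here is a minimal sketch in Python. It is a toy illustration only, not the method used by any particular package: the records, the variable names (`age`, `region`) and the helper functions `fit_model` and `synthesise` are all invented for this example. The "model" is just a mean and standard deviation for one variable and category frequencies for another.

```python
import random
import statistics
from collections import Counter


def fit_model(records):
    """Summarise the data as a very simple model (hypothetical helper)."""
    ages = [r["age"] for r in records]
    regions = Counter(r["region"] for r in records)
    return {
        "age_mean": statistics.mean(ages),
        "age_sd": statistics.stdev(ages),
        "regions": list(regions.keys()),
        "region_weights": list(regions.values()),
    }


def synthesise(model, n, seed=0):
    """Draw n new records from the model rather than copying real rows."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        # Ages are drawn from a normal distribution and floored at zero.
        age = max(0, round(rng.gauss(model["age_mean"], model["age_sd"])))
        # Regions are drawn in proportion to their observed frequencies.
        region = rng.choices(model["regions"], weights=model["region_weights"])[0]
        rows.append({"age": age, "region": region})
    return rows


# Invented "real" records, for illustration only.
real = [
    {"age": 34, "region": "Fife"},
    {"age": 51, "region": "Lothian"},
    {"age": 27, "region": "Fife"},
    {"age": 63, "region": "Highland"},
    {"age": 45, "region": "Lothian"},
]

model = fit_model(real)
synthetic = synthesise(model, n=5, seed=42)
for row in synthetic:
    print(row)
```

Because this sketch models each variable separately, it keeps the overall shape of each variable but loses any relationship between them. Tools built for research use model those relationships as well, which is what allows synthetic data to retain more of the patterns in the original.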
Why use synthetic data?
As outlined in the recent ADR UK report on synthetic data, there are clear uses for this type of data. Two significant purposes that can benefit researchers in administrative data research are training and testing.
Synthetic data is a useful tool for training because it allows researchers to practise handling datasets and become familiar with them without requiring full access to secure environments. It also means researchers can test their code and explore the kinds of analysis they can undertake.
For example, synthetic data can help PhD students prepare their research whilst waiting for data access, which can often be a lengthy process! Even ‘surface level’ information about the protected data, such as variable names, can help in producing a synthetic structure, so a researcher can make sure their code runs without errors and check that they’re getting the expected (dummy) results. This means that when they do get the data, analysis is much quicker.
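The preparation step described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a recommended workflow: the schema, the variable names (`person_id`, `age`, `sex`, `weekly_hours`), the value ranges and the helpers `make_dummy_rows` and `mean_hours_by_sex` are all hypothetical. The values are random placeholders, so the results mean nothing; the point is only to check that the analysis code runs against data with the right structure.

```python
import random

# Hypothetical schema: variable names and types as they might appear in
# dataset documentation. Ranges and categories are invented placeholders.
SCHEMA = {
    "person_id": ("id", None),
    "age": ("int", (16, 90)),
    "sex": ("category", ["F", "M"]),
    "weekly_hours": ("float", (0.0, 60.0)),
}


def make_dummy_rows(schema, n, seed=0):
    """Build n rows of random placeholder values matching the schema."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        row = {}
        for name, (kind, spec) in schema.items():
            if kind == "id":
                row[name] = i + 1
            elif kind == "int":
                row[name] = rng.randint(spec[0], spec[1])
            elif kind == "float":
                row[name] = round(rng.uniform(spec[0], spec[1]), 1)
            elif kind == "category":
                row[name] = rng.choice(spec)
            else:
                raise ValueError(f"Unknown variable type: {kind}")
        rows.append(row)
    return rows


def mean_hours_by_sex(rows):
    """Example analysis to be run later on the real data."""
    totals = {}
    for row in rows:
        total, count = totals.get(row["sex"], (0.0, 0))
        totals[row["sex"]] = (total + row["weekly_hours"], count + 1)
    return {sex: total / count for sex, (total, count) in totals.items()}


dummy = make_dummy_rows(SCHEMA, n=100, seed=1)
result = mean_hours_by_sex(dummy)
print(result)
```

Once access to the real data is granted, the same analysis function can be pointed at the real rows, with the dummy generator set aside.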
Researchers can also show data controllers the synthetic data and explain exactly what they want to do with it, so that controllers can better understand how the data will be used and be reassured.
What tools are out there?
There are various tools available or in development for creating synthetic data. Here at SCADR, our colleagues Professor Chris Dibben, Professor Gillian Raab and Dr Beata Nowok created a synthetic data package called Synthpop several years ago (listening to synthpop music is optional!). Synthpop for R (ideally used with RStudio) allows users to create synthetic versions of confidential, individual-level data for use by researchers. To find out more about Synthpop and explore its range of resources, please visit the Synthpop website.
Since it was first made available as an open-source package in 2014, it has been widely used by a variety of groups, including National Statistical Offices. As well as supporting a variety of methods of creating synthetic data, the package provides tools for evaluating the utility/fidelity of synthetic data and for assessing disclosure risk; the disclosure risk tools are not yet fully developed but are currently being expanded. Downloads of the package have increased in recent years: since mid-2020 there have consistently been over 2,000 per month.
Staff at the Scottish Longitudinal Study (SLS) use the package to provide datasets for preliminary analysis to users of the SLS. We have also created synthetic datasets to use in training courses.
Further information
SCADR researcher Professor Gillian Raab received £30,000 in funding from Research Data Scotland at the start of 2023 to 'Review and evaluate methodology on how to measure the disclosure risks from synthetic data'. We look forward to sharing results from this project later this year.
Gillian’s work has also recently informed and contributed to the United Nations Economic Commission for Europe (UNECE) Synthetic Data for Official Statistics - Starter Guide.
This article was published on 24 Feb 2023