BLOG - Creation of large road network distance matrix for research on commuting and health

Working closely with National Records of Scotland (NRS), we created a dataset of Scotland's road networks for our research project on how commuting can affect a person's health and wellbeing, and for use in future research on transport and travel.

Introduction

As part of the Census, the National Records of Scotland (NRS) collects data on the home and work postcode of respondents which can be used to analyse commuting across Scotland.

This is a valuable dataset with a number of uses, but it is classified as personal data and must be handled accordingly. For this project, that meant the analysis was handled internally by NRS, and the results were aggregated and anonymised to an acceptable level before the final data book could be released.

As the GIS (Geographic Information Systems) Analyst, part of my job was to find the road network distance from the home postcode to the work postcode for people in the 2001 and 2011 Census. This data would then be forwarded to colleagues in NRS who would produce the final data book.

NRS digitise postcodes as both a polygon and a grid reference point. The grid reference point was chosen as the best way to represent postcodes for this project.

The total number of unique home-to-work postcode pairs was over 900,000 for each Census year, covering all destinations in the four large cities in Scotland (Glasgow, Edinburgh, Aberdeen and Dundee).

Method

We used the Ordnance Survey Integrated Transport Network (ITN) for 2011 as our road network data. The analysis was carried out in FME, a piece of software that allows the user to build workbenches consisting of three main parts:

Reader(s), which read in the data
Transformers, which perform the analysis
Writer(s), which write out the final data

The workbench created for this analysis contained a writer to produce the final output table, plus two readers, which:

Read in the table of home-to-work postcode pairs (including the coordinates of each postcode grid reference)
Read in the ITN road network data

We also had three groups of transformers:

Convert the home-to-work postcode table to a list. This ensures each pair is analysed individually and not as a continuous route
Plot the postcode grid reference points and snap each one to the closest node on the road network
Calculate the shortest path between each postcode pair (flagging any with no joining path)
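
The three transformer groups can be sketched outside FME in a few lines of Python. This is only an illustration under stated assumptions, not the workbench we built: the toy network, its node names and coordinates, and the helper functions (`build_graph`, `snap_to_nearest_node`, `shortest_path_length`, `network_distances`) are all hypothetical. Dijkstra's algorithm is assumed as the shortest-path method, and link lengths are taken as straight lines between nodes.

```python
import heapq
import math

# Hypothetical toy road network: node id -> (easting, northing) in metres.
NODES = {
    "A": (0.0, 0.0),
    "B": (100.0, 0.0),
    "C": (100.0, 100.0),
    "D": (0.0, 100.0),
    "E": (500.0, 500.0),  # isolated node, not joined to the rest
}

# Road links as node pairs; every link is treated as two-way.
LINKS = [("A", "B"), ("B", "C"), ("C", "D")]


def build_graph(nodes, links):
    """Build an undirected adjacency list weighted by straight-line link length."""
    graph = {node: [] for node in nodes}
    for u, v in links:
        length = math.dist(nodes[u], nodes[v])
        graph[u].append((v, length))
        graph[v].append((u, length))
    return graph


def snap_to_nearest_node(point, nodes):
    """Return the id of the network node closest to a grid reference point."""
    return min(nodes, key=lambda n: math.dist(point, nodes[n]))


def shortest_path_length(graph, source, target):
    """Dijkstra's algorithm; returns None when no path joins the two nodes."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, math.inf):
            continue  # stale queue entry, a shorter route was already found
        for neighbour, length in graph[node]:
            candidate = dist + length
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return None


def network_distances(pairs, nodes, links):
    """Process each home/work pair on its own, flagging pairs with no path."""
    graph = build_graph(nodes, links)
    results = []
    for home_xy, work_xy in pairs:
        home_node = snap_to_nearest_node(home_xy, nodes)
        work_node = snap_to_nearest_node(work_xy, nodes)
        distance = shortest_path_length(graph, home_node, work_node)
        results.append({"distance": distance, "no_path": distance is None})
    return results
```

Each pair is handled independently inside the loop, which mirrors the reason for converting the table to a list: the pairs are never chained into one continuous route. A pair whose snapped nodes are not connected comes back with `no_path` set rather than a distance.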

Dataset

As with any dataset, there are some assumptions and limitations.

The road network data did not contain any information on travel direction (one-way versus two-way streets) or junctions. Because of this, all roads were classed as two-way and all roads that crossed were classed as having full junctions. In addition, FME only calculates distances to and from nodes on the network, so every postcode grid reference point had to be snapped to the road network. In doing so, no grid reference point crossed its postcode boundary as defined by the polygons.
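
One way to check that a snapped point stays inside its postcode polygon is a standard ray-casting point-in-polygon test. The sketch below is an assumption about how such a check could be done, not a description of how it was done in FME. The function names and the rectangular polygon are invented for illustration.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point lies inside polygon.

    polygon is a list of (x, y) vertices in order; the last vertex is
    joined back to the first. Points exactly on an edge are not handled
    specially in this sketch.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray running right from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def flag_boundary_crossings(snapped_points, polygon):
    """Return the snapped points that fall outside the postcode polygon."""
    return [p for p in snapped_points if not point_in_polygon(p, polygon)]


# Hypothetical postcode polygon (a 200 m x 100 m rectangle).
POSTCODE_POLYGON = [(0.0, 0.0), (200.0, 0.0), (200.0, 100.0), (0.0, 100.0)]
```

An empty list from `flag_boundary_crossings` would correspond to the situation described above, where no snapped point left its postcode.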

The Future?

This data will be used for the commuting and health project, one of the projects in the Lifelong health and wellbeing strand.

The data will also be of great value for further studies of transport and travel behaviour.

This article was published on 03 Mar 2021

Author

David Rice & Zhiqiang Feng

Dr Zhiqiang Feng
