Novel approaches to the Living Costs and Food Survey

This project explores the application of computer vision and Natural Language Processing (NLP) techniques to the Office for National Statistics (ONS) Living Costs and Food Survey (LCF). Specifically, we will produce a set of tools for automatically extracting text from scanned shopping receipts using optical character recognition (OCR), and then converting this unstructured text into tabular form using various NLP techniques.
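As a rough illustration of the second step (turning OCR output into tabular rows), the sketch below parses receipt text with a regular expression. It is not the project's method: the line layout (a description followed by a price), the `ReceiptLine` type, the `LINE_PATTERN` pattern and the `parse_receipt` function are all hypothetical, and the sample text is invented for the example.

```python
import re
from typing import List, NamedTuple


class ReceiptLine(NamedTuple):
    """One tabular row: an item description and its price (hypothetical schema)."""
    description: str
    price: float


# Assumed layout: free-text description, whitespace, optional pound sign,
# then a price with two decimal places at the end of the line.
LINE_PATTERN = re.compile(r"^(?P<desc>.+?)\s+£?(?P<price>\d+\.\d{2})$")


def parse_receipt(ocr_text: str) -> List[ReceiptLine]:
    """Convert raw OCR text into rows, skipping lines that carry no price."""
    rows = []
    for raw in ocr_text.splitlines():
        match = LINE_PATTERN.match(raw.strip())
        if match is None:
            continue  # e.g. shop name or footer text
        rows.append(
            ReceiptLine(match.group("desc").strip(), float(match.group("price")))
        )
    return rows


# Invented sample standing in for OCR output from a scanned receipt.
sample = """SUPERMARKET LTD
SEMI SKIMMED MILK 2L   1.25
WHOLEMEAL BREAD  £0.95
THANK YOU"""

rows = parse_receipt(sample)
for row in rows:
    print(f"{row.description}\t{row.price:.2f}")
```

A fixed pattern like this would also pick up totals and other priced lines, and real OCR output is noisier than the sample, which is one reason a learned NLP approach may be preferable to hand-written rules.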

Team members

  • Lan Benedikt
  • Ian Grimstead
  • Alex Noyvirt
  • Jeremy Rowe
  • David Pugh
  • Emily Tew
  • Sharon Hook

The need

Explore the use of receipt scanning data and barcode scanning data to replace manual data entry of the LCF diaries. The purpose is to reduce respondent burden and make efficiency savings. We will benchmark the automated process against the current manual process; the success measures are speed and data quality.

Impact

Clear goal to improve a production process. Possibility for knowledge sharing with other national statistical institutes (NSIs). Opportunity to reuse work from other Campus projects, e.g. Optimus.

Data science

  • Image processing
  • OCR
  • Supervised and unsupervised machine learning (ML)
  • NLP
  • Data linking

Stakeholders

  • LCF team in Social Survey Division
  • ONS Prices Division
  • Statistics Netherlands
  • Statistics Austria
  • Statistics Finland
  • Statistics Slovenia

Further information

Please contact datasciencecampus@ons.gov.uk for more information.

Updates

  • No updates yet.
