DSC-70 Novel approaches to the Living Costs and Food Survey

This project explores the application of computer vision and Natural Language Processing (NLP) techniques to the Office for National Statistics (ONS) Living Costs and Food Survey (LCF). Specifically, we will produce a set of tools that automatically extract text from scanned shopping receipts using optical character recognition (OCR) and then convert this unstructured text into tabular form using various NLP techniques.
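As a rough illustration of the second step (turning OCR text into a table), the sketch below parses receipt lines with a regular expression. It is not the project's method: the line layout, the `LINE_PATTERN` expression, the `parse_receipt` function and the sample receipt are all assumptions made for this example, and real receipts would need far more robust handling.

```python
import re
from decimal import Decimal

# Assumed layout: an item description followed by a price such as 1.25 or £1.25.
LINE_PATTERN = re.compile(r"^(?P<item>.+?)\s+£?(?P<price>\d+\.\d{2})$")


def parse_receipt(ocr_text):
    """Turn raw OCR text into a list of {"item", "price"} rows (hypothetical helper)."""
    rows = []
    for line in ocr_text.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match is None:
            # Skip store names, blank lines and anything that does not end in a price.
            continue
        rows.append(
            {
                "item": match.group("item").strip(),
                "price": Decimal(match.group("price")),
            }
        )
    return rows


# Invented sample text standing in for OCR output.
sample = """EXAMPLE STORE
SEMI SKIMMED MILK 2L   £1.25
WHOLEMEAL BREAD        1.10
THANK YOU"""

rows = parse_receipt(sample)
for row in rows:
    print(row["item"], row["price"])
```

A pattern this simple would also treat a line such as a receipt total as an item, which is one reason a rule-based parser alone is unlikely to be sufficient.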

Team members

  • Lan Benedikt
  • Chaitanya Joshi
  • Sharon Hook

The need

Explore the use of receipt scanning data and barcode scanning data to replace manual data entry of the LCF diaries. The purpose is to reduce respondent burden and make efficiency savings. We will benchmark the automated process against the current manual process, with speed and data quality as the success measures.
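One way the data quality side of such a benchmark might be expressed is as the share of manually keyed entries that the automated process reproduces exactly. The sketch below assumes that definition; the `agreement_rate` function and the example rows are hypothetical and are not taken from the project.

```python
def agreement_rate(manual_rows, automated_rows):
    """Share of manually keyed (item, price) pairs reproduced exactly by the automated output."""
    if not manual_rows:
        return 0.0
    automated = set(automated_rows)
    matched = sum(1 for row in manual_rows if row in automated)
    return matched / len(manual_rows)


# Invented example: the automated process misreads one price.
manual = [("MILK", "1.25"), ("BREAD", "1.10"), ("EGGS", "2.00")]
automated = [("MILK", "1.25"), ("BREAD", "1.18"), ("EGGS", "2.00")]

rate = agreement_rate(manual, automated)
print(f"Agreement with manual entry: {rate:.0%}")
```

Exact matching is deliberately strict. A real comparison would probably need to tolerate differences in item wording and to treat the manual entries as potentially imperfect too.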


There is a clear goal to improve a production process. There is also scope for knowledge sharing with other National Statistical Institutes (NSIs) and an opportunity to reuse work from other Data Science Campus projects, such as Optimus.

Data science

  • Image processing
  • OCR
  • Supervised and unsupervised machine learning (ML)
  • NLP
  • Data linking


  • LCF team in Social Survey Division
  • ONS Prices Division
  • Statistics Netherlands
  • Statistics Austria
  • Statistics Finland
  • Statistics Slovenia

Further information

Please contact datasciencecampus@ons.gov.uk for more information.



