This project explores the application of computer vision and Natural Language Processing (NLP) techniques to the Office for National Statistics (ONS) Living Costs and Food Survey (LCF). Specifically, we will produce a set of tools that automatically extract text from scanned shopping receipts using optical character recognition (OCR) and then convert this unstructured text into tabular form using various NLP techniques.
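As a rough illustration of the second step (turning unstructured receipt text into rows), the sketch below parses text that is assumed to have already been produced by an OCR engine. It is not the project's method: the function name `parse_receipt_lines`, the regular expression, and the sample receipt are all hypothetical, and a simple pattern match stands in here for the NLP techniques the project will investigate.

```python
import re
from decimal import Decimal

# Hypothetical pattern: an item description followed by a price such as
# 1.25 or £1.25 at the end of the line. Real receipts vary far more than this.
LINE_PATTERN = re.compile(r"^(?P<item>.+?)\s+£?(?P<price>\d+\.\d{2})$")


def parse_receipt_lines(ocr_text):
    """Return one dict per line that looks like 'description price'."""
    rows = []
    for raw_line in ocr_text.splitlines():
        match = LINE_PATTERN.match(raw_line.strip())
        if match is None:
            # Headers, footers and anything without a trailing price are skipped.
            continue
        rows.append(
            {
                "item": match.group("item").strip(),
                "price": Decimal(match.group("price")),
            }
        )
    return rows


# Made-up OCR output, used only to exercise the function.
sample = """SUPERMARKET LTD
SEMI SKIMMED MILK 2L   1.25
BANANAS LOOSE  £0.89
THANK YOU
"""

rows = parse_receipt_lines(sample)
for row in rows:
    print(row["item"], row["price"])
```

Prices are held as `Decimal` rather than `float` so that amounts are not affected by binary rounding. A pattern like this would also treat a total or subtotal line as an item, which is one reason a rule-based approach alone is unlikely to be sufficient.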
- Lan Benedikt (Lead)
- Ian Grimstead
- Jeremy Rowe
- Sharon Hook
Explore the use of receipt scanning data and barcode scanning data to replace manual data entry of the LCF diaries. The purpose is to reduce respondent burden and make efficiency savings. We will benchmark the automated process against the current manual process; the success measures are speed and data quality.
There is a clear goal to improve a production process. There is also scope for knowledge sharing with other National Statistical Institutes (NSIs) and an opportunity to reuse work from other Data Science Campus projects, such as Optimus.
- Image processing
- Supervised and unsupervised machine learning (ML)
- Data linking
- LCF team in Social Survey Division
- ONS Prices Division
- Statistics Netherlands
- Statistics Austria
- Statistics Finland
- Statistics Slovenia
Please contact firstname.lastname@example.org for more information.
- No updates yet.