DSC-174 DIT - Digital Trade

Digital retail trade

https://github.com/datasciencecampus/Digital_retail_trade

The Challenge

There is currently no accepted definition of digital trade, nor a comprehensive framework for measuring it. Work by Stats Netherlands estimates that the value of cross-border online consumption of goods in the Netherlands is six times higher than the value estimated by conventional survey methods.

Can the approach taken by Stats Netherlands be applied to the UK and are there any additional approaches (such as credit card data) that can be used to augment or replace the Stats Netherlands approach?

Background

It is currently difficult to capture digital trade

Conventional statistical methods have difficulty capturing:

  • New digitally-enabled business models (e.g. Airbnb, Uber)
  • Cross-border flows of data that are ‘free’ but generate business revenue
  • Imports of digital downloads (films, music, e-books, software) by households
  • Small-value B2C and C2C transactions (below the VAT threshold) via auction sites and marketplace platforms (eBay and Amazon)
  • Changes in the quality of digital goods and services
  • Service activities previously undertaken by businesses that are undertaken by households (online travel booking, online banking)
  • Investment in intangible assets (intellectual property, organisational capital, marketing capital etc.) and its origin/location

Current approach

The current approach to measuring cross-border online purchases and sales uses consumer and business surveys (the ONS Internet Access Survey and the ONS E-Commerce Survey). However, it has the following drawbacks:

  • Sample sizes are small
  • Households and businesses may not know if they engage in cross-border transactions (transactions may be via an intermediary)
  • Surveys do not ask respondents to report the value of transactions (only proportions available)
  • Most questions focus on exports (sales) with limited focus on imports (purchases)
  • Data is not very timely or frequent
  • No breakdowns are available by partner country

Stats Netherlands approach

The Stats Netherlands approach combines data-linking and data science techniques in the following steps:

  1. Obtain tax return data filed by foreign businesses (under EU law, any EU business selling online in the EU has to pay VAT in the country of consumption by filing a tax return)
  2. Link the tax return data with the business register to identify businesses active in retail trade (according to NACE Rev 2). A problem here is that the legal name of a business may differ between the two datasets. Stats Netherlands addresses this by using text mining and data-driven record linkage techniques.
  3. Find the website of the business and assess whether it belongs to a web shop. This is achieved by identifying the presence of a shopping cart on the website using web-scraping and machine learning
  4. Estimate the bias and standard deviation of the estimate
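
Steps 2 and 3 above can be illustrated with a minimal Python sketch. It is not the Stats Netherlands implementation: the record linkage is approximated with simple name normalisation and string similarity, and the machine-learning web shop classifier is replaced by a plain keyword heuristic. The names LEGAL_SUFFIXES, CART_HINTS, normalise_name, best_match and looks_like_webshop, as well as the 0.85 threshold, are assumptions made for illustration only.

```python
import re
from difflib import SequenceMatcher
from html.parser import HTMLParser

# Assumed (hypothetical) legal-form suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"ltd", "limited", "bv", "gmbh", "sarl", "inc", "plc", "llc"}
# Assumed (hypothetical) keywords hinting at a shopping cart.
CART_HINTS = ("cart", "basket", "checkout", "winkelwagen")


def normalise_name(name: str) -> str:
    """Lower-case a name, drop punctuation and legal-form suffixes."""
    cleaned = name.lower().replace(".", "")        # "B.V." -> "bv"
    tokens = re.sub(r"[^a-z0-9 ]", " ", cleaned).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def best_match(name, register_names, threshold=0.85):
    """Return the most similar register name, or None if below threshold."""
    target = normalise_name(name)
    scored = [
        (SequenceMatcher(None, target, normalise_name(r)).ratio(), r)
        for r in register_names
    ]
    score, candidate = max(scored, default=(0.0, None))
    return candidate if score >= threshold else None


class _TextCollector(HTMLParser):
    """Collect attribute values and text content from an HTML page."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        self.values.extend(v.lower() for _, v in attrs if v)

    def handle_data(self, data):
        self.values.append(data.lower())


def looks_like_webshop(html: str) -> bool:
    """Keyword heuristic standing in for the machine-learning classifier."""
    parser = _TextCollector()
    parser.feed(html)
    return any(hint in v for v in parser.values for hint in CART_HINTS)


if __name__ == "__main__":
    register = ["Acme Web Shop B.V.", "Other Retail GmbH"]
    print(best_match("ACME Webshop BV", register))
    print(looks_like_webshop('<a class="btn" href="/cart">View</a>'))
```

In practice the similarity threshold would need tuning against manually checked matches, and a keyword heuristic of this kind will miss shops that render their cart with scripts, which is one reason a trained classifier is used in the approach described above.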

Updates

19 February 2020

  • Draft of the data sources and methods section of the final report, due to be sent to DIT on Friday (21st)
  • Company reference number link files prepped for IDBR linkage, export request from DAP raised
  • Response from glass.ai to data queries received, along with a correction to one of the data files

24 March 2020

  • Extract of non-UK enterprises from IDBR received

Outstanding work:

  • Manual quality checks of glass.ai data
  • IDBR data linkage
  • Estimating turnover of foreign-owned webshops

Notes

This page has been automatically generated.