Scripts to create a collection of open freight transport data sets from Eurostat, Destatis, CBS StatLine and Rijkswaterstaat. See the figure below for an overview. We deliver two inland waterway transport (IWT) data sets: one file covers the total annual IWT performance (billion tonne-kilometres) per EU country from 1970 onwards, and the other contains annual IWT volumes (thousand tonnes) per EU country and goods type. For IWT volumes to, from and via The Netherlands, origin and destination information at NUTS-2 level is also included from 1988 onwards. The IWT performance data set is named ‘eu iwt time series tonkm.csv’. The data set with IWT volumes per goods type is named ‘eu iwt time series goodtypes.csv’. Both files can be found on the main page. The input data sources are also in this repository under GitHub\Freight-Transport-Data\data\sources. The folder structure under 'sources' follows the headings of the figure: CBS, CBS Archive, Destatis and EuroStat. The mappings between the various classifications can be found under 'mappings'. The scripts are all Jupyter Notebook files (.ipynb) and are stored under GitHub\Freight-Transport-Data\src.
Although we attempted to automate the processing as much as possible, some tasks are conducted manually. The code to retrieve IWT volume data from image files (.jpeg) needs to be adjusted manually by the user for every year between 1970 and 1982 ('retrieve_table_from_multiple_images.ipynb'). The provided code is only an example that retrieves IWT volumes from the image file for a single year. Because the structure of the images is not consistent across years and the automatic retrieval is not 100% accurate, this step partly requires manual processing.
The same applies to the generation of the time series for The Netherlands between 1970 and 1987, which uses the processed image files as input ('generate_time_series_nl.ipynb'). From 1982 to 1987, goods types for domestic transport were published at 4 NST-R digits; before 1982, at 3 NST-R digits. Goods types for international transport were published at 2 digits. Therefore, there are two different pieces of code to prepare for the mapping from NST-R to NST2007. The user can simply uncomment the piece that refers to the relevant period.
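The digit levels described above can be summarised in a small helper, which may help when deciding which of the two code pieces to uncomment. This is only an illustrative sketch and not code from 'generate_time_series_nl.ipynb': the function name `nstr_digit_level` and its parameters are hypothetical, and it assumes that the 2-digit level for international transport holds for the whole 1970-1987 period.

```python
def nstr_digit_level(year: int, international: bool) -> int:
    """Return the number of NST-R digits at which goods types were published.

    Hypothetical helper for illustration; the notebook itself switches
    between periods by uncommenting the relevant piece of code.
    """
    if not 1970 <= year <= 1987:
        raise ValueError("only the image-based period 1970-1987 is covered")
    if international:
        # Assumption: international transport uses 2 digits in all years.
        return 2
    # Domestic transport: 4 digits from 1982 onwards, 3 digits before 1982.
    return 4 if year >= 1982 else 3


if __name__ == "__main__":
    for year in (1975, 1982, 1987):
        print(year, nstr_digit_level(year, international=False))
```

Making the period boundary explicit in one place is a design choice of this sketch; the repository's approach of keeping two commented alternatives achieves the same result manually.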
The goods type imputation also requires some manual adjustment of the code for France, Poland, Romania, Bulgaria and Croatia ('iwt_eu_goodtype_historical_imputed.ipynb'). The user has to manually enter the country for which the imputation should be executed. Since the missing values for the goods types vary between countries, the imputation period also needs to be entered manually in the code. The imputation periods per country can be derived from Figure 2 in the paper.
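The two manual inputs, country and imputation period, could be checked before running the imputation. The following is a minimal sketch under stated assumptions, not the notebook's actual code: the names `IMPUTATION_COUNTRIES` and `imputation_settings` are hypothetical, and the years are deliberately left as arguments because they have to be read from Figure 2 in the paper.

```python
# Countries for which the imputation is run one at a time (from the text above).
IMPUTATION_COUNTRIES = ("France", "Poland", "Romania", "Bulgaria", "Croatia")


def imputation_settings(country: str, first_year: int, last_year: int) -> dict:
    """Validate and bundle the manually entered inputs (hypothetical helper)."""
    if country not in IMPUTATION_COUNTRIES:
        raise ValueError(f"no goods type imputation is described for {country!r}")
    if first_year > last_year:
        raise ValueError("first_year must not be later than last_year")
    # Inclusive list of years in the imputation period.
    years = list(range(first_year, last_year + 1))
    return {"country": country, "years": years}
```

A call such as `imputation_settings("Poland", first_year, last_year)` would then return the country together with the inclusive list of years, where `first_year` and `last_year` are taken from Figure 2.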
Some steps refer to figures in our data paper, so that readers can reproduce these figures if needed.