Draft Standard Investigation Report

Welcome to the Open Contracting Data Spike

The objective was to spend 5 days exploring the feasibility of developing a common standard for open contracting data and scoping the opportunities and challenges.

This work was "supply-side" only: it examined a number of available data sets rather than the "demand-side" question of what information stakeholders want.

In addition to examining feasibility, opportunities and challenges, we produced a draft standard to give people something on paper to engage with. The draft included here is in no way an accepted or agreed standard, merely a talking point.

This work was presented at the ODI with http://www.open-contracting.org/ on March 28th 2013.

The following data was looked at during this project:

The Guinea data was not used in the mapping because it was too dissimilar from the other data sets to fit the mapping process. However, bringing in extractive industry and other such disparate data will be an important part of future work. The US data was only partially examined, specifically for the transaction data it makes available.

Additional information, including the presentation and the code used to produce this report, is available on GitHub - https://github.com/birdsarah/oc-datamerge-spike

Draft Standard

Based on the very preliminary investigation, the following draft standard was created, intended only as a starting point for discussion.

Visualised Structure

This structure can be organized into blocks of data that allow us to visualize the different data points that different data sets contain. Hovering over a square will show a key that can be referenced in the standard above.



We overlaid all the datasets, recording which dataset had which data points, to build a map of data coverage. The darker the blue, the more datasets had that particular data point. This shows us that Organization, Meta, Bid, and Award information is more commonly available in the datasets we examined than performance and termination data.
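
The overlay described above amounts to counting, for each data point, how many datasets contain it. The following is a minimal sketch of that idea, not the code used for this report: the dataset names (`dataset_a` etc.), the field keys, and the function names are all hypothetical placeholders chosen for illustration.

```python
from collections import Counter

# Hypothetical inventories: which data points each dataset contains.
# Neither the dataset names nor the field keys come from the real data.
DATASET_FIELDS = {
    "dataset_a": {"organization.name", "meta.source", "bid.date", "award.amount"},
    "dataset_b": {"organization.name", "meta.source", "award.amount", "performance.status"},
    "dataset_c": {"organization.name", "bid.date", "award.amount"},
}


def field_coverage(dataset_fields):
    """Count how many datasets contain each data point."""
    counts = Counter()
    for fields in dataset_fields.values():
        # set() guards against a field being listed twice in one dataset
        counts.update(set(fields))
    return counts


def coverage_share(dataset_fields):
    """Express each count as a fraction of all datasets (0.0 to 1.0).

    A value like this could drive the shade of blue in a coverage map.
    """
    total = len(dataset_fields)
    counts = field_coverage(dataset_fields)
    return {field: n / total for field, n in counts.items()}


if __name__ == "__main__":
    for field, n in sorted(field_coverage(DATASET_FIELDS).items()):
        print(f"{field}: {n} of {len(DATASET_FIELDS)} datasets")
```

A data point that appears in no dataset simply has a count of zero, which corresponds to the lightest squares in the map.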


Here we look at which fields each dataset contains.