The decennial census of the United Kingdom costs our country ~£500,000,000 and is the biggest, most expensive to collect and most important set of Open Data that exists. Yet the creation of what many power users need, a single, simple, consistent national file containing the core small area data, is not something the national census agencies do. Today the raw data needed to produce that file are being released for England and Wales.

This leaves the power users of census data in a tricky spot, with three options. Each of the several hundred businesses that require this file can produce its own. We can each pay to license it from a census distributor, agreeing to additional license terms and conditions. Or someone can do the processing work once and re-release the processed data under the same truly open terms the censuses use.

GeoLytix are taking the plunge and will be processing today’s detailed England and Wales data and making our output available as Open Data. We will produce a single flat file containing the core ~250 Output Area (OA) variables. It should be ready in a month or so. When Scotland and Northern Ireland’s data is released this summer, we will add it. We will be producing data dictionaries and a user guide, and ensuring all the data is consistent and logical. We will also supply notes on any adjustments we have to make to get the three different national datasets consistent.

Data is, well, just data. The act of processing it is, to be blunt, not that exciting or innovative. But once we have the full data in a single easy-to-manipulate form we can all get on with the good stuff: merging with the Living Costs Survey to produce consumption estimates, creating bespoke small area classifiers, building rocket-fast online area reporting tools, using the Land Registry Price Paid data and ONS wealth survey to create postcode-level asset models, modelling personal income using the Labour Force Survey, using VOA non-domestic ratings lists for high-fidelity business demographics, merging in the NHS prescribing data for health modelling, using public transport flow data to understand complex transport systems… The options are limited only by our imaginations. We hope all of you, large and small, get busy doing great work.

The core pack will cover the topics below and will be made up of about 250 variables. The data will be available as simple text files in both long/thin and short/fat structures, i.e. <OAID>, <VariableID>, <Value> and <OAID>, <Variable1>, <Variable2>…<VariableN>. Creating all the XML, JSON, shapefile, Excel or bak files you need from those will take minutes of work, not weeks. The data will come under a minimal OGL-like license and will be freely available without log-ins or any data harvesting. Just click a link and get to the data.
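
To make the two layouts concrete, here is a minimal sketch of turning the long/thin form into the short/fat form using only the Python standard library. The column order follows the description above; the OA codes, variable IDs and values in the sample are made up for illustration, and the real files may differ in delimiter, header names and encoding.

```python
import csv
import io
from collections import defaultdict


def long_to_wide(long_rows):
    """Pivot (OAID, VariableID, Value) rows into one row per Output Area."""
    values_by_oa = defaultdict(dict)
    variables = []  # variable IDs in order of first appearance
    for oa_id, variable_id, value in long_rows:
        if variable_id not in variables:
            variables.append(variable_id)
        values_by_oa[oa_id][variable_id] = value
    header = ["OAID"] + variables
    # Missing OA/variable combinations are left as empty strings.
    table = [
        [oa_id] + [cells.get(v, "") for v in variables]
        for oa_id, cells in values_by_oa.items()
    ]
    return header, table


# Made-up sample in the long/thin layout (illustrative IDs and values only).
sample = """OAID,VariableID,Value
OA0001,V001,120
OA0001,V002,57
OA0002,V001,98
OA0002,V002,61
"""

reader = csv.reader(io.StringIO(sample))
next(reader)  # skip the header row
header, table = long_to_wide(reader)

# Write the short/fat layout back out as CSV text.
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(header)
writer.writerows(table)
wide_csv = out.getvalue()
print(wide_csv)
```

Going the other way is the same loop in reverse: for each wide row, emit one (OAID, VariableID, Value) line per column. For files of a few hundred thousand Output Areas, a dataframe library would do the same pivot in one call, but the logic is no more than this.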

I set GeoLytix up 15 months ago to help organisations solve problems where location is vital. The growth of the Open Data movement is a key enabler for us. We produce a number of high-value derived datasets alongside our contributions back to the Open Data community. If you would like to make any suggestions about the census packs, or want to know more about our data or services, please get in touch.

  • Age by Gender
  • Building Type
  • Cars Present
  • Change 2001-2011
  • Communal Establishments
  • Country of Birth
  • Dwelling Rooms
  • Economic Activity
  • Employment
  • Ethnicity
  • Family Structure
  • Health
  • Household Composition
  • Household Tenure
  • Industry Employment
  • Method of Return
  • NS SEC classifications
  • Occupation
  • Pop/Hhd Denominators
  • Qualifications
  • Religion
  • Second Addresses
  • Social Grade
  • Students
  • Travel to Work/Study
  • Year of Arrival in UK