Hack day tips for working with IATI Data

The International Aid Transparency Initiative (IATI) has led to a lot of data on aid projects becoming available.

In preparation for the Open Government Partnership hack days in Brasilia this Tuesday and Wednesday, I’ve put together a few quick tips for getting started with IATI data, whether you are planning to make it the foundation of a hack or you want to link aid project information into another project you are working on.

What is in the datasets?

IATI is an XML standard for representing data on aid budgets and projects and their associated documents and resources. There are two kinds of IATI file:

  • Organisation files - these include basic information about donors and may contain details of their planned budgets, or relevant programme documents. There are only a few of these published so far.
  • Activity files - these include details of aid-funded projects, including locations, budgets and spending, and often associated documents and resources. Most donors publish one activity file for each country they work in, so there are hundreds of these files.
All the available IATI files are listed on the IATI Registry (although they are hosted on the respective donors’ websites), and details of each file can be accessed using the CKAN API.
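
As a starting point for finding files through the registry, here is a minimal Python sketch. It assumes the registry exposes the standard CKAN action API (`package_search`) at the base URL shown; that URL and the example dataset name are my assumptions, so check the registry pages before relying on them. The parsing function works on the usual CKAN response shape, where each dataset lists its files as `resources`.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the registry serves the standard CKAN action API at this base URL.
REGISTRY_API = "https://iatiregistry.org/api/3/action"


def package_search_url(query="", rows=10, start=0):
    """Build a CKAN package_search URL for the registry."""
    params = urlencode({"q": query, "rows": rows, "start": start})
    return f"{REGISTRY_API}/package_search?{params}"


def file_urls(response):
    """Pull (dataset name, file URL) pairs out of a package_search response."""
    if not response.get("success"):
        raise ValueError("CKAN reported a failed request")
    pairs = []
    for package in response["result"]["results"]:
        for resource in package.get("resources", []):
            pairs.append((package["name"], resource["url"]))
    return pairs


def fetch_file_urls(query="", rows=10):
    """Network call: ask the registry where the matching files are hosted."""
    with urlopen(package_search_url(query, rows)) as reply:
        return file_urls(json.load(reply))
```

Because the files themselves live on donor websites, the URLs returned here are what you would download next; the registry only holds the listing.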

What could you do with the data?

The strength of IATI data is in having granular information on individual aid projects. In the future it will be possible to look at changing funding patterns, but right now many donors are only publishing recent projects – so think about uses that take advantage of information on individual projects.

We’ve started to gather some stories about potential users of aid data to act as inspiration for hack day ideas. Getting aid information into the hands of those who can use it to make a difference is one of the big areas of focus for IATI-related projects right now – from developing tools to help researchers dig into the available information, to projects that make it easy to print out project information and take it directly to communities.

Projects might also look at how to augment the available data: adding annotations to aid projects to help transparency and accountability activists share knowledge about particular projects – or linking up IATI data with other transparency and accountability data and documents.

 

Technical approaches

The IATI XML Standard is documented at http://www.iatistandard.org where you will also find the IATI Knowledge Base, with tips for working with the data and space to report any issues you come across to the technical team looking after the XML standard.

If you just want to work with a single IATI file, or a few files:

  • You could use your preferred tools to read the XML directly;
  • You could use Google Refine, which can read small XML files and turn them into tabular data;
  • You could make use of the hosted CSV Conversion tool (also linked from registry pages), or adapt the XSLT it uses for your own purposes to flatten out a file for use in spreadsheet or statistical software.
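
If you would rather read the XML directly, a few lines of Python can flatten an activity file into rows for a spreadsheet. This is only a sketch, not the hosted conversion tool's method: the sample data is invented, it picks just three fields (`iati-identifier`, `title`, and the `code` attribute of `recipient-country`), and the helper names are my own. Titles are read with `itertext()` so that text nested inside a child element is picked up as well.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Invented sample data, only for illustration.
SAMPLE = """<iati-activities>
  <iati-activity>
    <iati-identifier>XX-1-0001</iati-identifier>
    <title>Rural water supply</title>
    <recipient-country code="BR"/>
  </iati-activity>
  <iati-activity>
    <iati-identifier>XX-1-0002</iati-identifier>
    <title><narrative>School feeding</narrative></title>
    <recipient-country code="MZ"/>
  </iati-activity>
</iati-activities>"""


def text_of(element):
    """Join all text inside an element and normalise whitespace."""
    if element is None:
        return ""
    return " ".join("".join(element.itertext()).split())


def activity_rows(root):
    """Yield one flat dict per iati-activity element."""
    for activity in root.iter("iati-activity"):
        country = activity.find("recipient-country")
        yield {
            "identifier": text_of(activity.find("iati-identifier")),
            "title": text_of(activity.find("title")),
            "country": country.get("code", "") if country is not None else "",
        }


def to_csv(xml_text):
    """Return the flattened activities as CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["identifier", "title", "country"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(activity_rows(ET.fromstring(xml_text)))
    return out.getvalue()


print(to_csv(SAMPLE))
```

Real files carry many more elements (and activities can list several countries), so treat the field list as something to extend for your own hack.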
If you want to work with a lot of files, you might want to have access to the information in database form. A number of projects have been building data stores, APIs and ways to query and access IATI data. These vary in terms of how complete the data they currently hold is, but they should get you started building against live data.
  • data.aidinfolabs.org hosts an API which is running against selected collections of IATI activities, put together for the AidView application. This API offers a number of aggregation features, and has pre-computed useful values like a ‘total project size’ (based on the highest of commitments or spending), and performed some basic currency conversion to USD.
  • The IATI Explorer site aggregates raw IATI data files on a regular basis into an XML database. You can run XPath queries over the data using a RESTful API, and apply XSLT to the results. Whilst you can run basic count and sum operations over the data (using XPath functions), the strength of this API is in retrieving filtered sets of documents.
There’s lots more shared source code and ideas for working with IATI data in the ‘Inspiration’ sections of AidInfoLabs.
If you want to find out what codes in the XML mean, you might want to take a look at the Codelist documentation, and Codelist API.
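
If you are working from raw files rather than an API, you may want to compute something like the ‘total project size’ mentioned above yourself. The sketch below is one reading of “the highest of commitments or spending”, not necessarily how the AidView API calculates it. The default codes (`C` for commitments, `D` and `E` for disbursements and expenditure) are my assumption about the transaction type codelist; check the Codelist documentation for the files you are using.

```python
def project_size(totals_by_type, commitment_codes=("C",), spending_codes=("D", "E")):
    """Return the larger of total commitments and total spending.

    totals_by_type maps a transaction type code to a summed amount,
    all in one currency. The default codes are an assumption.
    """
    committed = sum(totals_by_type.get(code, 0) for code in commitment_codes)
    spent = sum(totals_by_type.get(code, 0) for code in spending_codes)
    return max(committed, spent)


# A project with commitments recorded, and one with only spending recorded.
print(project_size({"C": 1000, "D": 400, "E": 250}))
print(project_size({"D": 400, "E": 250}))
```

Taking the maximum gives a usable figure even when a donor publishes only commitments or only spending for a project.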

Things to watch out for

IATI data is provided directly by donors. There’s a technical team working to help donors improve their published data, but right now applications might need to:
  • Watch out for really big files: Most IATI files are just a few MB in size, but there are one or two really large files (from donors who publish all their information in one file, rather than splitting by country, or where granular transaction data is available).
  • Be flexible about code-lists: Not everyone uses the same ‘sector codes’ or region codes.
  • Be careful with aggregation: IATI lets donors record a range of different transaction types (commitments, disbursements, spending etc.). If you are aggregating financial figures, be aware of the definitions of these, and of the current limitations of the data (e.g. some donors don’t publish commitments that far into the future).
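
To illustrate the first and last points together, here is a hedged Python sketch that streams through a file with `iterparse` instead of loading it all into memory, and keeps totals separate for each transaction type code and currency rather than adding everything up. It assumes each `transaction` has a `transaction-type` element with a `code` attribute and a `value` element with an optional `currency` attribute, falling back to the activity's `default-currency`. The function name is my own, and values that are not plain numbers would need extra handling.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict
from decimal import Decimal


def transaction_totals(source):
    """Sum transaction values per (type code, currency), one activity at a time.

    source is a filename or a binary file object.
    """
    totals = defaultdict(Decimal)
    for _, activity in ET.iterparse(source, events=("end",)):
        if activity.tag != "iati-activity":
            continue
        default_currency = activity.get("default-currency", "unknown")
        for transaction in activity.iter("transaction"):
            kind = transaction.find("transaction-type")
            value = transaction.find("value")
            if kind is None or value is None or not (value.text or "").strip():
                continue  # skip transactions we cannot interpret
            currency = value.get("currency", default_currency)
            code = kind.get("code", "unknown")
            totals[(code, currency)] += Decimal(value.text.strip())
        activity.clear()  # release this activity's children before moving on
    return dict(totals)


# Invented sample data, only for illustration.
sample = io.BytesIO(b"""<iati-activities>
  <iati-activity default-currency="USD">
    <transaction><transaction-type code="C"/><value>1000</value></transaction>
    <transaction><transaction-type code="D"/><value>400</value></transaction>
  </iati-activity>
  <iati-activity default-currency="USD">
    <transaction><transaction-type code="C"/><value>500</value></transaction>
    <transaction><transaction-type code="D"/><value currency="EUR">250.50</value></transaction>
  </iati-activity>
</iati-activities>""")

for key, amount in sorted(transaction_totals(sample).items()):
    print(key, amount)
```

Keeping the type code and currency in the key means nothing is silently mixed: commitments never get added to disbursements, and euros never get added to dollars.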

Got a question?

Tweet @timdavies during the OGP for support with using IATI data.

Share your learning

Working with the IATI data? If you discover any new tricks for working with the data, or you overcome particular challenges to make open aid data work for transparency and accountability, you can share them on the IATI-Tools mailing list, or by adding a post to the knowledge hub.
