The International Aid Transparency Initiative (IATI) has led to a lot of data on aid projects becoming available.
In preparation for the Open Government Partnership hack days in Brasilia this Tuesday and Wednesday I’ve put together a few quick tips for getting started with IATI data, whether you are planning to make it the foundation of a hack, or you want to link in aid project information to another project you are working on.
What is in the datasets?
IATI is an XML standard for representing data on aid budgets and projects and their associated documents and resources. There are two kinds of IATI file:
- Organisation files - these include basic information about donors and may contain details of their planned budgets, or relevant programme documents. There are only a few of these published so far.
- Activity files - these include details of aid-funded projects, including locations, budgets and spending, and often associated documents and resources. Most donors publish one activity file for each country they work in, so there are hundreds of these files.
What could you do with the data?
The strength of IATI data is in having granular information on individual aid projects. In the future it will be possible to look at changing funding patterns, but right now many donors are only publishing recent projects – so think about uses that take advantage of information on individual projects.
We’ve started to gather together some stories about potential users of aid data to act as inspiration for hack day ideas. Getting aid information into the hands of those who can use it to make a difference is one of the big areas of focus for IATI related projects right now – from developing tools to help researchers dig into the available information, to projects that make it easy to print out and take project information direct to the communities.
Projects might also look at how to augment the available data: adding annotations to aid projects to help transparency and accountability activists share knowledge about particular projects – or linking up IATI data with other transparency and accountability data and documents.
Technical approaches
The IATI XML Standard is documented at http://www.iatistandard.org where you will also find the IATI Knowledge Base, with tips for working with the data and space to report any issues you come across to the technical team looking after the XML standard.
If you just want to work with a single IATI file, or a few files:
- You could use your preferred tools to read the XML directly;
- You could use Google Refine, which can read in small XML files and turn them into tabular data;
- You could make use of the hosted CSV Conversion tool (also linked from registry pages), or adapt the XSLT it uses for your own purposes to flatten out a file for use in spreadsheet or statistical software.
- data.aidinfolabs.org hosts an API which is running against selected collections of IATI activities, put together for the AidView application. This API offers a number of aggregation features, and has pre-computed useful values like a ‘total project size’ (based on the highest of commitments or spending), and performed some basic currency conversion to USD.
- The IATI Explorer site aggregates raw IATI data files on a regular basis into an XML database. You can run XPath queries over the data using a RESTful API, and apply XSLT to the results. Whilst you can run basic count and sum operations over the data (using XPath functions), the strength of this API is in retrieving filtered sets of documents.
- An OpenAid.nl API is available to query Dutch IATI data. The underlying source code, a Django application which converts IATI XML into a relational database model and then makes it available via an API, is available on GitHub.
- Financial transactions from IATI have been loaded onto the Open Spending platform, which also provides API access to the data. (Read here for more on the Open Spending data.)
- A sub-set of IATI Data has also been converted to Linked Data and published on the Kasabi Platform where it can be accessed with the SPARQL query language.
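If you take the first route and read the XML directly, a minimal Python sketch might look like the one below. It flattens each activity into one row and writes the rows out as CSV, much as the conversion tool does. The sample XML and its identifiers are made up for illustration, and the element and attribute names (iati-activity, iati-identifier, title, recipient-country, default-currency) are my reading of the activity schema, so check them against the standard and the files you are actually working with.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Made-up sample data. Element names are assumptions based on the
# IATI activity schema; verify them against your own files.
SAMPLE = """<iati-activities>
  <iati-activity default-currency="USD">
    <iati-identifier>XX-1-EXAMPLE-001</iati-identifier>
    <title>Rural water supply</title>
    <recipient-country code="BR"/>
  </iati-activity>
  <iati-activity default-currency="EUR">
    <iati-identifier>XX-1-EXAMPLE-002</iati-identifier>
    <title>Teacher training</title>
  </iati-activity>
</iati-activities>"""

FIELDS = ["identifier", "title", "country", "currency"]


def text_of(element):
    """Return all text inside an element (including children), or ''."""
    if element is None:
        return ""
    return "".join(element.itertext()).strip()


def flatten_activities(xml_text):
    """Return one dict per iati-activity, leaving missing fields blank."""
    root = ET.fromstring(xml_text)
    rows = []
    for activity in root.iter("iati-activity"):
        country = activity.find("recipient-country")
        rows.append({
            "identifier": text_of(activity.find("iati-identifier")),
            "title": text_of(activity.find("title")),
            "country": country.get("code", "") if country is not None else "",
            "currency": activity.get("default-currency", ""),
        })
    return rows


def to_csv(rows):
    """Serialise the flattened rows as CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    print(to_csv(flatten_activities(SAMPLE)))
```

Real files will have repeated elements (several countries or sectors per activity, for example), so a one-row-per-activity layout like this only suits a first look at the data.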
Things to watch out for
- Watch out for really big files: Most IATI files are just a few MB in size, but there are one or two really large files (from donors who publish all their information in one file rather than splitting by country, or where granular transaction data is available).
- Be flexible about code-lists: Not everyone uses the same ‘sector codes’ or region codes, so check which codes a file uses before comparing across donors.
- Be careful with aggregation: IATI lets donors record a range of different transaction types (commitments, disbursements, spending etc.). If you are aggregating financial figures, be aware of the definitions of these, and of the current limitations of the data (e.g. some donors don’t publish commitments that far into the future).
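Two of these points, big files and careful aggregation, can be handled together. The sketch below streams through a file one activity at a time rather than loading it all into memory, and keeps a separate total for each combination of transaction type and currency, so that commitments and disbursements are never added together. The sample data is invented, and the assumptions here are that each transaction carries a transaction-type element with a code attribute and a value element with an optional currency attribute, falling back to the activity's default-currency. The codes "C" and "D" are placeholders, so look up the real values in the transaction type code-list.

```python
import io
import xml.etree.ElementTree as ET
from collections import defaultdict
from decimal import Decimal

# Made-up sample data; element names and codes are assumptions to
# check against the IATI standard and the files you are using.
SAMPLE = b"""<iati-activities>
  <iati-activity default-currency="USD">
    <iati-identifier>XX-1-EXAMPLE-001</iati-identifier>
    <transaction>
      <transaction-type code="C"/>
      <value>1000</value>
    </transaction>
    <transaction>
      <transaction-type code="D"/>
      <value>400</value>
    </transaction>
  </iati-activity>
  <iati-activity default-currency="EUR">
    <iati-identifier>XX-1-EXAMPLE-002</iati-identifier>
    <transaction>
      <transaction-type code="D"/>
      <value currency="GBP">250.50</value>
    </transaction>
  </iati-activity>
</iati-activities>"""


def totals_by_type(source):
    """Sum transaction values per (type code, currency) pair.

    `source` is a filename or a binary file object. Values that are
    not plain numbers will raise decimal.InvalidOperation.
    """
    totals = defaultdict(Decimal)
    # "end" events fire once an element has been read in full.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "iati-activity":
            continue
        default_currency = elem.get("default-currency", "unknown")
        for tx in elem.iter("transaction"):
            tx_type = tx.find("transaction-type")
            value = tx.find("value")
            if tx_type is None or value is None:
                continue
            amount = (value.text or "").strip()
            if not amount:
                continue
            code = tx_type.get("code", "unknown")
            currency = value.get("currency", default_currency)
            totals[(code, currency)] += Decimal(amount)
        # Drop the activity's children so memory use stays flat; the
        # emptied element itself remains attached to the root.
        elem.clear()
    return dict(totals)


if __name__ == "__main__":
    for key, total in sorted(totals_by_type(io.BytesIO(SAMPLE)).items()):
        print(key, total)
```

Keeping currencies apart avoids silently adding dollars to euros. If you need a single figure you will have to pick exchange rates and dates yourself, which is a judgement call this sketch leaves out.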