Case Study | The AIDS Data Repository for UNAIDS
In a complex data collection process involving many different public health data sources and stakeholders, how do you ensure data is clean, complete, replicable, consistently formatted, accessible and secure?
Project size | $400 000+
Timeline | 2018 - Present
“Working with Fjelltopp has been a pleasure. The team is responsive to our technical needs, has quickly picked up the complexities of our substantive work, and has been forward-thinking by identifying solutions that our clients will appreciate.”
Dr Mary Mahy
Epidemiology Team Lead, UNAIDS Geneva
Client | UNAIDS is a joint United Nations programme fighting HIV/AIDS. They aim to lead and inspire the world in achieving universal access to HIV prevention, treatment, care and support. Their vision is for zero new HIV infections, zero discrimination and zero AIDS-related deaths.
We work directly with UNAIDS headquarters in Geneva, and with the Epidemiology Team Lead, Dr Mary Mahy. Through this role we have also engaged with stakeholders from governments across Africa, the CDC, PEPFAR, and academic institutions across the globe. Fjelltopp won the work by responding to a request for proposal raised on the UN Global Marketplace.
Context | In pursuing their vision, UNAIDS provides regular estimates of how the HIV epidemic is changing across the world: for instance, the rate of new infections, the number of people living with HIV and the number of people on treatment. These estimates are based on complex mathematical models that are built with data from a vast number of sources around the globe.
The data sources used in the UNAIDS estimates include:
- Health programme data from the countries’ health information systems (e.g. DHIS2)
- Survey data from sources such as DHS and PHIA
- Subnational geographic data agreed with the ministries of health
- Population data from sources such as WorldPop or national census data
We found that different sources used different data formats, and that these formats changed over time. Much of the data required cleaning and validation, but no audit trail was in place to record these changes. The cleaned and validated datasets were then kept on the laptops of local staff as Excel sheets and Word documents, and shared by unencrypted email without clear licensing. As a result, data was forgotten, or lost to staff turnover and equipment failure.
Problem | UNAIDS were aware that the quality of the data inputs was a significant factor in determining the quality of their estimates, so they hired Fjelltopp to help them take control of the data collection effort, harmonise their data inputs, and by doing so improve the quality of their estimates.
Solution | Fjelltopp began by conducting a series of focus groups with stakeholders around the globe, including UN staff, academics and ministries of health. We also reviewed similar work undertaken by other organisations. From this we compiled a software requirements specification for a data portal hosted in the cloud and built with the open-source CKAN project. Following further review from stakeholders, Fjelltopp set about building the data portal in-house for UNAIDS.
The data portal is a staging environment in which data is validated, cleaned and archived for use in the UNAIDS estimation process. It keeps an audit trail for the data as it undergoes multiple rounds of cleaning and validation. Data is organised with the necessary metadata and archived securely on hardware that is encrypted and backed up. Powerful search makes data discoverable, and data owners can configure who can access their data and when. Tools have been developed to automate data pulls from third-party systems such as DHIS2. An API that allows third-party tools to read and write data in the repository is also maintained.
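To give a feel for how a third-party tool might read from a CKAN-based portal, here is a minimal sketch in Python. It assumes the repository exposes the standard CKAN Action API (the `package_search` action) at the base URL shown, which this case study does not confirm. The function names and the sample dataset name are hypothetical and for illustration only.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: the portal serves the standard CKAN Action API at this base URL.
BASE_URL = "https://adr.unaids.org"


def build_search_url(query, rows=5, base_url=BASE_URL):
    """Build a CKAN package_search URL for a free-text query."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url}/api/3/action/package_search?{params}"


def parse_search_response(body):
    """Return (total count, dataset names) from a package_search JSON body."""
    payload = json.loads(body)
    if not payload.get("success"):
        raise RuntimeError(f"CKAN request failed: {payload.get('error')}")
    result = payload["result"]
    return result["count"], [dataset["name"] for dataset in result["results"]]


def search_datasets(query, rows=5, api_token=None):
    """Query the portal; a token is only needed for non-public datasets."""
    request = Request(build_search_url(query, rows))
    if api_token:
        # CKAN reads the API token from the Authorization header.
        request.add_header("Authorization", api_token)
    with urlopen(request, timeout=30) as response:
        return parse_search_response(response.read().decode("utf-8"))
```

Calling `search_datasets("survey")` would return the number of matching datasets and the names of the first few. Which datasets are visible depends on the access rules set by each data owner.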
The project is entitled the AIDS Data Repository and can be viewed at https://adr.unaids.org.
Contact us now if you are wrestling with the problem of having no consolidated space to validate, archive and share your data.