Transparent and trusted data management for epidemiological modelling on an international scale.
The AIDS Data Repository for UNAIDS
Client: UNAIDS Geneva
Project size: $400 000+
Timeline: 2018 – Present
UNAIDS is a joint United Nations programme fighting HIV and AIDS. We work directly with UNAIDS headquarters in Geneva and with the Epidemiology Team Lead, Dr Mary Mahy.
Their goal: To lead and inspire the world to achieve universal access to HIV prevention, treatment, care and support.
Their vision: Zero new HIV infections, zero discrimination, zero AIDS-related deaths.
Through this project we have also engaged with stakeholders from governments across Africa, the CDC, PEPFAR, and academic institutions across the globe.
UNAIDS provides regular estimates of how the HIV epidemic is changing across the world. Some examples include:
- The rate of new infections.
- The number of people living with HIV.
- The number of people currently receiving treatment.
These estimates are based on complex mathematical models that are built with data from a vast number of sources around the globe.
The data sources used in the UNAIDS estimates include:
- Health programme data from countries’ health information systems (e.g. DHIS2).
- Survey data from sources such as DHS and PHIA.
- Subnational geographic data agreed with the ministries of health.
- Population data from sources such as WorldPop or national census data.
Different sources used different data formats and these formats changed over time.
Much of the data required cleaning and validation, but no audit trail was in place to record these changes.
The cleaned and validated datasets were then kept on the laptops of local staff in Excel sheets and Word documents.
They were shared by unencrypted email without clear licensing, and data was forgotten about or lost to staff turnover and equipment failure.
UNAIDS were aware that the quality of the data inputs was a significant factor in determining the quality of their estimates. They brought us on board to help them take control of the data collection effort, harmonise their data inputs and improve the quality of their estimates.
We began by conducting a series of focus groups with stakeholders around the globe, including UN staff, academics and ministries of health. We reviewed and analysed similar work undertaken by other organisations.
From this we compiled software requirements specifications for a data portal hosted in the cloud and built with the open-source CKAN project. Following further review from stakeholders, we set about building the data portal in-house for UNAIDS.
- The data portal is a staging environment to validate, clean and archive data for use in their estimation process.
- It keeps an audit trail for the data as it undergoes multiple rounds of cleaning and validation.
- Data is organised with the necessary metadata and archived securely on hardware that is encrypted and backed up.
- Powerful search enables data discoverability, and data owners have the power to configure who can access their data and when.
- Tools have been developed to automate data pulls from third-party systems (DHIS2).
- An API is also maintained, allowing third-party tools to read data from and write data to the repository.
The project is called the AIDS Data Repository and can be viewed at https://adr.unaids.org.
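As a rough illustration of how a third-party tool might read from the repository, the sketch below builds a dataset search request in Python. It assumes the portal exposes the standard CKAN Action API (the package_search action under /api/3/action/) and accepts an API token in the Authorization header, as a stock CKAN site does. The function names, the example search term and the token handling are hypothetical and are not taken from the project itself.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: the repository serves the standard CKAN Action API at this base URL.
ADR_BASE_URL = "https://adr.unaids.org"


def build_search_request(query, rows=5, api_token=None, base_url=ADR_BASE_URL):
    """Build a request for CKAN's package_search action (no network call here)."""
    params = urlencode({"q": query, "rows": rows})
    url = f"{base_url}/api/3/action/package_search?{params}"
    # Assumption: private datasets need a token sent in the Authorization header.
    headers = {"Authorization": api_token} if api_token else {}
    return Request(url, headers=headers)


def dataset_titles(response_body):
    """Extract dataset titles from a CKAN package_search JSON response."""
    payload = json.loads(response_body)
    if not payload.get("success"):
        raise RuntimeError("CKAN reported an unsuccessful request")
    datasets = payload["result"]["results"]
    # Fall back to the dataset's URL name when no human-readable title is set.
    return [d.get("title") or d.get("name") for d in datasets]


def search_datasets(query, rows=5, api_token=None):
    """Run the search against the live portal; requires network access."""
    request = build_search_request(query, rows=rows, api_token=api_token)
    with urlopen(request, timeout=30) as response:
        return dataset_titles(response.read().decode("utf-8"))
```

A call such as search_datasets("survey") would return the titles of matching datasets visible to the caller. Keeping request building and response parsing in separate functions lets both be checked without contacting the portal.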
What our client said about us
“Working with Fjelltopp has been a pleasure. The team is responsive to our technical needs, has quickly picked up the complexities of our substantive work, and has been forward-thinking by identifying solutions that our clients will appreciate.”
Dr Mary Mahy, Epidemiology Team Lead, UNAIDS Geneva