Setting up a country-wide water quality testing system in Sierra Leone with UNICEF and the Ministry of Water Resources
Access to safe drinking water is a major challenge in Sierra Leone. In 2017, Statistics Sierra Leone (Stats SL) and UNICEF conducted a Multiple Indicator Cluster Survey (MICS) to collect internationally comparable data on a wide range of indicators. The survey revealed that almost 90% of the drinking water at household level contained E. coli bacteria, presenting a serious health threat to citizens.
Figure 1: E. coli contamination as revealed by the survey
This was the first time in recent years that water quality data had been collected at scale, and the exceptionally high results were questioned by state departments and non-governmental organisations (NGOs). What’s more, the data wasn’t comprehensive enough to provide insights into the source of contamination. Was it at household level or at the source?
The Ministry of Water Resources went on to test 300 wells, and the results showed that the source water was safe to drink. This caused other questions to arise. If water at the source is safe, is it definitely contaminated at household level? Is 300 wells enough to conclusively state that water at the source is safe to drink? Could contamination have occurred in the lab, or was it caused by environmental factors such as exposed pipes? If water was becoming contaminated in people’s homes, when exactly was the contamination taking place and how? For the ministry to be able to answer these questions and take steps to improve the quality of drinking water, there was a strong need for more comprehensive and recent data.
In 2018, together with the Ministry of Water Resources in Sierra Leone and UNICEF, we set out to implement a country-wide water quality testing system to identify the levels and root causes of contamination, implement strategies to treat the cause, and monitor the success of the programme at large. We set up the system using the data journey methodology - Design, Capture, Understand and Act - to ensure success from implementation to impact. The data journey is aimed at helping organisations and governments design their programmes so that they can capture and understand reliable data which they can act upon.
Before data collection could begin, we needed to identify the impact that we were aiming to contribute to - safe drinking water in Sierra Leone - and the strategies that would be most effective in getting there. Our strategies to achieve safe water in Sierra Leone were dependent on where the contamination occurred and why. Therefore, we needed to capture data that would answer these key questions and clearly define a data use strategy - how will the data be used to inform decision making?
If contamination occurs at household level, then we need to reduce E. coli within the home by conducting behavioural change campaigns to improve hygiene, sanitation, and water handling practices. To understand where there are gaps in people’s safe water practices, we needed to capture data on the safe water journey - how do people collect, transport, store, and consume their water? We designed the survey to capture data on the safe water journey and corroborated that data with actual water quality results. This way, we could find links between behaviour and risk. For immediate results, household water treatment methods such as chlorination could also be explored.
To build on the water source data collected by the ministry, we decided to test the source once again to include sources that were either built or rehabilitated within the last year. The survey was designed to capture both risks associated with water infrastructure, such as nearby farms or toilets, and actual water quality data. This way, correlations could be explored during data analysis between certain risks and actual contamination.
The sampling strategy included 1,100 UNICEF-supported communities, with two randomly selected households per community tested for E. coli. Furthermore, 600 drinking water sources were tested on 14 parameters, including E. coli, electrical conductivity (EC), pH, Ammonia, Fluoride, Iron, Nitrate, Nitrite, Potassium, Phosphate, Sulfate, and Chlorine.
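To illustrate the household selection step, here is a minimal sketch of drawing two households at random per community. The community names, household identifiers, and the `sample_households` helper are hypothetical; the article does not describe how the random draw was actually carried out.

```python
import random


def sample_households(households_by_community, per_community=2, seed=0):
    """Pick a fixed number of households at random from each community."""
    rng = random.Random(seed)  # seeded so the draw can be reproduced later
    selection = {}
    for community, households in sorted(households_by_community.items()):
        # If a community lists fewer households than requested, take them all.
        k = min(per_community, len(households))
        selection[community] = rng.sample(households, k)
    return selection


# Hypothetical listing: community name -> household identifiers
listing = {
    "community_a": ["hh-01", "hh-02", "hh-03", "hh-04"],
    "community_b": ["hh-05", "hh-06", "hh-07"],
    "community_c": ["hh-08"],
}
selected = sample_households(listing)
```

Seeding the generator is a design choice for this sketch: it lets the same selection be regenerated if the sample ever needs to be audited.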
During programme design, it’s essential that all stakeholders agree upon the approach used and that knowledge is actively shared. Based on input from the ministries, for example, we knew which physicochemical and microbiological parameters were common in the area and which we needed to test. Our aim is always to match the data needs with the data use so that we only capture what is necessary for decision making. This way, we can keep data cleaning to a minimum and analyse and act upon the results quickly and effectively.
This was the first time a water quality testing exercise had been carried out in Sierra Leone at such a large scale. To execute the data collection strategy, 32 ministry mappers in 16 districts were trained to capture water quality data using Akvo’s data platform, which is connected to a smartphone app and proven third party hardware. These mappers were then responsible for training implementing partners - 180 people - thereby bolstering the sustainability of the programme. With newly gained knowledge on water quality, these mappers can continue to serve the district even after the programme has finished.
The data collection process was smooth due to the clear survey design, which left little margin for error, and the user friendly data platform - through the app, data collectors receive step by step instructions for each water quality test to ensure quality collection. In order to ensure data quality on the go, we had two Akvo staff on hand to provide logistical, technical and practical support during the first weeks of data collection.
Figure 2: Data collection tracking dashboard for the water quality monitoring
We also created a dashboard on Akvo’s data platform to update and track anomalies automatically. Using this dashboard (see Figure 2), we ran automatic checks to ensure a match between the barcode on each water quality sample and the water source or household it came from. In case of abnormalities, team leads were contacted. This enabled us to maintain data quality in the field automatically.
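A barcode check of this kind can be sketched as follows. This is not the dashboard's actual logic; the record layout, the `barcode` field name, and the `find_barcode_anomalies` helper are assumptions made for illustration.

```python
from collections import Counter


def find_barcode_anomalies(site_records, test_records):
    """Return barcodes that cannot be matched one-to-one between the two forms."""
    site_codes = [r["barcode"] for r in site_records]
    test_codes = [r["barcode"] for r in test_records]
    test_counts = Counter(test_codes)
    return {
        # A lab result whose barcode was never registered at a source or household
        "tests_without_site": sorted(set(test_codes) - set(site_codes)),
        # A registered sample that has no result yet
        "sites_without_test": sorted(set(site_codes) - set(test_codes)),
        # The same barcode scanned on more than one result
        "duplicated_on_tests": sorted(c for c, n in test_counts.items() if n > 1),
    }


# Hypothetical records from the site form and the water quality test form
sites = [{"barcode": "A1"}, {"barcode": "A2"}, {"barcode": "A3"}]
tests = [{"barcode": "A1"}, {"barcode": "A1"}, {"barcode": "A4"}]
anomalies = find_barcode_anomalies(sites, tests)
```

Any non-empty list in the result would be a reason to contact the team lead for that area.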
Before data can be understood and acted upon, it needs to be cleaned to ensure reliability. In this programme, data cleaning was conducted during the capture and understand phase according to the following parameters:
- Accuracy - Is all of the data in the same format?
- Completeness - Are all the results in?
- Timeliness - Is any data dated before collection started?
- Uniqueness - Is there any duplicate data?
- Consistency - Are there temporal or spatial inconsistencies in the data?
- Logic - Are there any logical flaws in the data?
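Some of these checks lend themselves to automation. The sketch below covers three of them (completeness, uniqueness, and timeliness) under stated assumptions: the field names, the `check_records` helper, and the start date are all hypothetical and not taken from the programme.

```python
from collections import Counter
from datetime import date

REQUIRED_FIELDS = ("sample_id", "collected_on", "ecoli")  # hypothetical field names


def check_records(records, collection_start):
    """Flag sample ids that fail the completeness, uniqueness or timeliness check."""
    issues = {"incomplete": [], "duplicate": [], "too_early": []}
    id_counts = Counter(r.get("sample_id") for r in records)
    for r in records:
        sample_id = r.get("sample_id")
        # Completeness: every required field must have a value.
        if any(r.get(field) is None for field in REQUIRED_FIELDS):
            issues["incomplete"].append(sample_id)
        # Uniqueness: report each repeated id once.
        if id_counts[sample_id] > 1 and sample_id not in issues["duplicate"]:
            issues["duplicate"].append(sample_id)
        # Timeliness: nothing should be dated before collection started.
        collected_on = r.get("collected_on")
        if collected_on is not None and collected_on < collection_start:
            issues["too_early"].append(sample_id)
    return issues


# Hypothetical records; the start date is a placeholder, not the real one
records = [
    {"sample_id": "s1", "collected_on": date(2018, 6, 1), "ecoli": 0},
    {"sample_id": "s2", "collected_on": date(2018, 6, 2), "ecoli": None},
    {"sample_id": "s3", "collected_on": date(2017, 12, 1), "ecoli": 12},
    {"sample_id": "s1", "collected_on": date(2018, 6, 3), "ecoli": 4},
]
report = check_records(records, collection_start=date(2018, 1, 1))
```

Checks on consistency and logic depend on the survey's own rules, so they are left out of this sketch.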
We also checked the data for outliers. For example, if there are a hundred pumps with an E. coli level of zero then they will influence the results - we decided to either remove them or take the mean without them in order to avoid skewing the data. The dashboards we set up in the capture phase were also used during data collection to track progress and steer where needed, as they were able to give a first impression of the results before the final analysis.
This programme is currently at the beginning of the analysis phase. Three types of analysis will be used to effectively drive decision making.
Descriptive analysis: Descriptive analysis is the first layer of information you can get from the data you’ve collected, for example, how many water sources are contaminated and how many people have contaminated water in their households?
Figure 3: E. coli risk levels as measured in the UNICEF wells
Figure 4: E. coli risk levels as measured at household level
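As a rough illustration of descriptive analysis, the sketch below groups E. coli counts into risk levels and reports the share of samples in each. The thresholds (counts per 100 mL: below 1, 1 to 10, 11 to 100, above 100) and the level names are an assumption based on a commonly used classification; the article does not state which bands were applied.

```python
from collections import Counter

RISK_LEVELS = ("low", "medium", "high", "very high")


def risk_level(ecoli_per_100ml):
    """Map an E. coli count per 100 mL to an assumed risk band."""
    if ecoli_per_100ml < 1:
        return "low"
    if ecoli_per_100ml <= 10:
        return "medium"
    if ecoli_per_100ml <= 100:
        return "high"
    return "very high"


def risk_shares(counts):
    """Return the fraction of samples that falls in each risk band."""
    tally = Counter(risk_level(c) for c in counts)
    return {level: tally[level] / len(counts) for level in RISK_LEVELS}


# Hypothetical household results, in E. coli counts per 100 mL
household_counts = [0, 0, 5, 50, 150, 300, 120, 400]
shares = risk_shares(household_counts)
```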
Diagnostic analysis: Going one step deeper into the water source data, we can use diagnostic analysis to see whether there is a relation between risk and E. coli using the risk association data and water quality data. This can result in insights such as “water sources within one mile of a farm are X% more likely to be contaminated than water sources within five miles of a farm.” At household level, we can see whether there’s a link between behaviour and contamination using the safe water journey data and the water quality data. This may result in an insight such as “If you store your water at ground level, contamination will increase by X%.” We can then steer the intervention based on those results.
Figure 5: E. coli risk levels per water source as reported by the households. As expected, unprotected sources and surface water result in the highest risk of E. coli. What is unexpected is the high level of E. coli in water piped into a dwelling.
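One simple way to arrive at an "X% more likely" statement is a relative risk: the contamination rate among sources exposed to a risk divided by the rate among those that are not. The sketch below assumes hypothetical `near_farm` and `contaminated` fields and made-up records; it is not the analysis used in the programme.

```python
def contamination_rate(records, exposed):
    """Share of contaminated sources among those with the given exposure."""
    group = [r for r in records if r["near_farm"] == exposed]
    if not group:
        raise ValueError("no records in this exposure group")
    return sum(1 for r in group if r["contaminated"]) / len(group)


def relative_risk(records):
    """Contamination rate near a farm divided by the rate away from one."""
    unexposed_rate = contamination_rate(records, exposed=False)
    if unexposed_rate == 0:
        raise ValueError("relative risk is undefined when the unexposed rate is zero")
    return contamination_rate(records, exposed=True) / unexposed_rate


# Hypothetical water source records
sources = [
    {"near_farm": True, "contaminated": True},
    {"near_farm": True, "contaminated": True},
    {"near_farm": True, "contaminated": True},
    {"near_farm": True, "contaminated": False},
    {"near_farm": False, "contaminated": True},
    {"near_farm": False, "contaminated": False},
    {"near_farm": False, "contaminated": False},
    {"near_farm": False, "contaminated": False},
]
rr = relative_risk(sources)
percent_more_likely = (rr - 1) * 100
```

A ratio like this shows an association only. Whether the farm causes the contamination still has to be judged against other factors, such as the type of water source.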
Experimental analysis: Using experimental analysis, we can see if there are other correlations in the data that weren’t in our hypothesis. For example, is there a connection between the source used and the situation in the household? Based on the results of this data, we can make recommendations which the ministry will also provide input for.
Figure 6: E. coli risk levels as measured at household level combined with reported handling of stool. As you might expect, stool left in the open is only found in combination with a high or very high risk of E. coli contamination. Otherwise, the influence of different ways of handling stool on water quality doesn’t seem that apparent in this overview. It is good, however, to keep in mind that the “very high risk” group contains more than 60 percent of the sample.
Using the insights from the understand phase, we can act upon the data to contribute to impact in four ways.
First of all, we will recommend water treatment at household level in the form of chlorination tablets provided by the ministry. While this isn’t a long term solution, it’s extremely effective in providing safe drinking water in the short term and reducing the risks associated with E. coli contamination. In the long term, behavioural change campaigns will be conducted in order to improve the safe water journey. These campaigns will involve numerous stakeholders in order to ensure citizen engagement, and will be tailored to the insights gleaned from the safe water journey data.
Secondly, data will be used to guide the maintenance and repairs of water sources so that those at risk of contamination are improved and protected.
Thirdly, data will be published to the WASH data portal, which means that other stakeholders can make use of the data, citizens can be informed, and similar programmes can make use of the lessons learned.
Finally, we will monitor the success of the programme using the same data journey methodology in order to ensure that the behavioural change campaigns are working.
Clearly define roles and responsibilities
In the design phase, it’s essential to define how the data will be used and by whom. Who is responsible for data analysis and visualisation and when? Is it the Ministry of Water Resources, UNICEF or Akvo, and what happens if that person is unavailable? Who is responsible for reporting? It is essential that a data to decision strategy is clearly defined at the beginning of the programme and that progress isn’t stalled by organisational issues, such as a change in personnel. Always have a plan B, C, and D!
Set up a robust communication platform and plan
Often, the lack of a clear communication platform and plan in the field can lead to wasted resources and logistical issues during data collection. We made sure that we had a plan set up - from WhatsApp groups to support staff - to avoid any of these pitfalls. However, we did come up against communication challenges between the partner organisations - Akvo, UNICEF and the ministry. Besides the direct lines of communication, who should be contacted if someone is unavailable for a longer period of time? Where and how are communications documented? Who is responsible for what? Setting up a clear plan for how things are communicated and by whom is part of defining clear roles and responsibilities and is essential in the Design phase.
Prepare for logistical issues
Some teams faced logistical issues during data collection, such as photometers that had to be recalibrated with standard solutions (which only had limited availability). Spare items were available in case photometers or sensors broke. However, when teams ran out of consumables, resupply took longer than expected, especially the shipment of reagents from Europe. In retrospect, it would have been more cost efficient to supply more reagents and standard solutions upfront for the included parameters.
Supported water quality parameters in our data platform (Akvo Caddisfly)
- Calcium hardness
- Electrical conductivity
- Suspended solids
- Total coliforms