Solving Challenges in Disease Surveillance and Data Quality Through the Use of LLMs (IntelSurv Project)
The response of Malawi to COVID-19 highlighted the significance of surveillance data in shaping effective disease control strategies. The current system involves gathering client data manually by district health workers using paper-based COVID-19 Case Based Surveillance (CBSR) forms. This information is consolidated in an Excel line list database submitted to the Public Health Institute of Malawi (PHIM). The shared data is further consolidated with other district data for public dissemination and disease response planning.
In 2020, before the pandemic escalated, the District Health Management Team (DHMT) facilitated the training of health workers in COVID-19 case management and response. The team included Environmental Health Officers, Laboratory Officers, Clinicians, Nurses, and Health Surveillance Assistants (HSAs), who were tasked with managing COVID-19 cases and carrying out surveillance. At that time, very little information was available, and the training was not comprehensive: it included no data management guidance and no in-depth surveillance training. Nevertheless, the rapid response team grew and played a vital role in patient care, collecting client data and entering it into the Case Based Surveillance forms. Despite these efforts, the data collection system faced numerous challenges. It was understaffed and was overwhelmed by the growing number of COVID-19 cases and of clients requiring testing and review.
To support the surveillance efforts, interns were incorporated into the Environmental Health team. They received on-the-job training in data collection and management, even though knowledge of how to use the data collection tools was itself limited. The teams worked tremendously hard to screen and manage COVID-19 cases in prisons, schools, hospitals, and other settings, while the understanding of the disease and the guidance issued about it constantly evolved. The DHMT made efforts to train health workers in COVID-19 case management using updated guidelines from WHO and the Ministry of Health, but there was no formal training on using the Case Based Surveillance form or on key data management principles.
The Lilongwe DHMT received concerns from PHIM about the quality of the data submitted and about the mismatches between national and district figures disseminated to the public, which occurred frequently during the pandemic. It was apparent that there were flaws in the data collection, management, and reporting process. The DHMT worked with the surveillance teams to address the issues raised. Still, manual data entry, lack of training, understaffing, and heavy workload remained critical problems that had to be resolved before the data issues could be fixed.
Throughout the pandemic, the challenges identified as potential causes of data discrepancies in the district were only assumed; there was never any data to support those assumptions or to warrant action. It wasn't until 2022, when the PEACH project led by Dr Amelia Taylor started exploring the national COVID-19 line list data, that this evidence gap was filled. The data discrepancies were numerous, including missing data, incomplete data, and incorrect terminology. We collaborated on a qualitative research project to investigate the data collection and management systems, and the reasons for data discrepancies, in Lilongwe and Blantyre districts. Our study findings highlighted the need to train health workers in the use of the Case Based Surveillance form and to develop a feedback mechanism for addressing data discrepancies in the line list.
The qualitative research findings led to the development of the Intelligent Surveillance Project (IntelSurv), which leverages large language models to develop a surveillance training application and an intelligent surveillance feedback system. This project is an excellent solution to the challenges districts have faced with data for many years, and its impact will be tremendous in any future pandemic. Health workers will have a quick and accessible tool that enables them to collect data correctly, and the feedback system will reduce the workload that the manual system places on them.
As a member of the District Health Management Team for Lilongwe and lead for COVID-19 and cholera case management, I am thrilled about this project and the transformative work it offers to districts. We have had a long journey through numerous data challenges, but the Intelligent Surveillance Project has shown us that there is hope of changing this by leveraging Artificial Intelligence.