Centers for Disease Control and Prevention
Past Issue

Vol. 10, No. 7
July 2004


Research

Alert Threshold Algorithms and Malaria Epidemic Detection

Hailay Desta Teklehaimanot,* Joel Schwartz,* Awash Teklehaimanot,† and Marc Lipsitch*
*Harvard School of Public Health, Boston, Massachusetts, USA; and †Columbia University, New York, New York, USA


Appendix

Data

Figure 1. Time series of normalized weekly average daily malaria cases for 10 districts...

Figure 2. Percent potentially preventable cases (PPC) by number of alerts per year from all districts for each alert threshold algorithm.

A team from the Ethiopian National Malaria Control Program visited different health facilities to look for complete epidemiologic data with consistent malaria case definitions. After the field trip, health facilities with relatively high-quality recording and information systems were chosen. Health personnel from each facility selected for the study and staff from the National Malaria Control Program were given training on compiling data on illness and death from the existing patient logs.

Raw data for each week (j) of each year (i) from each health facility (h) consist of the number of slides examined during that week (Ehij) and the number of those slides that were positive for Plasmodium falciparum (Chij). The original data were collected weekly on the basis of the Ethiopian calendar, which consists of 12 months, each with three 7-day weeks and one 9-day week (48 weeks), and a 49th week of 5 days. Weekly numbers of cases were normalized to a daily average by dividing raw case numbers by the number of days in the week.
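The normalization described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the position of the 9-day week at the end of each month is an assumption, since the text gives only the composition of the calendar.

```python
def days_in_ethiopian_week(j):
    """Number of days in week j (1..49) of the weekly scheme described above.

    Assumption: within each 30-day month the 9-day week is the fourth week.
    The source states only that each month has three 7-day weeks and one
    9-day week, followed by a 49th week of 5 days.
    """
    if not 1 <= j <= 49:
        raise ValueError("week of year must be in 1..49")
    if j == 49:
        return 5  # short final week
    return 9 if j % 4 == 0 else 7


def daily_average(weekly_count, j):
    """Normalize a raw weekly count to a daily average for week j."""
    return weekly_count / days_in_ethiopian_week(j)
```

With this layout the 49 weeks add up to 365 days, and 18 cases recorded in a 9-day week give the same daily average as 14 cases in a 7-day week.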

Notation

Each point in the dataset refers to some measure of malaria prevalence in a given health facility (h) during a given week. Data from a given health facility during a given week may be indexed by an absolute time (t), or by a year (i) and a week of the year (j=1…49), to facilitate comparisons of the same week across years.

Subscripts

h – health facility

t – time in weeks

i – year

j – week of the year, using the Ethiopian calendar

ft = number of days in week t

Weekly Data and Their Transformations

Eht – number of slides examined at health facility h in week t

Cht – number of positive slides ("cases") at health facility h in week t

Xht = Eht / ft – normalized daily average total slides examined for week t at health facility h

Yht = Cht / ft – normalized daily average positive slides for week t at health facility h

Lht = ln(Yht) – log-transformed cases for week t at health facility h

Sht – order-3 moving average of daily cases for week t at health facility h

Pht = 100%* Yht∕Xht – slide-positivity percentage for week t at health facility h

Zhj – number of data points (years) for week j in dataset from health facility h

nh – total number of weeks in dataset from health facility h

Threshold Definitions

Thijs – threshold value in health facility h during week j of year i, using sensitivity s

ΦhTs – number of alerts generated in health facility h by threshold type T with sensitivity s

Qphij – the pth percentile of Yh.j excluding year i

s – sensitivity of a threshold: number of standard deviations (SD), percentile cutoff, threshold for slide positivity, or log slope

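$\mu_{hij} = \left[\sum_{i' \neq i} Y_{hi'j}\right] / \left[Z_{hj} - 1\right]$, weekly mean of normalized cases for week j, from all years except i

$\sigma_{Y_{hij}} = \sqrt{\left[\sum_{i' \neq i} \left(Y_{hi'j} - \mu_{hij}\right)^2\right] / \left[Z_{hj} - 2\right]}$, weekly SD of normalized cases for week j, from all years except i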
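The leave-one-year-out mean and SD defined above can be sketched in Python as follows. The function name and the dictionary layout (year mapped to the normalized value for one facility and one week of the year) are assumptions made for illustration.

```python
import math


def loo_mean_sd(values_by_year, exclude_year):
    """Mean and SD of one week's values from all years except `exclude_year`.

    values_by_year: dict mapping year -> normalized cases (Y) for a single
    facility and a single week of the year (hypothetical layout).
    With Z years in total, Z - 1 values remain, so the SD divisor is Z - 2.
    """
    others = [y for year, y in values_by_year.items() if year != exclude_year]
    n = len(others)
    if n < 2:
        raise ValueError("need at least two other years to compute an SD")
    mean = sum(others) / n
    variance = sum((y - mean) ** 2 for y in others) / (n - 1)
    return mean, math.sqrt(variance)
```

Excluding the year under evaluation keeps an epidemic week from inflating its own threshold.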

Mathematical Details Used To Calculate Epidemic Detection Algorithms

Weekly Percentile

Threshold is exceeded when Yhij > Thij, where Thij = Qphij and Qphij is the pth percentile (p = 70, 75, 80, 85, 90, or 95) of observations from week j at facility h in years other than i.
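A minimal Python sketch of the weekly percentile rule follows. The source does not say which percentile definition was used, so linear interpolation between order statistics is an assumption here, and both function names are hypothetical.

```python
def percentile(values, p):
    """p-th percentile using linear interpolation between order statistics.

    Assumption: the interpolation rule is illustrative; the source does not
    specify how percentiles were computed.
    """
    xs = sorted(values)
    if not xs:
        raise ValueError("no observations")
    rank = (len(xs) - 1) * p / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (rank - lo)


def percentile_alert(values_by_year, year, p):
    """True if the value for `year` exceeds the p-th percentile of the
    same week's values from all other years."""
    others = [y for yr, y in values_by_year.items() if yr != year]
    return values_by_year[year] > percentile(others, p)
```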

Weekly Mean with SD

Threshold is exceeded when Yhij > Thij, where Thij = μhij + β σYhij and β = 0.5, 1.0, 1.5, 2.0, 2.5, or 3.0. A parallel definition is used for log-transformed (Lhij) and smoothed (Shij) data, using the corresponding means and SDs.

Normalized counts: the number of normalized weekly cases was used to derive the weekly mean and SD.

Smoothed normalized counts: To improve data smoothness, moving averages Shij were obtained (see Notation above) from normalized counts and used both to calculate mean and SD for the thresholds, and to compare against the thresholds. Weekly means and SD were calculated from the {Shij}.

Log-transformed series: To obtain data with reduced right skew, logged weekly counts Lhij were obtained (see Notation above) from normalized counts and used both to calculate mean and SD for thresholds and to compare against the thresholds. Weekly means and SD were calculated from the {Lhij}.
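The mean-plus-SD rule and its log-transformed variant can be sketched with one Python function that takes an optional transform. The function name and data layout are assumptions, and the sketch does not handle weeks with zero cases, whose logarithm is undefined; the source does not say how such weeks were treated.

```python
import math
import statistics


def mean_sd_alert(values_by_year, year, beta, transform=None):
    """True if the (optionally transformed) value for `year` exceeds
    mean + beta * SD of the same week's values from all other years.

    transform: e.g. math.log for the log-transformed series (raises
    ValueError on zero counts; handling of zeros is not specified in the
    source).
    """
    f = transform if transform is not None else (lambda y: y)
    others = [f(y) for yr, y in values_by_year.items() if yr != year]
    # statistics.stdev divides by (number of values - 1), i.e. Z - 2 here.
    threshold = statistics.mean(others) + beta * statistics.stdev(others)
    return f(values_by_year[year]) > threshold
```

For example, `mean_sd_alert(data, year, 2.0)` applies the rule to normalized counts, and `mean_sd_alert(data, year, 2.0, transform=math.log)` applies it to the logged series.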

Slide Positivity Percentage

Slide positivity percentage (Pht) was calculated for each week:

Pht = 100% * Yht ∕ Xht

Threshold is exceeded when Pht > z, where z = 30%, 35%, 40% …80%.
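Because Yht and Xht are both divided by the same number of days ft, the ratio Yht ∕ Xht equals Cht ∕ Eht, so the rule can be sketched directly from raw counts. The function name and the handling of weeks with no slides examined are assumptions.

```python
def slide_positivity_alert(examined, positive, z):
    """Return (P, alert) for one week, where P = 100 * positive / examined
    and alert is True when P exceeds the threshold z (in percent).

    Assumption: a week with no slides examined yields no alert.
    """
    if examined <= 0:
        return None, False
    p = 100.0 * positive / examined
    return p, p > z
```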

Slope of Weekly Cases on Log Scale

We defined a set of alert thresholds based on the slope (Mht) of the natural logarithm of the number of normalized cases: Mht = Lht – Lh,t–1

The threshold is exceeded when Mht > m, where m = 0.2, 0.3, 0.4, or 0.7, which correspond approximately to a 25%, 35%, 50%, or 100% increase relative to the previous week's number of cases.
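A slope m on the log scale corresponds to a relative increase of exp(m) – 1, which gives about 22%, 35%, 49%, and 101% for m = 0.2, 0.3, 0.4, and 0.7. The sketch below shows the rule and this conversion; the function names are hypothetical, and treating weeks with zero cases as producing no alert is an assumption.

```python
import math


def log_slope_alert(y_prev, y_curr, m):
    """True if ln(y_curr) - ln(y_prev) exceeds the slope threshold m.

    Assumption: if either week has zero cases the slope is undefined and
    no alert is raised.
    """
    if y_prev <= 0 or y_curr <= 0:
        return False
    return math.log(y_curr) - math.log(y_prev) > m


def slope_to_percent_increase(m):
    """Percent week-on-week increase equivalent to a log-scale slope m."""
    return 100.0 * (math.exp(m) - 1.0)
```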

Generating Potentially Prevented Cases

Potentially prevented cases (PPC) for each alert were defined as a function (q) of the number of cases in a defined window starting 2 weeks after the alert (to allow time to implement control measures). The window of effectiveness was assumed to last either 8 or 24 weeks (to account for control measures whose effects are of different durations). Since no control measure would be expected to abrogate malaria cases completely, we considered two possibilities for the number of cases in each week of the window that could be prevented: 1) cases in excess of the weekly mean: q1ij = max(0, Yij – μij), and 2) cases in excess of the weekly mean minus 1 SD: q2ij = max(0, Yij – [μij – σYij]), where Yij is the observed number of cases, and μij and σYij are the mean and SD for week j excluding year i. When the observed number of cases in a week is less than the weekly mean (in calculating q1) or the weekly mean minus the SD (in calculating q2), q is set to zero for that week. The PPC using a threshold type T with sensitivity s in the dataset from health facility h, using function qk, is written as:

1) $\mathrm{PPC}^1_{hTs} = \sum_{\varphi=1}^{\Phi_{hTs}} \sum_{(i,j) \in W_\varphi} \max\left(0,\; Y_{hij} - \mu_{hij}\right)$

2) $\mathrm{PPC}^2_{hTs} = \sum_{\varphi=1}^{\Phi_{hTs}} \sum_{(i,j) \in W_\varphi} \max\left(0,\; Y_{hij} - \left[\mu_{hij} - \sigma_{Y_{hij}}\right]\right)$,

where ΦhTs = number of alerts triggered by threshold type T with sensitivity s in the dataset from health facility h; φ indexes those alerts and t is the time of alert φ; and the inner sum runs over the weeks (i, j) in the window of alert φ (written here as W_φ), which starts 2 weeks after the alert turns on and lasts 8 or 24 weeks. Once an alert threshold triggered an alert at time t and a control measure was applied, we ignored alerts within the next 6-month period, until t+24, on the assumption that effective control measures would be taken and the risk for another epidemic soon would be minimal.
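The PPC calculation, including the 24-week period during which further alerts are ignored, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' Stata code: the function name and argument layout are hypothetical, the window is taken to cover weeks t + delay through t + delay + window – 1, and an alert exactly 24 weeks after the previous one is allowed.

```python
def potentially_prevented_cases(y, mu, sd, alert_flags, window=8, delay=2,
                                refractory=24, minus_sd=False):
    """Sum potentially prevented cases over all accepted alerts.

    y, mu, sd: per-week lists for one facility (observed normalized cases,
    leave-one-year-out weekly mean and SD), indexed by absolute week t.
    alert_flags: per-week booleans, True where a threshold is exceeded.
    minus_sd: False uses q1 = max(0, y - mu); True uses
    q2 = max(0, y - (mu - sd)).
    Returns (ppc, accepted_alert_weeks).
    """
    n = len(y)
    q = [max(0.0, y[t] - (mu[t] - sd[t] if minus_sd else mu[t]))
         for t in range(n)]
    ppc, accepted, next_allowed = 0.0, [], 0
    for t in range(n):
        if alert_flags[t] and t >= next_allowed:
            accepted.append(t)
            start = t + delay
            ppc += sum(q[start:start + window])  # truncated at end of data
            next_allowed = t + refractory        # ignore alerts before t+24
    return ppc, accepted
```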

For each value of each type of threshold at each health facility, the number of potentially prevented cases was transformed into a percentage by adding the potentially prevented cases for the alerts obtained and dividing this sum by the sum, over all weeks in the dataset, of the number of potentially prevented cases in that week. Let %PPChTs denote PPChTs expressed as a percentage, and let %PPCTs denote the mean of %PPChTs across the health facilities.

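1) $\%\mathrm{PPC}^1_{hTs} = 100\% \times \mathrm{PPC}^1_{hTs} \,/\, \sum_{t=1}^{n_h} q^1_{ht}$

2) $\%\mathrm{PPC}^2_{hTs} = 100\% \times \mathrm{PPC}^2_{hTs} \,/\, \sum_{t=1}^{n_h} q^2_{ht}$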

(Note: here the t and ij notations are used interchangeably.)
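The conversion to a percentage and the averaging across health facilities can be sketched as below. Function names are hypothetical, and returning zero when a facility has no excess cases at all is an assumption.

```python
def percent_ppc(ppc, weekly_excess):
    """PPC as a percentage of the total excess cases (sum of q over all
    weeks in the facility's dataset)."""
    total = sum(weekly_excess)
    if total == 0:
        return 0.0  # assumption: no excess cases means nothing to prevent
    return 100.0 * ppc / total


def mean_percent_ppc(percent_by_facility):
    """Mean of the facility-level percentages for one threshold setting."""
    return sum(percent_by_facility) / len(percent_by_facility)
```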

To compare the performance of dissimilar alert types on a single scale, a curve was plotted for each type of algorithm showing mean %PPC vs. average number of alerts triggered per year, with each point representing a particular threshold value.

Random Alert

To calculate the expected PPC for randomly timed alerts, the excess cases under the excess case definition (qk) for all weeks in the study were averaged to obtain an overall mean. For a window of a given length in weeks, PPCh represents the expected PPC for Φ randomly chosen alerts in the dataset from health facility h.
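One way to read this calculation is that each randomly timed alert is expected to prevent the overall weekly mean of q multiplied by the window length. The sketch below encodes that reading; it is an assumption, not a formula taken from the source, and it ignores windows truncated by the end of the dataset.

```python
def expected_random_ppc(weekly_excess, n_alerts, window):
    """Expected PPC for `n_alerts` randomly timed alerts.

    Assumption: expected PPC = number of alerts * window length * mean of
    the weekly excess cases q over all weeks in the dataset.
    """
    mean_q = sum(weekly_excess) / len(weekly_excess)
    return n_alerts * window * mean_q
```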

 

 

Annual Alert

To determine the optimal week, we calculated PPC for a policy of triggering an alert automatically during week j (j = 1, …, 49) every year, for a given window. The optimal week was the week j with the maximum value of PPC. The value of PPC corresponding to the optimal week simulated an "optimally timed" policy of annual interventions; thus, it represents one alert every year.

For each health facility h, PPC was computed for an alert triggered every year at week j, and the maximum of these values over j gives the PPC for the best possible week.
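The search for the best fixed week can be sketched as follows. The function name and input layout are hypothetical; the sketch assumes the weekly excess cases q are supplied as one equal-length list per year in chronological order, with the same 2-week delay used for threshold alerts, so that a window can run into the following year.

```python
def best_annual_alert_week(q_by_year, window=8, delay=2):
    """Return (best_week, ppc) for an alert triggered at the same week of
    every year, where best_week is 1-based.

    q_by_year: list of per-year lists of weekly excess cases q
    (chronological, all the same length; hypothetical layout).
    """
    weeks_per_year = len(q_by_year[0])
    flat = [q for year in q_by_year for q in year]
    best_week, best_ppc = None, float("-inf")
    for j in range(weeks_per_year):
        ppc = 0.0
        for i in range(len(q_by_year)):
            start = i * weeks_per_year + j + delay
            ppc += sum(flat[start:start + window])
        if ppc > best_ppc:  # ties keep the earliest week
            best_week, best_ppc = j + 1, ppc
    return best_week, best_ppc
```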

Optimally Timed Alerts

To calculate the expected PPC for optimally timed alerts, we followed a recursive procedure. First, we searched through all weeks in the data set and chose the single week on which an alert would have the maximum PPC under a given case definition (qk). Then, the process was repeated with the weeks that remained after "blocking" alerts for a period of 24 weeks before or after the first alert (since our algorithms were similarly constrained never to have two alerts <24 weeks apart). This process was repeated up to a total of 10 alerts for each site. This process approximates the optimal timing of alerts, although theoretically all possible combinations of a given number of alerts would have to be tried to ensure optimal timing. Since each site had a slightly different number of weeks in the dataset, a given number of alerts corresponded to slightly different frequencies in the different sites (hence, the horizontal scatter of points in Figure 2). The "optimal alert" points in Figure 2 were calculated by "binning" similar frequency values and averaging the %PPC across values in a bin. Percent PPC from all districts for a given number of random and optimally timed alerts and for the annual alert were calculated in analogous ways.
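The procedure described above is a greedy search, and a minimal Python sketch of it follows. The function name and arguments are hypothetical, the 2-week delay and window are applied as for threshold alerts, and blocking is implemented so that no two chosen alerts are fewer than 24 weeks apart.

```python
def greedy_optimal_alerts(weekly_excess, n_alerts, window=8, delay=2, gap=24):
    """Greedily choose up to `n_alerts` alert weeks that maximize PPC.

    weekly_excess: per-week excess cases q for one facility.
    Returns (sorted alert weeks, total PPC). This approximates the optimum;
    it does not try all combinations of alert weeks.
    """
    n = len(weekly_excess)
    # PPC gained by an alert at week t: sum of q over its window.
    gain = [sum(weekly_excess[t + delay:t + delay + window]) for t in range(n)]
    blocked = [False] * n
    chosen = []
    for _ in range(n_alerts):
        candidates = [t for t in range(n) if not blocked[t]]
        if not candidates:
            break
        best = max(candidates, key=lambda t: gain[t])  # earliest week on ties
        chosen.append(best)
        # Block every week fewer than `gap` weeks from the chosen alert.
        for t in range(max(0, best - gap + 1), min(n, best + gap)):
            blocked[t] = True
    return sorted(chosen), sum(gain[t] for t in chosen)
```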

Comparison of Use of Weekly versus Monthly Data

We also compared the efficiency of the weekly percentile method applied to weekly data vs. the same method applied to monthly data. For this purpose, weekly data were converted into monthly data and percentile-based alert thresholds were constructed. A similar procedure was then followed, except that an alert was triggered when the observed monthly value exceeded the threshold in any single month. For this comparison, we considered PPC formula q1, with a window of 8 weeks. A set of computer programs written in Stata to perform the methods presented in this article is available.

   
     
   
Address for correspondence: Hailay D Teklehaimanot, Department of Epidemiology, Harvard School of Public Health, 677 Huntington Ave, Boston, MA 02115, USA; fax: 617-566-7805; email: htekleha@hsph.harvard.edu


This page posted June 14, 2004
This page last reviewed July 21, 2004

Emerging Infectious Diseases Journal
National Center for Infectious Diseases
Centers for Disease Control and Prevention