|
July 19, 2005
A Monte Carlo simulation model was developed for the V. parahaemolyticus risk assessment to capture the variability and uncertainty of the description of the processes associated with V. parahaemolyticus densities in oysters, the effects of oyster harvesting, consumer consumption, and human response to the pathogen. This model is made up of biogenic and abiogenic factors. Abiogenic factors include environmental air, water temperatures, and storage times; and biogenic factors include predicted harvest behavior and amounts of oysters consumed. Within the model these factors are combined with growth rate, death rate, and dose-response models. The result is a probabilistic simulation predicting a distribution of baseline risk for each region/season and distributions of risks associated with mitigations.
The model simulations were implemented in @Risk (Palisade). All of the calculations were performed by the Monte Carlo method of resampling from specified input distributions and appropriately combining the sampled values to generate the corresponding output distributions. For each region and season, a total of 10,000 servings were simulated for combinations of 1,000 samples of the uncertainty parameters. Due to the relatively large number of servings consumed within each of the region and season combinations, the number of illnesses implied by the model was determined by the average risk per serving multiplied by the number of servings consumed. The appropriateness of this calculation follows from the Central Limit Theorem. The sum of a large sequence of independent Bernoulli random variables, representing simulated illness outcomes, will converge to the product of the number of variables in the sequence times the average risk of illness (i.e., as the number of variables in the sequence increases). This it true even when a sequence of random variables are not identically distributed, as is the case here due to differing levels of exposure and hence risk per serving.
For each iteration of the model the prediction of the density of pathogenic V. parahaemolyticus per gram of oyster tissue was determined at harvest by applying the estimated distribution of the percentage of V. parahaemolyticus that are pathogenic to the predicted distribution of total V. parahaemolyticus at the time of harvest and then evaluating changes in the density of pathogenic V. parahaemolyticus through the post-harvest module. Levels of total V. parahaemolyticus were also evaluated from time of harvest through the post-harvest module. This approach was adopted because total V. parahaemolyticus levels were necessary to implement the bound on the level at 6 log10, so that the comparable pathogenic levels would not be exceeded. Also, results from the FDA/ISSC retail study (FDA/ISSC, 2001), the only post-harvest/retail study available, provide levels of total V. parahaemolyticus, but not pathogenic V. parahaemolyticus for all regions. We used this study to validate the exposure assessment of our model by comparing levels of total V. parahaemolyticus found at retail with the model predicted levels.
Exposure distributions of predicted numbers of pathogenic V. parahaemolyticus ingested per serving were obtained by combining distributions describing the probabilistic variation of number and meat weight of oysters in a serving and the expected variation of the density of pathogenic V. parahaemolyticus per gram at the end of the Post-harvest process. Individual iterations of the model predicting the number of pathogenic V. parahaemolyticus consumed by an individual were used to calculate a risk of illness. @Risk keeps track of the value of each calculation of risk for the 10,000 iterations. When one simulation is completed summary statistics are available for the 10,000 calculations of risk under both baseline and mitigation scenarios.
The number of iterations was set high enough to allow for a range of all the variables to be run through the model. At this number of iterations the summary statistics (e.g., mean values) calculated for the risk of illness were found to converge during the simulation; meaning that, by the 10,000th iteration, these values were nearly constant. The Monte Carlo simulation error associated with this aspect of the simulation was determined to correspond to an average coefficient of variation of 0.2% up to ~5%. The precision was lowest when the mean dose (and risk) was low and it approached 0.2% for those regions and seasons that collectively account for >95% of the estimated annual illness burden.
The estimates of mean risk determined by the average of simulated illness outcomes for selected high risk region/seasons confirmed the appropriateness of just using the mean risk rather than directly simulating illness outcomes (as Bernoulli random variables). Thus, predicted numbers of illnesses were obtained by determining the mean risk and then calculating the associated number of illnesses as the product of this estimate and the number of servings (based on the NMFS landings statistics).
These simulations of 10,000 sampled exposures (and risks) were repeated 1,000 times with selected uncertainty parameters in order to evaluate their influence the model's output (risk). Parameters evaluated on this level of the simulation included the effect of likely year-to-year variation in the distributions of water temperatures, and the uncertainties associated with parameters such as the percentage of total V. parahaemolyticus which are pathogenic, the dose-response and the relative growth rate of V. parahaemolyticus in oysters versus broth cultures. A sample size of 1,000 was selected based on practical time constraints. The software selected for Monte Carlo simulations (@Risk) does not directly facilitate a fully nested two-dimensional (variability and uncertainty) simulation approach. Consequently, the uncertainty dimension of the simulation was conducted by performing simple random sampling of uncertainty parameters in Microsoft Excel per se and then calling a sequence of 1,000 @Risk simulations with the uncertainty parameters fixed at the values corresponding to each of the uncertainty samples obtained.
Simple random sampling is considerably less precise than Latin hypercube (or other types of stratified) sampling. The relative precision or Monte Carlo error of the mean simulation output with respect to uncertainty at the selected sample size of 1,000 was estimated to correspond to a coefficient of variation of ~3-4% of the nominal mean for each region and season combination. It was determined that the most significant source of this variation was due to the Monte Carlo error of the simple random sampling of the dose-response uncertainty. As a consequence of this degree of simulation error, calibration or "anchoring" of the model to CDC estimates of annual illness burden was accomplished by using "rejection" -sampling to obtain a single fixed representative sample of specified precision from the distribution of the dose-response uncertainty. A criteria of <0.1% relative difference between the sample versus the actual (population) mean was used. Thus, a fixed sample of 1,000 dose-response parameters satisfying this criteria was obtained by iteratively taking samples (of size 1,000) from the uncertainty distribution via simple random sampling and rejecting all but the 1st (collection of 1,000) satisfying the chosen criteria. After having obtained a representative sample of 1,000 dose-response parameters, the adjustment factor associated with food-matrix and pathogen-host effects was estimated by anchoring the model to be consistent with CDC estimates of annual illness burden. This was accomplished by running the model with different adjustment factors and then interpolating between the results to obtain a suitable estimate.
Although the model implementation fixes the samples of the uncertainty parameters rather than randomizing them on each model invocation, the effect of the Monte Carlo error is minimal with respect to both the identification of influential variables and the evaluation of effectiveness of mitigations.
A copy (CD-ROM) of the model is available. Fax request for the model to the CFSAN Outreach and Information Center at 1-877-366-3322. Additional information can be found on the spreadsheets:
The basic @RISK model showing each step as described below is shown in Figure A3-1. Figures A3-2 to A3-6 show how various parameters are used to derive levels of V. parahaemolyticus at different stages in the pathway. For example, Figure A3-5 shows how the mitigation strategies are incorporated into the model and Figure A3-6 shows how the model is adjusted to include intertidal parameters for the Pacific Northwest region as described in the Exposure Assessment and Appendix 5.
Figure A3-1. Spreadsheet Showing Each Step of the @RISK Vibrio
parahaemolyticus Risk Assessment Model
Figure A3-2. Screen Shot of @RISK Spreadsheet Showing How
the Levels of Pathogenic Vibrio parahaemolyticus at Harvest were Determined
Figure A3-3. Screen Shot of
@RISK Spreadsheet Showing How Vibrio parahaemolyticus Levels at First Refrigeration
were Derived
Figure A3-4. Screen Shot of @RISK Spreadsheet Showing How
Probability of Illness was Derived using Oyster Servings, Levels of Pathogenic Vibrio
parahaemolyticus Consumed and the Dose-Response Parameters
Figure A3-5. Screen Shot of @RISK Spreadsheet Providing an
Example of How the Effect of Mitigation on Levels of Pathogenic Vibrio
parahaemolyticus Per Serving was calculated
Figure A3-6. Screen Shot of @RISK
Spreadsheet Showing the Effect of Intertidal Harvesting on Levels of Vibrio
parahaemolyticus in Oysters in the Pacific
Northwest
The modeling approach adopted in the present assessment is similar in structure to that of other risk assessments, but has several unique aspects. Foremost, risk has been analyzed in terms of region and season to take proper account of differing harvest practices and climates. Second, the model that was developed is scalable in that it may be applied to finer levels of spatial and/or temporal resolution as data become available. Thirdly, the modeling approach has separated variability from uncertainty by identifying four key variables as uncertain, selecting values for these variables according to specific distributions, and then simulating the effects of variability parameters for all randomly selected values of the uncertainty parameters. In this manner, parameters that represent variability in the model are not mixed with parameters that are uncertain. However, parameters like water temperature can represent uncertainty as well as variability. This separation has allowed for an estimate of the reduction in the overall uncertainty of the analysis that would be gained if the uncertainty of individual variables were reduced. Other microbial risk assessments have separated variability from uncertainty; however, this risk assessment has investigated the gain in information that results from reduction in uncertainty of individual variables. Each of these points is discussed in turn.
The model developed here analyzes risk within categories defined by the four seasons and six primary harvesting regions (Northeast Atlantic, Mid-Atlantic, Gulf of Mexico [divided into 2 regions], and Pacific Northwest [divided into dredged harvesting and intertidal harvesting]) due to differing harvest practices and climates. The analysis could have subdivided further; however, the limitations of acquiring data with respect to a finer level of detail are such that the analysis was conducted at the specified regional level. Analyzing the regions separately allows for an assessment of mitigations that may be tailored to specific regions and seasons. The results may then be used in a subsequent cost benefit analysis. The principle limitation of this approach, which effectively segments the risk assessment into spatial and temporal groupings, is that the results are generally conditional to the a priori definitions of region and season. In particular, the selective application of sensitivity and importance analyses to predefined regions/seasons one at a time (i.e., to determine influential parameters and uncertainties) yields results that pertain to specific regions and seasons. Consequently, overarching and comparatively more influential effects may be obscured. In the present assessment, air/water temperatures are highly influential variables across region/season groupings and this is partially obscured by the necessity of presenting sensitivity analysis results for selected region/season combinations. The fact that air/water temperatures may be relatively homogeneous within some of the defined region/seasons in the present assessment, and hence relatively inconsequential in such a context, does not obviate the fact that there are wide (and important) variations in these parameters across regions and seasons.
The structure of the model is amenable to further subdivision of locality and season because it is scalable. Specifically, the model is structured to simulate the density of V. parahaemolyticus in oysters at selected steps from harvest to consumption as a function of environmental and industry-specific parameters (e.g., air/water temperatures, harvest practices) corresponding to locality and season. Given the existence of appropriate data, the model can be used to simulate this process for any appropriately defined harvest location and time frame. The selected level of spatial/temporal categorization (regional and seasonal) was determined by consideration of data availability; most specifically, the quantity and quality of data that could be obtained pertaining to air/water temperatures, harvest practices, V. parahaemolyticus prevalence, and shellfish landing information. Given that more detailed data on air/water temperatures is available from satellite observations a finer level of categorization (e.g., by state and/or by month) may be possible and/or other methodological approaches (e.g., harmonic regression) may be applicable to incorporate the effects of climate into further assessments without "segmenting" by region/season categories. However, the utility of such a level of detail in modeling of air/water temperature effects is mitigated by the fact that additional uncertainties may arise if the model is applied on a finer level of detail (e.g. harvesting areas) for which more refined data on industry-specific harvesting practices are missing or incomplete. The effects of such incomplete (or inaccurate) data on the results of the model have not been evaluated at this time, but an analysis of this type may be appropriate in the future if the model is to be applied on the State or shellfish harvesting area level using more refined temperature data available from satellite observations or other sources.
Variability and uncertainty have been separated in the analysis because this separation provides a more informative characterization. We distinguish between model inputs that are less well characterized because of lack of knowledge (uncertainty) and model inputs that are inherently heterogeneous (variable). A model input which is designated as heterogeneous is a parameter that is considered to be naturally variable, even when there is no uncertainty present. For example, the day-to-day water temperatures within each of the different regions and seasons are considered inherently variable and have been modeled as normal distributions with given means and standard deviations. At the same time, the relative growth rate of V. parahaemolyticus in laboratory broth cultures versus that in oysters has been characterized as an uncertainty. This uncertainty was specified to appropriately represent the present lack of knowledge as to the true growth rate versus temperature relationship in oysters and the uncertainty inherent in extrapolating from studies of the relationship in axenic culture. With additional study this uncertainty could be reduced. Variations in day-to-day water temperatures on the other hand can not be reduced by further study. The result of making a distinction between model inputs that are uncertain and model inputs that are variable is that the effect of reducing the uncertainty of each of the uncertainty variables can be assessed separately. Based on such an analysis, uncertainties can be prioritized in order to help identify research efforts that are most likely to help reduce the total uncertainty that has been identified in the risk assessment.
The model can be improved. At present, the modeling approach simulates individual exposures and risks with defined variability largely based on the relationship of total V. parahaemolyticus levels to air/water temperature and a random variation (within defined limits) of the percentage of total V. parahaemolyticus that are pathogenic. However, within region/season groupings the model is not temporal and thus the structure of the model does not allow for a complete and quantitative evaluation of the likely reduction in risk resulting from implementation of the FDA/ISSC V. parahaemolyticus interim control plan (adopted by the ISSC in July 1999 and revised in 2001). This is because, implicitly, the interim control plan operates at a finer level of spatial resolution (e.g. harvest areas) and is time-sensitive in the sense that there is prescribed closure to harvesting after measuring an unsafe level of V. parahaemolyticus and then re-opening once exposure levels have been demonstrated to have subsided. In order to develop an assessment model applicable to an evaluation of this control plan additional data and consequent restructuring of the present assessment would be needed. First the model would need to be scaled to the level of individual shellfish harvesting areas. To accomplish this, further data (e.g. water temperature) are needed with respect to the individual harvesting areas. Second, sensitivity and specificity characteristics of the pathogenic V. parahaemolyticus gene probe methodology used by the individual laboratories (doing the tests) are needed. Third, the model needs to be extended to encompass putative factors responsible for or affecting the rapidity by which pathogenic V. parahaemolyticus levels may change in specific areas and not in others. At present the model predictions are primarily based on temperature. Although variation of the percentage of total pathogenic V. parahaemolyticus has been incorporated in the assessment, this variation has not been modeled (or linked) to any environmental factor(s). The model might be improved by considering the rapidity of turnover of water in shellfish harvesting areas based on levels of freshwater flows, tide changes, wind direction, and depth of harvesting area if these environmental factors have an effect on persistence of pathogenic V. parahaemolyticus. It is, however, unknown at the present time how these factors may affect the persistence of pathogenic V. parahaemolyticus and hence this remains an area for future study.
If and when such model refinement is feasible, more sophisticated approaches to modeling of the data may be appropriate. For example, whereas normal distribution approximations of water/air temperatures were found to be sufficient at the regional/seasonal level, stochastic weather models (e.g., Richardson, 1981), which better represent skewness of temperature distributions as a consequence of precipitation patterns, may help facilitate a more unified approach that is not based on segmentation by season (and region).
This spreadsheet was used to generate the correlated statistics used in Appendix II, Spreadsheet 2.
This spreadsheet is separated from Spreadsheet 2 for simulation performance reasons. Select a region worksheet to see how the temperature statistics were generated.
Download a self-extracting *.exe file of Spreadsheet 1 (136 KB) *.
Running simulations with this spreadsheet requires the user to have the spreadsheet add-in "@Risk" installed in Excel (Excel is a product of Microsoft and @Risk is available from the Palisade Corporation www.palisade.com)**.
In a typical run, the cells F54, H54, I54, and J54 were selected as @Risk "outputs" and the simulation settings were set with iterations at 10,000 and simulations at 1000. New uncertainty parameters are brought in to the spreadsheet using the macro "TotalUncert". Add this macro to the simulations settings to be executed after a simulation run.
Download a self-extracting *.exe of Spreadsheet 2 (613 kB)*.
* Help Downloading and Using Compressed ZIP and EXE files. There is no other "Help Facility" associated with this computer model. There is no "Help Desk" available to call and ask for explanation, instruction, or assistance.
** If @Risk software is not loaded onto the computer, then numeric values will not appear in the cells of the Excel spreadsheet. The label #NAME? will appear in the cells of the spreadsheets. It will be of minimal value to run the computer model without @Risk already loaded onto the computer. @Risk is a product of the Palisade Corporation (www.palisade.com). Use of @Risk, or any other software product, in this risk model should not be construed as an endorsement of such products by the FDA.
Questions on the running of the spreadsheet may be directed to Mark Walderhaug at Mark.Walderhaug@fda.hhs.gov.