FDA/Center for Food Safety and Applied Nutrition
USDA/Food Safety and Inspection Service
Centers for Disease Control and Prevention
September 2003


Quantitative Assessment of Relative Risk to Public Health
from Foodborne Listeria monocytogenes
Among Selected Categories of Ready-to-Eat Foods

Table of Contents

Appendix 6: Software

(Also available in PDF)

 

This section contains a description of some of the software routines used in preparing the Listeria monocytogenes risk assessment.

The Dose Frequency Curve-Fitting Program

This routine takes a data set that contains historical records of an association between a continuous measure and the frequency of occurrence of a discrete event in a population. It is similar in operation to ParamFit, except that it is designed to be used with data sets that correlate a dose with an outcome, rather than a simple distribution. After fitting one of more models to the data, the parameters are written to a file that may be used with the DoseFrequency object. The parameter file may also be examined with the DoseFrequency plotting routine, or the parameter estimates may examined in Excel and used without the object.

Data

In order to proceed, the routine must be supplied with data in the proper format. There are two ways to do that.

The first is to supply a data file that is in the correct format. The "File Open" button may be used to browse for the file name. The file is not actually opened unless the "Data Edit" or "Run" buttons are selected. Alternatively, the file name, including the path, may be entered into the text box to the right of the "File Open" button.

Alternatively, data may be entered using the "Data Editor," which is started with the "Data Edit" button. If a file name has already been entered the Data Editor will open this data file. If it has not, then the "Editor" begins with no entries.

Choosing Models

The models used by the DoseFrequency curve-fitting routine have 1 to 3 components. The total number of models fit will be equal to the number of possible permutations of each of the three components selected.

The mandatory component is the primary dose-response function listed in the "Models-to-be-Fit" box. At least one model must be selected for the program to proceed. However, any combination may be selected. The curve-fitting routine will attempt to fit all models selected. A description of the models currently supported is given below.

Models currently supported by the DF curve-fitting program and object:
Model Name Parameters Equation for Frequency Given Dose
Beta Poisson alpha, beta 1 - ((1 + (dose / beta) alpha)
Logistic alpha, beta ealpha + beta * ln(dose) / (1 + ealpha + beta * ln(dose))
Exponential slope 1 - e -dose * slope
Gompertz - Log alpha, beta1 - e -e ^(alpha + (beta * ln(dose)))
Gompertz - Power alpha, beta, power1 - e -e ^ (alpha + (beta * (dose^power)))
Probit alpha, beta normal_cdf(alpha + beta*ln(dose))
Multihit gamma, kgamma_cdf(gamma*dose, k)
GammaWeibull alpha, beta, gamma 1 - (1 + (dosegamma/beta)-alpha)
Ln = natural log. cdf = cumulative distribution function. ^ = raised to the power of

Background Parameter

A parameter may be added to the model to accommodate other influences on the outcome of the causal event. There are three options:

The program will attempt to fit all options that are checked. For example, if all three boxes are checked, then the program will examine all three different options. At least one box must be checked.

If there are fewer than five data points, then a Background Parameter cannot be employed.

Threshold Parameter

If a threshold dose parameter is included in the model, and if the nominal dose is less than the threshold dose value, then the effective dose used to predict frequency is zero. If the nominal dose is greater than the threshold dose value, then the effective dose used to predict frequency is the nominal dose minus the threshold dose. The program attempts to fit all options that are checked. If both boxes are checked, then the program examines both options. At least one box must be checked. If there are fewer than five data points, then a "Threshold Parameter" cannot be employed.

Options

Selecting the "Options" button opens another dialog, which gives the user some additional choices regarding how the routine operates. These include choosing the goodness-of-fit measure, how the program weights models when creating a probability tree, and the initial estimates for each of the model parameters.

Bootstraps

In order to represent uncertainty arising from sampling error, dose measurement error, or the size of the exposed population in which illnesses are observed, multiple bootstraps may be performed. Sampling error, where the small sample of observed values is presumed to come from a much larger sample that is of interest, is represented by presuming a binomial distribution where the total set of values is infinitely large. The likelihood of a series of possible values for the actual frequency are computed by comparing the relative likelihood of generating the observed value. Dose and population size measurement error are sampled from distributions supplied with the data set. The total number of models fit equals the number of models selected times the number of bootstraps. While very large numbers are possible, the program has not been tested with more than 10,000 models.

Initial Parameter Estimates

The default initial parameter estimates that are used by the IMSL nonlinear regression program to produce an optimum fit may not work well for all data sets. The initial estimate for the primary functions can be changed in this dialog. Initial estimates for the threshold and background parameters cannot be changed at this time. Selecting "OK" results in retention of the new parameter estimate. Selecting "Cancel" will not.

Model Weighting

Even if bootstrapping is selected, the first bootstrap (the first set of models in the parameter file) always uses the original data.

Run

The routine begins by fitting curves when the "Run" button is selected. The "Parameter File" dialog appears when the routine is finished. A progress bar displays the percentage of the task that has been completed. However, unless bootstrapping is selected, the results are nearly instantaneous. Selecting the "Cancel" button causes the program to exit.

As the program fits the alternative models to the data set, it calculates a weight for each of the models that is used by the object to assign probabilities to each of the models. The weighting algorithm used by the program rewards models for goodness of fit, and penalizes for parameters. Moving the slider bar to the left increases the importance of producing a good fit, while moving it to the right emphasizes the use of fewer parameters.

When bootstraps are run, the weights are recalculated on a relative basis for each bootstrap, so that the total weight for each bootstrap is identical. This means if the routine is used solely to represent the uncertainty in the parameters for a given model (i. e., parameter uncertainty with no model uncertainty), then the model weighting algorithm has no effect.

ParamFit

ParamFit is a procedure for fitting a statistical distribution to a set of individual values for use in a subsequent Monte-Carlo simulation. It is similar in function to the routine included with Crystal Ball (Decisioneering) or BestFit, the add-on sold by Palisade as a companion to @Risk. The principle difference is that ParamFit is specifically intended for use in a population modeling exercise (e. g., public health). In this circumstance, the primary purpose of a distribution is to represent variability in the measured quantity among individuals in a population, rather than the uncertainty associated with the prediction of a single event. Under such circumstances, the uncertainty is associated with the distribution used to generalize the data and draw inferences about the population as a whole. The end product of ParamFit is an Excel function containing a list of plausible alternative models which may be used to draw an inference in a two-dimensional Monte-Carlo model. It was written in Excel Visual Basic for Applications and requires Excel 5.0 or later versions and the Toxfunct add-in.

Models and Common Names of Their Parameters

There are ten distributional models that may be employed for the purpose of describing the data and drawing inferences. Any models that are checked will be fit to the data. You may select only one model, all the models, or any subset. The following distributions (and the common names of their parameters) are supported by ParamFit.

Statistical Distributions - Common Names of the Parameters:

Beta: Parameter 1 = alpha, Parameter 2 = beta, Parameter 3 = minimum, Parameter 4= maximum.

Cauchy: Parameter 1 = median, Parameter 2 = beta.

Exponential: Parameter 1 = lambda.

Gamma: Parameter 1 = alpha, Parameter 2 = beta.

Logistic: Parameter 1 = mean, Parameter 2 = beta.

Lognormal: Parameter 1 = geometric mean, Parameter 2 = geometric standard deviation.

Normal: Parameter 1 = mean, Parameter 2 = standard deviation.

Rectangular function: Parameter 1 = minimum, Parameter 2 = maximum.

Triangular Function: Parameter 1= minimum, Parameter 2 = most frequent, Parameter 3 = maximum.

Weibull: Parameter 1 = alpha, Parameter 2 = beta.

Model Weighting Criteria

The frequency of use of each model is allocated according to it relative model weight which is calculated as follows:

Model Weight = (((1 + n / Pn) ^ O) * ((1 - gof) ^ H)

Where:

The specific optimization criteria for the L. monocytogenes concentration were:

Gof = Upper case Sigma (predicted - Observed)2 * n * concentration 0.25

Where the parameters for the goodness of fit were Predicted and Observed are the cumulative percentiles for a given concentration of L. monocytogenes, n is the number of samples in the report, concentration is in cfu/g. The model weight for the L. monocytogenes concentration = 1 / (pN * Gof 2). Where pN is the number of adjustable parameters in the distribution being fitted.



National Food Safety Programs | CFSAN Home | FSIS Home | CDC Home
www.FoodSafety.gov | Search/Subject Index | Disclaimers & Privacy Policy | Accessibility

Hypertext updated by dms 2005-OCT-08