GENPRES

by

Jim Hines
USGS, Patuxent Wildlife Research Center
Laurel, MD, 20708, USA
email:jhines@usgs.gov
www.mbr-pwrc.usgs.gov

This file last modified: <080808.1342>

This program simulates presence/absence data to be input to programs MARK or PRESENCE. It can be used to get an idea of how precise the estimates are for given sample effort or design, or the bias of estimates when heterogeneity exists. See the following papers for a description of the methods involved in estimating parameters from presence/absence data:

1. MacKenzie, D. I., J. D. Nichols, G. B. Lachman, S. Droege, J. A. Royle and C. A. Langtimm. 2002. Estimating site occupancy rates when detection probabilities are less than one. Ecology 83: 2248-2255.

2. MacKenzie, D. I., J. D. Nichols, J. E. Hines, M. G. Knutson and A. B. Franklin. Estimating site occupancy, colonization and local extinction probabilities when a species is not detected with certainty. (Submitted to Ecology)

3. Bailey, L. L., J. E. Hines, J. D. Nichols and D. I. MacKenzie. 2007. Sampling design trade-offs in occupancy studies with imperfect detection: examples and software. Ecological Applications 17: 281-290.

Definitions:

Single- and multi-season models

PSI : Initial occupancy rate (proportion of sites occupied)
P(i) : detection probability for survey i
EPS(i) : probability of species extinction from survey i to i+1 (= 1-PHI)
PHI(i) : probability of species survival from survey i to i+1 (= 1-EPS)
GAMMA(i) : probability of colonization just after survey i

Two-species models

PSI-A : Initial occupancy rate for species A (regardless of occupancy of species B)
PSI-B1 : Initial occupancy rate for species B, given occupancy of species A
PSI-B2 : Initial occupancy rate for species B, given non-occupancy of species A
pA : detection probability for species A, given only A present
pB : detection probability for species B, given only B present
rA : prob. detect only species A, given both species present
rB1 : prob. detect species B, given both species present and species A also detected
rB2 : prob. detect species B, given both species present and species A not detected

Multi-method models

Theta : Prob. species is locally available, given presence

Multi-state, single-season models

PSI1 : Initial occupancy rate (proportion of sites occupied)
PSI2 : Prob. site is in state 2, given occupancy
p1 : Detection prob., given site is in state 1
p2 : Detection prob., given site is in state 2
dlta : Prob. of identifying site as belonging to state 2, given it's in state 2

Multi-state, multi-season models

Psi : Vector of initial occupancy rates (indexed by state)
psi(i-j): Vector of subsequent occupancy rates (indexed by prev. state, subsequent state)
p(i-j) : Vector of detection probs (indexed by detected state, true state)

Multi-state, multi-season models (R, dlta parameterization)

Psi0 : Initial occupancy rate
R0 : Initial prob. of being in state 2
Cpsi0 : Vector of subsequent conditional occupancy rates (Pr(occ|prev.state=0))
Cpsi1 : Vector of subsequent conditional occupancy rates (Pr(occ|prev.state=1))
Cpsi2 : Vector of subsequent conditional occupancy rates (Pr(occ|prev.state=2))
CR0 : Vector of subsequent conditional occ2 rates (Pr(state2|prev.state=0))
CR1 : Vector of subsequent conditional occ2 rates (Pr(state2|prev.state=1))
CR2 : Vector of subsequent conditional occ2 rates (Pr(state2|prev.state=2))
p1 : Vector of detection probs, given site is in state 1
p2 : Vector of detection probs, given site is in state 2
dlta : Prob. of identifying site as belonging to state 2, given it's in state 2

Royle-Nichols models

Lambda : Population size (average abundance per site)
p : Species detection probability

Survey Design

Occupancy studies fall into two categories: (1) single-season, or (2) multi-season. In a single-season study, sites are surveyed multiple times over a short period of time to estimate detection probability and the proportion of sites that are occupied. In a multi-season study, sites are surveyed multiple times in each of two or more seasons, with an interval of time between seasons during which occupancy can change. These changes in occupancy are reflected by the values of GAMMA (species colonization rate) and EPS (species extinction rate).

Both survey designs can be implemented by GENPRES. Default input values are provided for the multi-season design, and the single-season design can be implemented by changing all values of EPS to 0.0 and all values of GAMMA to 0.0. In fact, the program determines which surveys belong to which season by examining the values of EPS. Whenever the value of EPS is 0.0, the succeeding survey is in the same season as the preceding survey. When EPS is greater than 0.0, the succeeding survey is the first survey of a new season.
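
The season rule described above can be sketched in a few lines of Python. This is only an illustration, not the code GENPRES actually uses; the function name assign_seasons and the input layout (one EPS value for each interval between consecutive surveys) are assumptions made for the example.

```python
def assign_seasons(eps):
    """Return a season number (starting at 1) for each survey.

    eps[i] is the extinction probability between survey i and survey i+1,
    so a study with K surveys has K-1 values (assumed layout).
    """
    seasons = [1]  # the first survey always opens season 1
    for value in eps:
        if value > 0.0:
            # EPS > 0: the next survey is the first survey of a new season
            seasons.append(seasons[-1] + 1)
        else:
            # EPS == 0: the next survey stays in the current season
            seasons.append(seasons[-1])
    return seasons


# Six surveys with a single season break after survey 3
print(assign_seasons([0.0, 0.0, 0.2, 0.0, 0.0]))
```

With every EPS value set to 0.0, all surveys fall in season 1, which is the single-season design.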

Modeling heterogeneity

This program can be used to examine the effects of heterogeneity on the estimates of occupancy, detection, or change in occupancy (EPS, GAMMA). Although not all individuals are expected to have exactly the same probability of occupying a site, this probability is assumed to be approximately equal for all individuals. If a sub-group of the sampled population has a substantially different probability of occupying the surveyed sites than the rest of the population, then the population is said to be heterogeneous. If the sub-group can be identified when observed, there is no problem: each sub-group is analyzed separately. If there is no way of identifying which group an observation belongs to, then the overall estimate of occupancy will be biased.

This situation can be easily modeled in the program. Simply specify the parameters for one of the sub-groups, then click the 'Add Group' button and enter the values for the other group.
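
As a rough illustration of this kind of bias, the sketch below pools the expected data from two groups that differ only in detection probability and then fits the constant model psi(.)p(.) to the pooled data. All input values (50 sites per group, p = 0.2 and 0.8, 5 surveys) are made up for the example, and the coarse grid search merely stands in for the numerical optimizer a real analysis program would use.

```python
from math import comb, log

K = 5                             # surveys per site
PSI = 0.75                        # true occupancy, shared by both groups
GROUPS = [(50, 0.2), (50, 0.8)]   # (number of sites, detection probability)


def expected_counts():
    """Expected number of sites with d = 0..K detections, pooled over groups."""
    counts = [0.0] * (K + 1)
    for n_sites, p in GROUPS:
        for d in range(K + 1):
            pr_d = comb(K, d) * p**d * (1 - p)**(K - d)
            counts[d] += n_sites * PSI * pr_d
        counts[0] += n_sites * (1 - PSI)  # unoccupied sites yield no detections
    return counts


def log_lik(psi, p, counts):
    """Log-likelihood of the constant model psi(.)p(.), up to a constant."""
    total = counts[0] * log(psi * (1 - p)**K + 1 - psi)
    for d in range(1, K + 1):
        total += counts[d] * (log(psi) + d * log(p) + (K - d) * log(1 - p))
    return total


def fit_constant_model(counts):
    """Grid search over psi and p in steps of 0.01 (illustrative only)."""
    grid = [i / 100 for i in range(1, 100)]
    return max(((psi, p) for psi in grid for p in grid),
               key=lambda pair: log_lik(pair[0], pair[1], counts))


counts = expected_counts()
psi_hat, p_hat = fit_constant_model(counts)
print(f"true psi = {PSI}, estimated psi = {psi_hat}, estimated p = {p_hat}")
```

In this made-up scenario the estimated occupancy falls below the true value of 0.75, because the constant model cannot account for the sites with low detection probability.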

Standard design vs Panel design

'Standard' design refers to the situation where each site is surveyed each season. In a 'panel' design, some sites are visited in some seasons, but not in others. The panel design might be used to cover a larger area at a smaller cost, or may result when a group of sites becomes inaccessible.

This situation can be handled by setting the detection probability to zero for a group of sites in a particular season(s). When data are generated, these sites will contain a '.' corresponding to the surveys when they were not visited.
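
A minimal sketch of this convention follows. It is not the GENPRES generator itself; the function name simulate_history and the single-season setting (no colonization or extinction) are assumptions made to keep the example short.

```python
import random


def simulate_history(psi, p, rng):
    """Simulate one site's detection history as a string of '1', '0' and '.'.

    psi : occupancy probability
    p   : list of detection probabilities, one per survey; a value of 0.0
          is taken to mean "site not visited" and is written as '.'
    rng : a random.Random instance
    """
    occupied = rng.random() < psi
    history = []
    for p_i in p:
        if p_i == 0.0:
            history.append('.')        # survey skipped for this group
        elif occupied and rng.random() < p_i:
            history.append('1')        # species present and detected
        else:
            history.append('0')        # visited, but no detection
    return ''.join(history)


rng = random.Random(1)
# A group visited on surveys 1-2 only, out of 4 surveys
for _ in range(3):
    print(simulate_history(0.75, [0.5, 0.5, 0.0, 0.0], rng))
```

Every history printed for this group ends in '..', marking the two surveys on which its sites were not visited.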

Example: Distribution of sampling effort across 4 bi-weekly survey periods

Design 1                 Design 2                 Design 3
______________________   ______________________   ______________________
Num                      Num                      Num
of                       of                       of
Sites   1   2   3   4    Sites   1   2   3   4    Sites   1   2   3   4
12     xx  --  --  --     6     xx  xx  xx  xx     6     xx  --  xx  --
12     --  xx  --  --     6     xx  --  --  --     6     --  xx  --  xx
12     --  --  xx  --     6     --  xx  --  --     6     xx  --  --  --
12     --  --  --  xx     6     --  --  xx  --     6     --  xx  --  --
                          6     --  --  --  xx     6     --  --  xx  --
                                                   6     --  --  --  xx
                                                   
Total    s=48 sites               s=30 sites               s=36 sites
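
The three designs can be compared by adding up sites and surveys. The short script below does this, reading each 'x' in the diagram as one survey (so 'xx' is two surveys within a period); that reading is an assumption based on the diagram, and the data layout is invented for the example.

```python
SURVEYS_PER_PERIOD = 2  # assumed: each 'xx' is two surveys in one period

# Each group is (number of sites, list of periods in which it is surveyed)
DESIGNS = {
    "Design 1": [(12, [1]), (12, [2]), (12, [3]), (12, [4])],
    "Design 2": [(6, [1, 2, 3, 4]), (6, [1]), (6, [2]), (6, [3]), (6, [4])],
    "Design 3": [(6, [1, 3]), (6, [2, 4]),
                 (6, [1]), (6, [2]), (6, [3]), (6, [4])],
}


def summarize(groups):
    """Return (total sites, total surveys) for one design."""
    sites = sum(n for n, _ in groups)
    surveys = sum(n * len(periods) * SURVEYS_PER_PERIOD
                  for n, periods in groups)
    return sites, surveys


for name, groups in DESIGNS.items():
    sites, surveys = summarize(groups)
    print(f"{name}: {sites} sites, {surveys} surveys")
```

Under that reading, all three designs use the same total effort (96 surveys) but spread it over 48, 30, or 36 sites.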

Program installation

Download the Windows setup program (genpres_setup.exe) and run it (double-click it in Windows Explorer). Data can be generated and analyzed with this program. Alternatively, program MARK can be used for data analysis. Program MARK can be downloaded (for free) from
http://www.cnr.colostate.edu/~gwhite/mark/mark.htm

Program operation

Once the program starts, a tabbed window appears with default values for each of the parameters (PSI, P(i)). To change a value, click on it and type in a new value. Changing the number of surveys adds or deletes columns for the survey-specific parameters.

The default scenario indicates that there are 100 sites which are visited a total of 5 times. Initial occupancy is 75% and detection probability is 50% for each survey.

To simulate heterogeneity among sites, create multiple groups with different occupancy (psi) or detection (p) probabilities. To test design methods (as in the example above), create groups with detection probabilities equal to zero for surveys which are skipped ('--' in the diagram), and the desired detection probability for surveys which are not skipped ('xx' in the diagram).

For example, to simulate 'Design 1' above, change the number of surveys to 8 and the number of sites to 12. Enter the desired occupancy probability, and the detection probabilities for surveys 1 and 2. Enter '0.' for the detection probabilities for surveys 3 through 8. Then, add a 2nd group and change the detection probabilities to 0. for surveys 1 and 2. Change the detection probabilities to the desired value for surveys 3 and 4, and leave the rest at 0. Add groups 3 and 4, changing the detection probabilities to 0. for surveys which are skipped for that group.
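
The table of values entered for 'Design 1' can be written out as follows. This is only a sketch of the input layout, not GENPRES code; the function name design1_detection and the detection probability of 0.5 are assumptions for the example.

```python
def design1_detection(p, n_groups=4, surveys_per_period=2):
    """Build one row of detection probabilities per group.

    Group g is surveyed only in period g, so its row holds p for that
    period's surveys and 0.0 (survey skipped) everywhere else.
    """
    n_surveys = n_groups * surveys_per_period
    rows = []
    for g in range(n_groups):
        row = [0.0] * n_surveys
        start = g * surveys_per_period
        for s in range(start, start + surveys_per_period):
            row[s] = p
        rows.append(row)
    return rows


for g, row in enumerate(design1_detection(0.5), start=1):
    print(f"group {g} (12 sites):", row)
```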

When the 'Analyze w/ expected values' button is clicked, data will be generated for this situation and analyzed with program MARK. The output from program MARK will appear in a new window. If you were to look at the input data file, you would see a sequence of '1's and '0's indicating detection (1) or non-detection (0) for each survey. The number following the detection history is the number of sites which had that exact detection history. (Although in the real world there cannot be fractions of a site, the expected number of sites can be a fraction depending on the input values.)
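
To show where fractional site counts come from, the sketch below computes the expected number of sites for every detection history in the default scenario (100 sites, 5 surveys, PSI = 0.75, p = 0.5). It covers only the single-season case with one group, and the function name expected_histories is an assumption for the example.

```python
from itertools import product


def expected_histories(n_sites, psi, p):
    """Expected number of sites showing each detection history.

    psi : occupancy probability
    p   : list of detection probabilities, one per survey
    """
    result = {}
    for history in product("01", repeat=len(p)):
        # Probability of this history at an occupied site
        pr_if_occupied = 1.0
        for outcome, p_i in zip(history, p):
            pr_if_occupied *= p_i if outcome == "1" else 1.0 - p_i
        pr = psi * pr_if_occupied
        if "1" not in history:
            pr += 1.0 - psi  # unoccupied sites always give all zeros
        result["".join(history)] = n_sites * pr
    return result


freq = expected_histories(100, 0.75, [0.5] * 5)
print("00000", freq["00000"])
print("10101", freq["10101"])
```

Here the all-zero history is expected at 27.34375 sites and every other history at 2.34375 sites, which sums to the 100 sites in the scenario.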

The parameter estimates from program MARK appear at the end of the output file (press Ctrl+End to jump to the end).

Running other models

Nine model-types are available in GENPRES; they are listed in the definitions section above. Select a different type by clicking the 'Model-type' menu and selecting the desired type.

There are several models available for each model type. Once a model-type is selected (for generation of data), a specific model must be chosen to analyze the data. This is done by selecting models under the 'Model' menu. You can generate the data by changing the parameter values in the input table, then choose the model to use to compute the estimates. This could be used to investigate the bias of the parameter estimates when data are generated with different values for each occasion, but analyzed with a model assuming constant values over time.

One of the last models in the 'Model' menu, 'user-defined', allows you to analyze the generated data with your own customized model (e.g., model PSI(.),p(T), where p(T) indicates that detection probabilities are constrained to a linear trend on the logit scale using the design matrix). Click 'Help' to see a sample of a model using the design matrix.
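
The idea behind p(T) can be sketched as follows: a two-column design matrix (intercept and survey number) is multiplied by two beta parameters, and the result is back-transformed from the logit scale. The coding of the trend column as 0, 1, 2, ... and the beta values are assumptions for the example; the sample under 'Help' shows the layout the program actually expects.

```python
from math import exp


def inverse_logit(x):
    """Map a value on the logit scale back to a probability."""
    return 1.0 / (1.0 + exp(-x))


def trend_detection(beta0, beta1, n_surveys):
    """Detection probabilities following a linear trend on the logit scale."""
    # One design-matrix row per survey: [intercept, trend covariate]
    design_matrix = [[1.0, float(t)] for t in range(n_surveys)]
    return [inverse_logit(row[0] * beta0 + row[1] * beta1)
            for row in design_matrix]


# Hypothetical betas: detection starts low and increases over 5 surveys
print([round(p, 3) for p in trend_detection(-1.0, 0.5, 5)])
```

Only two parameters are estimated for p under this model, regardless of the number of surveys.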

The menu choice 'Define model by name' allows you to specify models similar to the ones in the menu by entering a model name and letting GENPRES create the MARK input file based on whether '(.)' or '(t)' appears after each of the parameters Psi, Gam, Eps, or p.
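
The naming convention can be illustrated with a small parser. How GENPRES itself reads the model name is not documented here, so the function parse_model_name, the regular expression, and the example name are all hypothetical.

```python
import re

# Matches a parameter name followed by '(.)' or '(t)', e.g. 'Psi(.)' or 'p(t)'
TERM = re.compile(r"([A-Za-z]+)\(([.t])\)")


def parse_model_name(name):
    """Return {parameter: 'constant' or 'time'} for a model name string."""
    structure = {}
    for parameter, code in TERM.findall(name):
        structure[parameter] = "constant" if code == "." else "time"
    return structure


print(parse_model_name("Psi(.),Gam(.),Eps(.),p(t)"))
```

In this reading, '(.)' requests a single value shared by all surveys, and '(t)' requests a separate value for each survey.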

Options