The Global Food for Education
Pilot Program

February 2003

Report to the
United States Congress

Appendix 1: Survey Methodology

GFE Program Evaluation of
Private Voluntary Organizations:
Sample Design

Prepared by Stephen A. Kellogg, National Agricultural Statistics Service, U.S. Department of Agriculture


I. Introduction

Monitoring and evaluation are critical components of the U.S. Department of Agriculture’s (USDA) pilot Global Food for Education (GFE) program. In response to the need to monitor and evaluate implementation by private voluntary organizations (PVO’s), USDA’s Foreign Agricultural Service (FAS) asked its International Cooperation and Development (ICD) program area to hire qualified staff to manage and design a program to effectively accomplish this task. For statistical technical assistance with sampling and analysis, ICD asked USDA’s National Agricultural Statistics Service (NASS) to help design a plan to adequately fulfill this requirement.

The GFE plan (designed by NASS) relies heavily on survey design and statistical sampling to accomplish its objectives effectively within the limited resources. The general approach is to identify what estimate(s) of some characteristic(s) of the target population are required. For GFE, the strategy is to define the objective and design a methodology that will efficiently monitor the program performance by the PVO’s, and collect appropriate data for evaluation purposes to quantify program effectiveness.

To best accomplish the design objectives, the target population of interest is limited to the schools selected by the PVO’s participating in the GFE feeding program. This is the population about which all estimates will make inferences. Considering that GFE is a pilot program, non-GFE schools can be excluded as they represent a nonparticipating subset of the total population of schools.

FAS’ Export Credits program area is responsible for administering GFE through a series of agreements in partnership with a PVO. In some instances, a country may have several GFE programs, each with a separate and independent PVO. Due to the different and difficult nature associated with each participating country, each FAS agreement is unique.

The primary objective of a survey design is to specify the methodology used for making inferences about the target population. For GFE, the task is to specify a methodology that meets the monitoring and evaluation needs only. The first basic requirement is to treat each PVO project as a separate and unique population domain (entity) for sampling, analysis, and inference about population characteristics. By definition, a pilot program is usually reduced in scope and scale. Lessons learned from the pilot will be applied to subsequent program expansions for improved program execution. In the case of the pilot GFE, program execution is a dynamic situation with a steep learning curve. Each country/PVO agreement varies considerably, which naturally tends to increase management needs for USDA administration. In some cases, the original agreement required amending due to unforeseen conditions and/or circumstances. Taking all factors into consideration and based on the pilot nature of GFE, the design for monitoring and evaluation of each country/PVO project is best handled as a separate case study.

Consistent with the nature of any pilot program, the monitoring and evaluation component will be streamlined for each case study based on budgetary constraints as well as on the limited number of trained and experienced in-country field personnel and resources. Each case study requires creating a target population subset, designated the case study domain: the subset of schools participating in each country/PVO feeding program. The number of schools in this subset is referred to as the population size (Ni) for the ith agreement. Due to the limited monitoring and evaluation budget for each case study, the number of sample schools is limited to about 20 and will be referred to as the sample size (ni) for the ith country/PVO project.

For the case study’s small sample to be more statistically representative of the target population, the sample methodology design needs to specify selection of a sample of schools using a purposeful sample technique. This will avoid bias and ensure that the small number of schools selected in the sample is sufficiently representative as to allow the necessary inferences for measurement of program effectiveness. If and when GFE expands beyond the pilot stage, it will be important to allocate sufficient resources to an adequate number of sample schools to establish an efficient monitoring and evaluation program. With adequate resources supporting a fully operational GFE, the requirements and specifications for survey design will need to be more demanding to achieve an acceptable level of statistical confidence and precision. The results of the GFE pilot will be necessary and useful for determining a future adequate sample size for an operational program.


II. Sampling

Purposeful sampling becomes a powerful tool for the statistician and investigator when dealing with the small sample sizes typically associated with case studies. In the context of GFE methodology, and in conjunction with a stratification matrix strategy, a purposeful sample is basically a random sample of schools that is more representative of the target population than a totally random sample of the same small size. To further increase the efficiency of a purposeful sample, the stratification uses a matrix to sub-divide the target population (Ni). The matrix axes are the important target population characteristics, or "factors," chosen to control the variability inherent within those factors that affects the school feeding programs associated with GFE. The matrix approach facilitates selecting a purposeful sample of schools (nijk) from the cell in row j and column k of the sub-divided target population (Ni). The sum of the nijk sampled schools over all cells equals the total number of schools (ni) selected for monitoring and evaluation.

The use of the matrix approach for GFE sampling methodology is an effective mechanism to collect representative data objectively with a small sample in order to measure the program’s effectiveness. Matrix factors are the most important elements within each country/PVO project and have the potential to contribute differences in program effectiveness. It is anticipated that the matrices will differ considerably from project to project, even within a single country, when PVO’s have uniquely different feeding programs in different areas of the country. The matrix factors will be identified by the GFE regional coordinators during the initial phase of their work as the program, field staff, and participatory government/private agencies are fully defined.

Once the matrix factors are defined for each country/PVO project, each ijk-th target population school is systematically assigned to one and only one of the ijk-th cells of the matrix. The total count of schools in each cell, Nijk, becomes the sub-target population size. During the analysis phase for modeling purposes, the sub-target population count, Nijk, will be used as the model weights for the purpose of indicator calculation used to measure program performance.
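The assignment of schools to cells and the resulting cell counts can be sketched in a few lines of Python. This is only an illustration: the school records, the two factors (PTA and location), and the names `cell_of`, `cell_counts`, and `cell_weights` are hypothetical and do not come from the GFE data.

```python
from collections import Counter

# Hypothetical school records: (school id, PTA present?, location).
schools = [
    ("S01", "yes", "rural"),
    ("S02", "no", "urban"),
    ("S03", "no", "rural"),
    ("S04", "yes", "rural"),
    ("S05", "no", "urban"),
]

def cell_of(school):
    """Return the matrix cell (row, column) for one school record."""
    _, pta, location = school
    return (pta, location)

# Each school lands in exactly one cell; the counts play the role of Nijk.
cell_counts = Counter(cell_of(s) for s in schools)

# Express the counts as weights that sum to 1 for later use in estimation.
total = sum(cell_counts.values())
cell_weights = {cell: n / total for cell, n in cell_counts.items()}
```

Because `cell_of` returns a single key per school, the "one and only one cell" requirement is satisfied by construction.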

The sub-population count, Nijk, is used as the basis for allocation of the purposeful sample, nijk, within the matrix. Generally, the allocation will be proportional with a minimum of two schools selected from each matrix cell. The sample school selection process within each matrix cell will use systematic random sampling that requires the schools in each cell to be ranked and arrayed by student population size.
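A proportional allocation with a minimum of two schools per cell could look like the sketch below. The paper does not spell out an exact rounding rule, so the rule used here (start every cell at the minimum, then give each remaining sample school to the cell furthest below its proportional share) is an assumption, as are the function name `allocate` and the example cell counts.

```python
def allocate(cell_counts, total_sample=20, minimum=2):
    """Proportional allocation with a per-cell minimum (illustrative rule)."""
    # Start every cell at the minimum, but never above the cell population.
    alloc = {c: min(minimum, n) for c, n in cell_counts.items()}
    remaining = total_sample - sum(alloc.values())
    if remaining < 0:
        raise ValueError("too many cells for the requested sample size")
    population = sum(cell_counts.values())

    def shortfall(cell):
        # Distance between the proportional share and the current allocation.
        return total_sample * cell_counts[cell] / population - alloc[cell]

    # Hand out the remaining schools one at a time to the cell that is
    # furthest below its share and still has unsampled schools.
    for _ in range(remaining):
        open_cells = [c for c in cell_counts if alloc[c] < cell_counts[c]]
        if not open_cells:
            break
        alloc[max(open_cells, key=shortfall)] += 1
    return alloc

# Hypothetical cell populations, not taken from any GFE project.
example = allocate({"A": 10, "B": 30, "C": 60})
```

With very small cells the minimum of two cannot always be met, which is why the allocation is capped at the cell population.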

To determine whether the feeding programs achieved their program goals, the GFE methodology will use three measurement criteria (indicators): (1) enrollment, (2) attendance, and (3) performance. With operational programs, normally the target population variability will dictate the sample size necessary to achieve a certain precision of the estimate for the desired indicators. Statistically, the inherent target population variability can be controlled to a certain extent through stratification and classification factors for placing schools into groupings that are more homogeneous within groups than between groups or cells of the matrix.

When creating the matrix for each country/PVO project, the regional coordinators need to consider logical school groupings based on structure and environmental factors. Control of these factors is necessary because of the impact they can potentially have on program performance and success.


III. Background on WFP Sampling Methodology

The World Food Program (WFP) has prepared a paper that describes the approach used for calculation of the sample size for its School Feeding Baseline Survey. The WFP has based its survey design on a stratified simple random sample approach and will sample a total of 3,700 schools in 23 countries, or roughly 161 sample schools per country. The actual country sample sizes range from the smallest (60 schools) to the largest (388 schools). If one makes the assumption that the issues facing WFP in these 23 countries are not statistically different in school characteristics from those schools in countries participating in GFE, then one would expect that comparable sample sizes would be appropriate for GFE if USDA/FAS implements the same WFP survey design.

The FAS plan developed for the GFE monitoring and evaluation component is somewhat different from that implemented by WFP. Limited resources require tailoring the GFE survey design to produce comparable results more efficiently. The solution is to make each country and PVO a separate case study using an appropriate, purposeful sample of schools stratified using a matrix of factors to control the target population variability. WFP, by contrast, has chosen to use two different independent samples in its design: one for the Baseline Survey and a different sample of schools for its Follow-up Survey.

As with any start-up program, the WFP survey design sample methodology paper discusses the possibility that it may be necessary to adjust the Follow-up Survey sample size. As stated in its paper: "This can occur for instance when the indicators observed in the baseline survey showed different levels from those that were used when calculating the required sample sizes prior to the baseline survey. This would mean that the sample size used in the baseline survey would be too small to satisfy the precision requirements for the evaluation effort if used for the follow-up survey." Based on the proposed GFE case study design described in this paper, making sample size adjustments is not relevant.


IV. The GFE Approach to Evaluation of PVO Projects

The GFE monitoring and evaluation approach in this pilot program is limited by available resources. While the GFE methodology is statistically sound and defensible, limited resources require adoption of a plan using a small, purposeful sample size tailored to a case study design requiring more stringent controls on sampling frame construction.

Rather than using the WFP’s survey design based on two independent samples, the GFE case study approach requires that a simplified repeat visitation for the Follow-up Survey be completed for each of the Baseline Survey sample schools. The repeat-sample design approach eliminates inherent survey variability in the indicators due to differences by chance alone associated with the use of two independent samples of different schools used for the Baseline and Follow-up Surveys.
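The variance argument behind the repeat-sample design can be sketched with a standard textbook identity. It is not a formula given in this paper. It assumes simple random sampling of n schools and ignores finite population corrections, and the symbols are introduced here for illustration only: \bar{y}_B and \bar{y}_F are the baseline and follow-up sample means of an indicator, \sigma_B^2 and \sigma_F^2 are the school-level variances, and \rho is the correlation between a school's baseline and follow-up values.

```latex
% Two independent samples of n schools each (different schools):
\operatorname{Var}(\bar{y}_F - \bar{y}_B) = \frac{\sigma_B^2 + \sigma_F^2}{n}

% The same n schools revisited (matched observations):
\operatorname{Var}(\bar{y}_F - \bar{y}_B)
  = \frac{\sigma_B^2 + \sigma_F^2 - 2\rho\,\sigma_B\sigma_F}{n}
```

When a school's values are positively correlated over time (\rho > 0), the matched design yields a smaller variance for the estimated change than two independent samples of the same size.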

Details for implementation of a case study design for GFE will follow and build on the general discussion at the beginning of this paper. Detailed instructions will be developed as additional information becomes available for field supervision by the GFE regional coordinators. It is important to keep in mind that references to the WFP survey design are being used only as a basis for comparison, and such reference should not be considered in any way as making the WFP design a standard of comparison.


V. GFE Methodology Guidelines for PVO Evaluation/Monitoring

There are basic guidelines that should be established for a design of the GFE case study methodology and for determining the optimum purposeful sample strategies. The following line items summarize the best approach for examination and determination of each country’s critical design factors; i.e., they assess the as-yet-unknown varying conditions, infrastructure, and environmental issues.

  1. The WFP form template of questions is used as the basis for developing the data collection form. For countries where specific data is not applicable, the questions should be dropped from the form used in that country. At a minimum, enrollment and attendance data will be collected.

  2. Each GFE/PVO country project is unique and should be evaluated separately to determine the most efficient design and appropriate sample size.

  3. If a PVO has collected "baseline data," this information can be useful if identical information was obtained from each participating school. This information, however, is not the GFE baseline data needed for evaluation, which must be collected using questions derived from the WFP form template. This is because even when the same information is requested, asking for it with a slightly different question can produce a different response. Thus, it is important that WFP and USDA use the same form template and that the final questionnaire used in each country follow the questions exactly as they come off the form template. This is another reason that PVO baseline data cannot be used as the basis for GFE baseline data.

  4. PVO baseline data could be useful for "classifying" each of the participating schools in the GFE program for sampling and estimation purposes. WFP classified each school as either a "new" school or an "existing" school, the idea being that existing school enrollment would have already increased from some lower baseline prior to the school-feeding program. If the purpose is to entice enrollment, it would be problematic to compare a "new" school, which had no prior feeding program, with an "existing" school. The WFP strategy is to summarize these two groups separately so that the analysis of the new schools’ overall performance will be most advantageously reflected in the report.

  5. If resources are available for only a very small sample of program schools, the WFP suggested that more than the two classification criteria described in number 4 above be used, because a small sample will not provide the same level of precision that the WFP has targeted; i.e., measure change with a precision of 10-20 percent with a .05 level of confidence. WFP suggested using a matrix approach with additional classification criteria: pre-school, primary schools (using the official government definition), and boarding schools. This approach requires scrutinizing the PVO information on each school to determine the relevant classification criteria. These criteria would be used only if the information is available on every school in the program. Since the agreement signed with each PVO could have its own unique characteristics that could affect survey design and sampling, this process is required for each GFE/PVO project.

  6. Sample sizes and sample selection procedures should be determined on a country/project by country/project basis once the population counts are determined for each cell (Ni) in a country’s classification matrix.


VI. Detailed Discussion on GFE/PVO Project Sampling

Each country PVO project will have its own unique characteristics that will require tailoring the sampling design and data collection form to best accommodate the particular differences associated with each country PVO project. The basic data that must be collected relate specifically to the need to estimate the three measurement criteria (indicators). The WFP template needs to be scrutinized to ensure that only data that are needed are being collected and that the data collected will allow accurate estimation of the measurement criteria (indicators).

There are two general approaches to sampling: random selection, and purposeful selection. Generally, a random sample is used to make inferences about population characteristics and estimates of population totals, averages, ratios, etc. Purposeful samples are often used for expediency or to provide a cost-efficient indication of certain population characteristics, but will not produce unbiased estimates of population totals, averages, ratios, etc. For the purposes of GFE/PVO evaluation, a purposeful sample would accommodate the lower level of resources available for data collection, while providing a valid measure of change for the first two desired indicators. This is true when the survey design includes repeated sampling of identical observations to measure any possible change in population level of the desired indicators.

Random sampling is commonly used because it produces statistically sound population estimates. But a scientific basis does not guarantee that a random sample will produce unbiased, accurate, and precise estimates. One never knows whether an estimate from a random sample is accurate, but one can calculate a confidence interval that allows a statement to be made with regard to the degree one can be confident that the true population value will fall within a range of values with a certain level of probability. Random samples will generally be less efficient as the sample size decreases. The advantage of a purposeful sample is that it will give a statistically defensible estimate of percentage change when calculated using repeated sampling of matched observations; i.e., repeat visits to identical schools. If one takes two random samples at two different periods of time, the ability of results to measure the true change in the population over the time period between surveys can be problematic. While each survey will make an independent estimate of the population characteristic, one does not know for sure whether any difference in level between the survey estimates is a true population level change or a change due to the different sample elements that compose each independent sample. The strength of the repeat sample in measuring changes in population characteristics is the basis for its application with purposeful sampling under GFE.

To help facilitate the effectiveness of using a small sample, it is essential to consider a strategy to stratify or classify the population (N) into smaller and more homogeneous sub-populations using a matrix with X- and Y-axis criteria, so that each school in the population (N) is classified into one and only one of the matrix cells. Such a survey design allows valid inferences about the percent change in the desired indicators at the national level, as long as the proper weights are applied to the estimates for each classification criterion (cell in the matrix). The weights are calculated using the number of schools in each cell.
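One way the cell weights might enter a percent-change estimate is sketched below. The paper does not give the estimator, so this is an assumption: each cell's baseline and follow-up means, computed from the same matched schools, are expanded by the number of schools in that cell before the change is calculated. The function name, the data layout, and the example values are hypothetical.

```python
def weighted_percent_change(cells):
    """Percent change in an indicator, weighting cell means by cell size.

    cells maps a cell label to {"N": population count,
                                "pairs": [(baseline, followup), ...]},
    where each pair comes from one school visited in both surveys.
    """
    base_total = 0.0
    follow_total = 0.0
    for info in cells.values():
        pairs = info["pairs"]
        # Cell means from the matched (same-school) observations.
        base_mean = sum(b for b, _ in pairs) / len(pairs)
        follow_mean = sum(f for _, f in pairs) / len(pairs)
        # Expand each cell mean by the number of schools it represents.
        base_total += info["N"] * base_mean
        follow_total += info["N"] * follow_mean
    return 100.0 * (follow_total - base_total) / base_total

# Hypothetical enrollment figures for two cells.
change = weighted_percent_change({
    "rural": {"N": 30, "pairs": [(100, 110), (200, 220)]},
    "urban": {"N": 10, "pairs": [(400, 400), (600, 640)]},
})
```

Without the weights, the two sampled urban schools would count as heavily as the two rural ones even though they represent far fewer schools in this example.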

The following steps will be applied for each country/PVO project:

  1. Decide on the classification criteria and assign each of the total N schools participating in GFE into its appropriate Ni strata or cells. The number of classification criteria is understandably important. For example, WFP has deemed it necessary to use two classification criteria—existing and new schools.

  2. Select a sample of schools (n). The target sample size is a total of 20 schools. As a general rule of thumb, the minimum sample size per cell is two. Given that the schools in each cell are homogeneous, a general approach to allocating the total sample to cells is a proportional scheme based on the number of schools in each cell. If the number of samples is sufficient, then a random sample of ni schools (ni = two minimum) can be selected from each of the i strata or matrix cells. Depending on the type of classification data available and its quality, the schools in each stratum could be ranked so that a small sample would provide a more representative, purposeful sample. This decision will be made on a country/project-by-country/project basis. Both the representativeness of the purposeful sample and the accuracy of the indicators are contingent on careful selection of the ni schools and proper weighting of the summarized data.

  3. Tailor the form template for each country to collect the appropriate data needed and available for the baseline survey calculations for the ni sample schools. The decision to collect four months of data for specific data items was a decision by WFP to best estimate the baseline from which to measure future change, or measure the effectiveness of the program. The GFE/PVO methodology is to collect data for measurement of the baseline (first survey), and to resurvey the identically sampled schools and collect corresponding data (follow-up survey). WFP suggested selecting four months during the school year that reflected seasonal trends in attendance to best estimate the baseline. Likewise, those same four months of data will be collected during the school year under the program to allow accurate measurement of the feeding programs’ effect on the education program.

WFP also qualifies the baseline survey to encompass the last complete academic year. Traumatic effects in the country anytime during that last complete academic year can cause participation in the educational program to be uncharacteristic or atypical for that year and can cause problems with analysis and interpretation of the resulting indicators. The same holds true for traumatic events that might occur during the feeding program academic year. Collecting data for more than one prior full academic year for baseline purposes was discussed by WFP as a solution to tempering the effects that traumatic events can have on indicator analysis.
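The ranked selection mentioned in step 2 can be illustrated with a systematic sample drawn within a single cell. This is a minimal sketch, assuming a fractional sampling interval and a random start; the function name `systematic_sample`, the `rng` parameter, and the enrollment figures are hypothetical.

```python
import random

def systematic_sample(schools, n, rng=random):
    """Select n schools from one cell by systematic sampling.

    schools: list of (school_id, enrollment) tuples for a single cell.
    rng: the random module or a random.Random instance (for repeatability).
    """
    if not 0 < n <= len(schools):
        raise ValueError("n must be between 1 and the cell population")
    # Rank and array the schools by student population size.
    ranked = sorted(schools, key=lambda s: s[1])
    interval = len(ranked) / n          # sampling interval, may be fractional
    start = rng.random() * interval     # random start within the first interval
    last = len(ranked) - 1
    # Step through the ranked list; min() guards against float rounding.
    picks = [min(int(start + k * interval), last) for k in range(n)]
    return [ranked[i] for i in picks]

# Hypothetical cell of six schools with their enrollments.
cell = [("a", 120), ("b", 45), ("c", 300), ("d", 80), ("e", 210), ("f", 150)]
picked = systematic_sample(cell, 2, rng=random.Random(0))
```

Because the list is ordered by enrollment, a sample of two always includes one school from the smaller half of the cell and one from the larger half, which is how ranking makes a small sample more representative.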


VII. GFE Project Matrix Construction and Sampling

With the onset of project implementation, the regional coordinators investigated the conditions in GFE participating countries and other factors that might impact project effectiveness to develop a sampling matrix for unique classification of each participating school. Since each participating country’s project is unique and operates under different conditions, the matrices should be tailored differently to meet each country’s specific conditions and project needs. Generally, the regional coordinators tailored each matrix to obtain as much information about the schools’ program and operational characteristics as could be obtained from a sample of twenty schools. Due to the limited number of samples, it was important to reduce the number of identified factors to an absolute minimum to maintain the number of matrix cells at a reasonable number (10 or fewer).
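The limit of about 10 cells follows from the arithmetic of a 20-school sample with at least two schools per cell. The short sketch below checks a candidate matrix against that budget; the factor names and level labels are placeholders chosen for illustration, not definitions from any GFE project.

```python
from math import prod

def max_cells(total_sample=20, min_per_cell=2):
    """Largest number of cells that can each receive the minimum sample."""
    return total_sample // min_per_cell

def matrix_cells(factors):
    """Number of cells in a matrix built by crossing every factor level."""
    return prod(len(levels) for levels in factors.values())

# Hypothetical three-factor matrix with two levels per factor.
factors = {
    "pta": ["yes", "no"],
    "location": ["rural", "urban"],
    "vulnerability": ["higher", "lower"],  # level names are placeholders
}

cells = matrix_cells(factors)      # 2 x 2 x 2 cells
fits = cells <= max_cells()        # within the budget of 20 // 2 cells
```

Adding a fourth two-level factor would double the cell count and exceed the budget, which is why the number of factors has to be kept to a minimum.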

To illustrate the use of the matrix for sampling purposes, the following two examples will detail the process used by the regional coordinators to first create the matrix and then select the sample of schools:

1. The first example is Bosnia, where a complex set of factors was considered in the country for matrix creation. Due to recent armed conflict, one of the most important considerations was social vulnerability, which could potentially affect program implementation. Similarly, whether schools were rural or urban affects the ability of the PVO to effectively execute its feeding activities. Third, whether participating schools had a Parent-Teacher Association (PTA) was deemed an extremely important factor in the school’s ability to execute and support its programs. These three major factors were considered important at the onset when little information was readily available. The whole purpose of the matrix is to ensure that representative data will be collected from the small sample to determine the project’s effectiveness.

The matrix used for sampling the Bosnia/Catholic Relief Services (CRS) program schools is below. Within each cell of the matrix are two numbers. The first number is the population of schools or total number in the CRS feeding program classified with that cell’s characteristics. The second number is the number of sample schools selected from the total population for that cell. In all cases, some manner of random selection was used by the regional coordinators to select the actual sample schools from each cell.


Sample for Bosnia/CRS Project Schools

PTA                Rural   Urban   Rural   Urban   TOTAL

Yes     Popn           4       2      13      14      33
        n              1       1       3       3       8

No      Popn           7       7      25      34      73
        n              2       2       4       4      12

TOTAL   Popn          11       9      38      48     106
        n              3       3       7       7      20

Popn – Total schools participating in feeding program.
n – Number of samples.

The sample allocation within the matrix of participating schools was proportionate to the total number of schools in each cell. This method of sample allocation allows the analysis of the maximum amount of information with the least amount of resources expended on data collection.

2. A second example is the Vietnam Land O’ Lakes (LOL) project, which is more typical of a developing country program situation. The regional coordinator met with education ministry officials to discuss the details and characteristics associated with the educational system in Vietnam to identify the best factors to use for developing the classification matrix.

The major factor identified was the significant difference between the administration of "main" and "branch" schools. The main schools were further classified by Ho Chi Minh City proper and two other major provinces. These schools also have large enrollments. The administration of the feeding program could be different between large and small schools. To examine and analyze these differences, the main schools were further classified by their enrollment size; i.e., less than 360 students enrolled, versus 360 or more students.

Fewer students were enrolled in the branch schools in the more rural areas. The administration of the rural educational system was more uniform across the country, so a more definitive regional classification was not necessary. However, the size of rural schools was deemed important, and classification criteria based on enrollment were used to collect data for analysis of the project’s effectiveness in the rural economy compared to the more urban areas.

The matrix used for sampling the Vietnam LOL program schools is below. Within each cell of the matrix, the two numbers represent the same statistical characteristics as described for the Bosnia matrix.


Sample for Vietnam/LOL Main Schools

                               Main School Enrollment
City/Province                    < 360     ≥ 360     TOTAL

Ho Chi Minh City     Popn           15        16        31
                     n               2         3         5

Long An Province     Popn            8        14        22
                     n               2         2         4

Dong Thap Province   Popn           18        25        43
                     n               2         2         4

TOTAL                Popn           41        55        96
                     n               6         7        13

Popn – Total schools participating in feeding program.
n – Number of samples.


Sample for Vietnam/LOL Branch Schools

                               Branch School Enrollment
City/Province                     < 30     31-99     ≥ 100     TOTAL

Country Total        Popn           81       101        23       205
                     n               2         2         2         6

Popn – Total schools participating in feeding program.
n – Number of samples.

The random sampling methodology used in Vietnam was essentially the same as that for Bosnia. The major difference between the two methodologies is the size of the feeding program’s school populations. In Bosnia, the total number of schools was 106, and a sample of 20 schools represents nearly a 20 percent sampling rate. In Vietnam, the total number of main and branch schools was 301, for a 6.6 percent sampling rate. The power of the matrix sampling approach is evident in Vietnam, where a great deal of school characteristic data was collected with a limited number of sampled program schools.

The remaining GFE program country/PVO project matrices listed below are similar to the matrices for Bosnia and Vietnam.


Sample for Benin/CRS Project Schools

                                        School Districts
School Gender          Cobli    Materi    Copargo-Djougou    Pehunco    Kerou    TOTAL

Girls        Popn         14        11                 12         14        9       60
             n             2         2                  2          2        2       10

Boys         Popn         14        11                 12         14        9       60
             n             2         2                  2          2        2       10

TOTAL        Popn         28        22                 24         28       18      120
             n             4         4                  4          4        4       20

Popn – Total schools participating in feeding program.
n – Number of samples.


Sample for Bolivia/Project Concern International (PCI) Project Schools

                        Rural        Rural        Urban        Urban
Location                New          Previous     New          Previous     TOTAL
                        Program      Program      Program      Program

Potosi         n          3            2            2            --            7

Oruro          n          5            --           --           --            5

Cochabamba     n          3            3            2            --            8

TOTAL          n         11            5            4            --           20

Popn – Total schools participating in feeding program.
n – Number of samples.


Sample for Congo/International Partnership for Human Development (IPHD) Project Schools

                              School Gender
City/Province              Girls      Boys     TOTAL

Pointe Noire     Popn        100       100       200
                 n             2         2         4

Brazzaville      Popn        300       300       600
                 n             2         2         4

Pool             Popn        120       120       240
                 n             2         2         4

Nairi            Popn         40        40        80
                 n             2         2         4

Deloise          Popn         40        40        80
                 n             2         2         4

TOTAL            Popn        600       600      1200
                 n            10        10        20

Popn – Total schools participating in feeding program.
n – Number of samples.


Sample for Eritrea/Mercy Corps Project Schools

                              Girls’ Enrollment
Location                 < 29%    30% - 39%     ≥ 40%     TOTAL

Highlands     Popn          21           45        24        90
              n              4            4         2        10

Lowlands      Popn           7           32        21        60
              n              4            4         2        10

TOTAL         Popn          28           77        45       150
              n              8            8         4        20

Popn – Total schools participating in feeding program.
n – Number of samples.


Sample for Georgia/International Orthodox Christian Charities (IOCC) Project Schools

                     East               West               South
School Size      Urban   Rural      Urban   Rural      Urban   Rural      TOTAL

Large                7      --          2       1          2       1         13

Small                2      --          2       1          2       1          8

TOTAL                9      --          4       2          4       2         21

Popn – Total schools participating in feeding program.
n – Number of samples.


Sample for Guatemala/WorldShare Project Schools

                  School Size
Location         ≤ 99     ≥ 100     TOTAL

Region 1            9         5        14

Region 2            1         1         2

Region 3            2         2         4

TOTAL              12         8        20

Popn – Total schools participating in feeding program.
n – Number of samples.


Sample for Honduras/CRS Project Schools

                            PRAF Bonus     PRAF Bonus     No PRAF Bonus    No PRAF Bonus
Teacher Location            Vehicle        No Vehicle     Vehicle          No Vehicle       TOTAL
                            Access         Access         Access           Access

Resident         Popn         10              6             12                1                29
                 n             3              2              4               --                 9

Non-Resident     Popn         14              4              5                4                27
                 n             5              2              2                2                11

TOTAL            Popn         24             10             17                5                56
                 n             8              4              6                2                20

Popn – Total schools participating in feeding program.
n – Number of samples.


Sample for Kyrgyzstan/Mercy Corps Project Schools

                          Location
Community              North     South     TOTAL

Bishkek      Popn         92        --        92
             n             3        --         3

Urban        Popn         52        81       133
             n             3         4         7

Rural        Popn        132       158       290
             n             5         5        10

TOTAL        Popn        276       239       515
             n            11         9        20

Popn – Total schools participating in feeding program.
n – Number of samples.


Sample for Lebanon/IOCC Project Schools

Region                            Boys     Girls      Both     TOTAL

Greater Beirut         Popn          4        11        18        33
                       n             2         2         6        10

South Lebanon/Bakaa    Popn          1         1        28        30
                       n             1         1         2         4

North Lebanon          Popn          2         3        20        25
                       n             2         2         2         6

TOTAL                  Popn          7        15        66        88
                       n             5         5        10        20

Popn – Total schools participating in feeding program.
n – Number of samples.


Sample for Moldova/IPHD Project Schools

                        Locality Size
Location               Large     Small     TOTAL

North        Popn          8         7        15
             n             3         3         6

Central      Popn         10         6        16
             n             3         4         7

South        Popn          9        10        19
             n             4         3         7

TOTAL        Popn         27        23        50
             n            10        10        20

Popn – Total schools participating in feeding program.
n – Number of samples.


Sample for Nicaragua/PCI Project Schools

                           Distance from School
Municipality             0 - 3 km.    3+ km.     TOTAL

Yali            Popn            32        14        46
                n                4         2         6

L. Concordia    Popn             2        21        23
                n               --         3         3

S. Rafael N.    Popn            21        11        32
                n                2         1         3

Pantasma        Popn            57        20        77
                n                6         2         8

TOTAL           Popn           112        66       178
                n               12         8        20

Popn – Total schools participating in feeding program.
n – Number of samples.


Sample for Uganda/Save the Children Project Schools

                         Predominant Economic Activity
Education System         Fishing   Farming     Mixed     TOTAL

Formal         Popn            7         7         3        17
               n               4         4         3        11

Non-Formal     Popn            4         5         1        10
               n               4         4         1         9

TOTAL          Popn           11        12         4        27
               n               8         8         4        20

Popn – Total schools participating in feeding program.
n – Number of samples.

 


 



Last modified: Monday, April 14, 2008 06:13:23 PM