Sampling Methodology

The sample for the Motor Gasoline Price Survey was drawn from a frame of approximately 115,000 retail gasoline outlets. The gasoline outlet frame was constructed by combining information purchased from a private commercial source with information contained in existing EIA petroleum product frames and surveys. Outlet names and zip codes were obtained from the private commercial data source. Additional information was obtained directly from companies selling retail gasoline to supplement the information on the frame. The individual frame outlets were mapped to counties using their zip codes. Using these county assignments, the outlets were then assigned to the published geographic areas as defined by the EPA program area or, for conventional gasoline areas, as defined by the Census Bureau’s Standard Metropolitan Statistical Areas (SMSA).
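
The zip-to-county-to-area mapping described above can be illustrated with a short sketch. The lookup tables, codes, and area names below are hypothetical placeholders, not EIA data, and the function name is an assumption made only for this example.

```python
# Hypothetical lookup tables; real frames would hold many thousands of entries.
ZIP_TO_COUNTY = {"20585": "11001", "77002": "48201"}
COUNTY_TO_AREA = {"11001": "Area A", "48201": "Area B"}


def assign_area(outlet):
    """Return a copy of the outlet record with county and published area added.

    An outlet whose zip code or county is missing from the lookup tables
    gets None, so it can be flagged for manual follow-up.
    """
    county = ZIP_TO_COUNTY.get(outlet["zip"])
    area = COUNTY_TO_AREA.get(county)
    return {**outlet, "county": county, "area": area}


outlets = [
    {"name": "Outlet 1", "zip": "20585"},
    {"name": "Outlet 2", "zip": "77002"},
    {"name": "Outlet 3", "zip": "00000"},  # zip code not in the lookup table
]
assigned = [assign_area(o) for o in outlets]
```

Outlets that cannot be mapped keep a value of None rather than being dropped, which mirrors the need to supplement the frame with additional information.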

The gasoline outlet sample is an area sample that consists of both an augmentation to, and a rotation of, the previous sample cycle of the gasoline survey, the EIA-878. The augmentation outlets were obtained by first sampling counties and then sampling outlets from the gasoline outlet frame within those counties, within each sampling cell[1]. Every county in the U.S. was assigned to its corresponding sampling cell. After the counties were assigned, the standard deviations of gasoline prices for these sampling cells were estimated using the prices from the previous sample of the gasoline survey. These standard deviations and the number of stations from the Census Bureau's County Business Patterns (CBP) were used to determine the required number of outlets to be sampled. The statistical technique used was the Chromy allocation algorithm, an iterative procedure that determines the number of units required for each sampling cell. To select the sample of outlets, counties within each sampling cell were sorted within states, and the required number of outlets was selected randomly from the outlet frame. Once the augmentation portion of the sample was obtained, standard deviations were re-estimated, combining the previous gasoline sample outlets and the newly sampled outlets. The Chromy algorithm was applied again to determine the revised sampling cell requirements. The previous sample’s outlets were then sub-sampled to ensure a self-weighting sample within each stratum, and allocations were satisfied by sampling half from each of the self-weighting sub-sample and the old sample.
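
To show how station counts and price standard deviations can drive an allocation, the sketch below uses Neyman allocation, a simpler textbook method. It is not the Chromy algorithm used for the survey, which is iterative and is not reproduced here. The cell names, counts, standard deviations, and the fixed total sample size are all made-up assumptions.

```python
def neyman_allocation(cells, total_sample):
    """Split total_sample across cells in proportion to N_h * S_h.

    cells maps a cell name to (station_count, price_std_dev).
    This is a stand-in for illustration, not the Chromy algorithm.
    """
    products = {name: count * std for name, (count, std) in cells.items()}
    total = sum(products.values())
    if total == 0:
        raise ValueError("at least one cell needs a nonzero count and deviation")
    # Rounding can make the allocations sum to slightly more or less
    # than total_sample; a production design would reconcile that.
    return {name: round(total_sample * p / total) for name, p in products.items()}


# Hypothetical sampling cells: (number of stations, price standard deviation)
cells = {
    "cell_a": (1000, 0.10),
    "cell_b": (500, 0.20),
    "cell_c": (4000, 0.05),
}
allocation = neyman_allocation(cells, total_sample=200)
```

Cells with more stations or more variable prices receive more of the sample, which is the same intuition behind using both inputs in the survey's allocation.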

To estimate average prices, sample weights were constructed based on each sampled outlet's number of pumps, a proxy for sales volume. These weights are applied each week to the reported outlet gasoline prices to obtain averages for the specific formulations, grades, and geographic areas. Weights used in aggregating grades, formulations, and geographic areas were derived using volume data from the EIA “Monthly Report of Prime Suppliers Sales of Petroleum Products Sold for Local Consumption” and demographic data from the Bureau of the Census and the Department of Transportation on population, number of gasoline stations, and number of vehicles.
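
A weighted average of this kind can be sketched as follows. The example assumes, for simplicity, that each outlet's weight is exactly its pump count; the text says only that the weights are based on the number of pumps, so the actual weight construction may differ. The prices and pump counts are invented.

```python
def weighted_average_price(reports):
    """Average the reported prices, weighting each outlet by its pump count."""
    total_weight = sum(r["pumps"] for r in reports)
    if total_weight == 0:
        raise ValueError("total weight must be positive")
    return sum(r["pumps"] * r["price"] for r in reports) / total_weight


# Hypothetical weekly reports for one grade, formulation, and area
reports = [
    {"pumps": 4, "price": 3.00},
    {"pumps": 8, "price": 3.30},
    {"pumps": 12, "price": 3.10},
]
average = weighted_average_price(reports)
```

With these figures the weighted average is (4 x 3.00 + 8 x 3.30 + 12 x 3.10) / 24 = 3.15, slightly above the unweighted mean of about 3.13 because the larger outlets report higher prices.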

Prior to the development of the outlet frame, only company-level data were available. Therefore, the previous samples required a two-phase sample design to select the outlets. In the first phase of the design, retail gasoline companies were selected with probability proportional to the total volume of gasoline they sold in each state, as reported in the EIA Monthly Petroleum Product Sales Report survey. The second phase, the selection of individual outlets from the selected companies, was performed using information obtained directly from the sampled companies during sample initiation. This design permitted the use of a simple average for estimating city and state gasoline prices, but required volume-weighted prices for more aggregated published areas with respect to geography, formulation, and grade. However, with the publication of additional city and state average prices, which increased the geographic detail to include nine states and ten cities, this design approach was insufficient and a redesign was required. Further details of this previous design are contained in a published paper that can be found at: http://www.eia.doe.gov/pub/oil_gas/petroleum/data_publications/weekly_on_highway_diesel_prices/current/html/2cycasr.htm.
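
The first phase of that earlier design, selection with probability proportional to volume, can be sketched as below. The example draws with replacement for simplicity, which is an assumption; the text does not say how the draws were made. Company names, volumes, and the function name are hypothetical.

```python
import random


def pps_sample(volumes, k, seed=0):
    """Draw k companies with replacement, each with probability
    proportional to its reported volume in the state."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    names = list(volumes)
    weights = [volumes[name] for name in names]
    return rng.choices(names, weights=weights, k=k)


# Hypothetical state-level gasoline volumes by company
state_volumes = {
    "Company A": 500.0,
    "Company B": 300.0,
    "Company C": 200.0,
    "Company D": 0.0,  # no reported volume, so it can never be drawn
}
selected = pps_sample(state_volumes, k=3)
```

Because large sellers are more likely to be drawn, a simple average of the sampled outlets' prices already reflects volume, which is consistent with the design's use of simple averages at the city and state level.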


[1] Sampling cells are the smallest basic geographical units formed by the boundaries of the geographic and formulation areas for which average prices are published.  Sampling cells are mutually exclusive and collectively exhaustive.
