Global Youth Tobacco Survey

Data Release Policy



Section 2—Data Collecting and Processing

Data Collecting

Agreement for National Data Collection

Before collecting data, all RCs must participate in a training workshop, organized by the WHO RO in collaboration with other partners, to ensure a common methodology and unified procedures. This training ensures continuity across the regions; consistency in sample design, selection procedures, and questionnaire development (keeping the core intact); and uniformity in field procedures for data collection.

Organization of the Training Workshop

  1. The ROs and CDC jointly set a date and name a place for the training workshop.

  2. The ROs in cooperation with CDC, and associate partners when applicable, arrange the logistics and timing for the regional GYTS training workshops.

  3. The ROs and CDC prepare all materials for the training, send school enrollment files and data needs to countries, and ask each country to prepare a short presentation for the workshop describing the country's educational system and its efforts to control tobacco use among youth.

  4. The ROs in collaboration with CDC, and associate partners when applicable, conduct the GYTS training workshop.

  5. CDC and the ROs (and associate partners when applicable) work with each country to develop the sample frame and design, and on all other issues concerning sample design and sample selection.

Collection of National Data

At the end of the training workshop, the RCs for each participating survey site or country will have a clear understanding of all issues concerning the GYTS implementation. The national government ensures that survey data collection is completed within six months after the training workshop. To ensure successful collection of national data, the following steps are taken:

  1. The RO and, when applicable, associate partners follow up with the countries on all budget issues.
  2. The RO, CDC and, when applicable, associate partners work with the countries to review their questionnaire.
  3. CDC and the RO collaborate with the countries on all issues pertaining to sample selection.
  4. CDC provides the countries and/or survey sites with survey supplies (e.g., answer sheets and header sheets).
  5. CDC provides ongoing technical assistance to the ROs and RCs during the implementation phase.

Data Processing

After completing the data collection phase of GYTS, the RCs send the survey forms (answer and header sheets and school and classroom level forms) to CDC for data processing. For each survey site, the sheets are optically scanned, a data file is prepared and edited, and survey weight adjustments are applied. CDC staff and the RC and RO interact throughout the cleaning and editing of data files. After the data file is completed, CDC produces 100+ weighted frequency tables and 100+ preferred tables. CDC drafts a one-page fact sheet highlighting the main GYTS findings. The final data file, tables, and fact sheet are sent to the corresponding RC via e-mail and as hard copy.

Data Definitions

Raw Data—Non-Tabulated

The survey is conducted among students in selected schools. Each student completes a questionnaire, with responses recorded as filled-in bubbles on answer sheets. CDC uses optical scanning hardware to extract data from these sheets. Scanned data files go through a data-cleaning process that includes matching the record length to the scanned format, reviewing faulty responses to an item (i.e., out of range or missing), and applying logic edits. Each data record is weight-adjusted for school, class, and student non-participation. Finally, all records are adjusted for grade and gender stratification.

Each questionnaire is represented by a single row of data that contains the responses to all questions. Additional identifiers on each row correspond to weight, STRATA, and PSU (primary sampling unit) (Fig. 1). Weight includes all final adjustments for sample selection, non-participation, and post-stratification. STRATA and PSU are based on the sample design. These rows of data are considered raw data.

Fig. 1. Example of a row of raw data

Responses from the questionnaire     WEIGHT   STRATA   PSU
abdbcefda...for all questions...     XX       YY       ZZ
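
The row layout in Fig. 1 and the first cleaning checks can be illustrated with a short sketch. It is not CDC's processing code: the whitespace-delimited layout, the set of valid response letters, the use of "." for a skipped question, and the names RawRecord and parse_raw_row are all assumptions made for illustration. Logic edits and weight adjustments are not shown.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed set of valid response letters; the real range for each
# question would come from the country's codebook.
VALID_CODES = set("abcdefgh")


@dataclass
class RawRecord:
    responses: List[Optional[str]]  # None marks a missing or out-of-range answer
    weight: float
    strata: str
    psu: str


def parse_raw_row(line: str, n_questions: int) -> RawRecord:
    """Parse one whitespace-delimited row: responses, WEIGHT, STRATA, PSU."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, found {len(fields)}")
    answers, weight, strata, psu = fields
    # Match the record length to the expected format.
    if len(answers) != n_questions:
        raise ValueError(
            f"expected {n_questions} responses, found {len(answers)}"
        )
    # Flag out-of-range or missing answers instead of guessing a value.
    responses = [c if c in VALID_CODES else None for c in answers]
    return RawRecord(responses, float(weight), strata, psu)


# Hypothetical row: "." stands for a skipped question, "z" is out of range.
record = parse_raw_row("abd.cezda 12.5 01 003", n_questions=9)
print(record.responses)
```

Marking faulty answers as None, instead of dropping the row, keeps the record's weight, STRATA, and PSU available for questions that were answered correctly.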

Tabulated Data

The raw data are used in calculating tabulated data. As part of the data processing for GYTS, CDC prepares two types of tables: (1) weighted frequency tables and (2) preferred tables.

The weighted frequency tables are produced as a separate table for each question in the country's questionnaire. Tabulations are reported for total participants, gender, and grade levels (Fig. 2). A codebook specific to each country's questionnaire lists all questions and all response categories for each question. Unweighted frequency counts are included for each response category of each question.

All GYTS questionnaires contain 56 core questions plus any questions added by the individual country. CDC also produces a set of preferred tables. This set translates each core question, according to historical classification and including cross-comparisons, into variables used as indicators to monitor tobacco activity within the country. An example of a preferred table entry is the translation of the question "Have you ever smoked cigarettes, even one or two puffs?" into the variable "ESMOKER." Each country receives documentation describing how each preferred variable is created. Together, the weighted frequency and preferred tables are the tabulated data.
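
As an illustration of how a core question might be translated into a preferred variable, the sketch below derives ESMOKER from a single response and computes a weighted percentage. The response coding ("a" for yes, "b" for no), the function names, and the exclusion of missing answers from the denominator are assumptions; the actual rules are in the documentation each country receives.

```python
from typing import Iterable, Optional, Tuple


def derive_esmoker(answer: Optional[str],
                   yes_code: str = "a",
                   no_code: str = "b") -> Optional[int]:
    """Return 1 for 'ever smoked', 0 for 'never smoked', None if unusable.

    The yes/no codes are hypothetical defaults, not the GYTS coding.
    """
    if answer == yes_code:
        return 1
    if answer == no_code:
        return 0
    return None  # missing or out-of-range response


def weighted_prevalence(pairs: Iterable[Tuple[Optional[int], float]]) -> Optional[float]:
    """Weighted percentage of records where the indicator equals 1.

    Records with a missing indicator are left out of the denominator
    (an assumption of this sketch).
    """
    pairs = list(pairs)
    denominator = sum(w for value, w in pairs if value is not None)
    if denominator == 0:
        return None
    numerator = sum(w for value, w in pairs if value == 1)
    return 100.0 * numerator / denominator


# Hypothetical (answer, weight) pairs for four students.
students = [("a", 2.0), ("b", 6.0), (None, 1.0), ("a", 2.0)]
indicator = [(derive_esmoker(answer), weight) for answer, weight in students]
esmoker_pct = weighted_prevalence(indicator)
print(esmoker_pct)
```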

Fig. 2. Sample of Weighted Frequency Table (% of current smokers)
Survey question: Where do you usually smoke? (Select only one response)

Response               Total    Male   Female   Grade 6   Grade 7   Grade 8
At Home                 27.7    20.9     32.2      43.3      26.6      20.5
At School                8.0     3.2     10.8       4.9      11.5       1.7
At friends' houses      26.3    25.2     26.4      21.3      21.4      38.3
At Social events         5.7     4.0      6.7       0.0       1.3      10.2
In Public spaces        17.5    27.1     12.0       7.4      23.9      19.6
  (e.g., parks, shopping centers, street corners)
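
A table like Fig. 2 can be approximated by summing record weights within each response category and dividing by the total weight of the group. The sketch below is a minimal illustration under stated assumptions: the record layout (dictionaries with "sex", "place", and "weight" keys), the function name weighted_percentages, and the exclusion of missing answers are hypothetical, and the sample data are invented. It produces point estimates only; variance estimation using STRATA and PSU is not shown.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def weighted_percentages(records: List[dict],
                         question: str,
                         group: Optional[str] = None) -> Dict[Tuple[str, str], float]:
    """Weighted percent of each response, overall or within a grouping variable.

    Returns a mapping of (group value, response) to percent, rounded to
    one decimal place. Records with a missing answer are skipped.
    """
    group_totals = defaultdict(float)  # group value -> sum of weights
    cell_totals = defaultdict(float)   # (group value, response) -> sum of weights
    for rec in records:
        answer = rec.get(question)
        if answer is None:
            continue
        g = rec[group] if group else "Total"
        group_totals[g] += rec["weight"]
        cell_totals[(g, answer)] += rec["weight"]
    return {
        key: round(100.0 * w / group_totals[key[0]], 1)
        for key, w in cell_totals.items()
    }


# Invented records for illustration only.
sample = [
    {"sex": "Male", "place": "At Home", "weight": 2.0},
    {"sex": "Male", "place": "At School", "weight": 1.0},
    {"sex": "Female", "place": "At Home", "weight": 1.0},
    {"sex": "Female", "place": "At Home", "weight": 3.0},
    {"sex": "Female", "place": None, "weight": 5.0},
]

total_column = weighted_percentages(sample, "place")
by_sex = weighted_percentages(sample, "place", group="sex")
print(total_column)
print(by_sex)
```

Calling the function once without a group and once per grouping variable mirrors the Total, gender, and grade columns of the table.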

Data Analysis

After the data have been collected and processed, the RO and CDC, in collaboration with associate partners, conduct data analysis workshops. Their purpose is to give country coordinators hands-on training so that they can analyze their data sets in depth. Workshop participants include RCs who have completed the survey and received their data files, as well as those who have implemented the survey. The workshops provide training in using Epi Info (free software that includes procedures for analyzing complex survey data) and in writing the country's report.


 

Page last reviewed 03/12/2007
Page last modified 03/12/2007