Global Youth Tobacco Survey
Data Release Policy
Section 2—Data Collecting and Processing
Data Collecting
Agreement for National Data Collection
Before collecting data, all RCs must participate in a training workshop,
organized by the WHO RO in collaboration with other partners, to ensure a common
methodology and unified procedures. This training assures continuity across
the regions; consistency in sample design, selection procedures, and questionnaire
development (ensuring the core remains intact); and uniformity in field procedures
for data collection.
Organization of the Training Workshop
- The ROs and CDC jointly set a date and name a place for the training workshop.
- The ROs, in cooperation with CDC and, when applicable, associate partners,
arrange the logistics and timing for the regional GYTS training workshops.
- The ROs and CDC prepare all materials for the training, send school enrollment
files and data needs to countries, and ask each country to prepare a short
presentation for the workshop describing the country's educational system
and its efforts to control tobacco use among youth.
- The ROs, in collaboration with CDC and, when applicable, associate partners,
conduct the GYTS training workshop.
- CDC and the ROs (and associate partners when applicable) work with each
country to develop the sample frame and design. CDC, in collaboration with the
ROs (and associate partners when applicable), works with countries on all issues
concerning the sample design and sample selection.
Collection of National Data
At the end of the training workshop, the RCs for each participating survey
site or country will have a clear understanding of all issues concerning the
GYTS implementation. The national government ensures that survey data collection
is completed within six months after the training workshop. To ensure successful
collection of national data, the following steps are taken:
- The RO and, when applicable, associate partners follow up with the countries
on all budget issues.
- The RO, CDC and, when applicable, associate partners work with the countries
to review their questionnaire.
- CDC and the RO collaborate with the countries on all issues pertaining
to sample selection.
- CDC provides the countries and/or survey sites with survey supplies (e.g.,
answer sheets and header sheets).
- CDC provides ongoing technical assistance to the ROs and RCs during the
implementation phase.
Data Processing
After completing the data collection phase of GYTS, the RCs send the survey
forms (answer and header sheets and school and classroom level forms) to CDC
for data processing. For each survey site, the sheets are optically scanned,
a data file is prepared and edited, and survey weight adjustments are applied.
CDC staff and the RC and RO interact throughout the cleaning and editing of
data files. After the data file is completed, CDC produces 100+ weighted frequency
tables and 100+ preferred tables. CDC drafts a one-page fact sheet highlighting
the main GYTS findings. The final data file, tables, and fact sheet are sent
to the corresponding RC via e-mail and as hard copy.
Data Definitions
Raw Data—Non-Tabulated
The survey is conducted among students in selected schools. Each student
completes a questionnaire with responses coded as filled-in bubbles on answer
sheets. CDC uses optical scanning hardware to extract data from these sheets.
Scanned data files proceed through a data-cleaning process that includes matching
the record length to the scanned format, reviewing faulty responses to an item (i.e.,
out of range or missing), and applying logic edits. Each data record is weight-adjusted
for school, class, and student non-participation. Finally, all records are adjusted
for grade and gender (post-stratification). Each questionnaire is represented
by a single row of data containing the responses to all questions. Additional
identifiers on each row correspond to weight, STRATA, and PSU (primary sampling
unit) (Fig. 1). Weight includes all final adjustments for sample selection,
non-participation, and post-stratification. STRATA and PSU are based on the
sample design. These rows of data are considered raw data.
Fig. 1. Example of a row of raw data
Responses from the questionnaire | WEIGHT | STRATA | PSU
abdbcefda...for all questions... | XX     | YY     | ZZ
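The row layout in Fig. 1 and the range check described above can be sketched in Python. This is a minimal illustration, not CDC's processing code: the comma-separated layout, the valid response letters "a" through "h", and the names RawRecord, parse_row, and faulty_items are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class RawRecord:
    responses: str  # one character per question (assumed encoding)
    weight: float   # final survey weight
    strata: str     # stratum identifier from the sample design
    psu: str        # primary sampling unit identifier


def parse_row(line: str, n_questions: int) -> RawRecord:
    """Split one comma-separated row into responses, WEIGHT, STRATA, and PSU."""
    responses, weight, strata, psu = line.rstrip("\n").split(",")
    # Record-length check: every question must have exactly one response slot.
    if len(responses) != n_questions:
        raise ValueError(f"expected {n_questions} responses, got {len(responses)}")
    return RawRecord(responses, float(weight), strata, psu)


def faulty_items(record: RawRecord, valid: str = "abcdefgh") -> list[int]:
    """Return 1-based question numbers whose response is out of range or missing."""
    return [i for i, r in enumerate(record.responses, start=1) if r not in valid]


# Made-up row: nine responses, the last one unreadable ("?").
record = parse_row("abdbcefd?,12.5,01,017", n_questions=9)
print(record.weight, record.strata, record.psu)  # 12.5 01 017
print(faulty_items(record))                      # [9]
```

Keeping STRATA and PSU as strings preserves leading zeros in the identifiers; only the weight is needed as a number.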
Tabulated Data
The raw data are used in calculating tabulated data. As part of the
data processing for GYTS, CDC prepares two types of tables: (1) weighted frequency
tables and (2) preferred tables. A separate weighted frequency table is produced
for each question in the country's questionnaire. Tabulations
are reported for all participants and by gender and grade level (Fig. 2). A codebook
specific to each country's questionnaire lists all questions
and all response categories for each question. Unweighted frequency counts are
included for each response category of each question. All GYTS questionnaires
contain 56 core questions plus any questions added by the individual country.
CDC produces a set of preferred tables. This set translates each core question,
according to historical classification and including cross-comparisons, into
variables used as indicators to monitor tobacco activity within the country.
An example of a preferred table entry is the translation of the question "Have
you ever smoked cigarettes, even one or two puffs?" into the variable "ESMOKER."
Each country receives documentation describing how each preferred variable is
created. Together, the weighted frequency and preferred tables are the tabulated
data.
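As a rough illustration of how a question response might be translated into a preferred variable such as "ESMOKER," the Python sketch below recodes one response into an indicator. The coding ("a" for yes, "b" for no) and the function name esmoker are assumptions for the example; the documentation each country receives defines the actual rule.

```python
from typing import Optional


def esmoker(response: str) -> Optional[int]:
    """Recode the ever-smoked question into an indicator.

    Assumed coding (hypothetical): "a" = yes, "b" = no.
    Returns 1 for an ever smoker, 0 for a never smoker,
    and None when the response is missing or out of range.
    """
    if response == "a":
        return 1
    if response == "b":
        return 0
    return None


# Made-up responses from five students; "?" stands for an unusable answer.
responses = ["a", "b", "b", "?", "a"]
print([esmoker(r) for r in responses])  # [1, 0, 0, None, 1]
```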
Fig. 2. Sample of Weighted Frequency Table (% of current smokers)
Survey question: Where do you usually smoke? (Select only one response)
Responses                                                        | Total | Male | Female | Grade 6 | Grade 7 | Grade 8
At home                                                          | 27.7  | 20.9 | 32.2   | 43.3    | 26.6    | 20.5
At school                                                        | 8.0   | 3.2  | 10.8   | 4.9     | 11.5    | 1.7
At friends' houses                                               | 26.3  | 25.2 | 26.4   | 21.3    | 21.4    | 38.3
At social events                                                 | 5.7   | 4.0  | 6.7    | 0.0     | 1.3     | 10.2
In public spaces (e.g., parks, shopping centers, street corners) | 17.5  | 27.1 | 12.0   | 7.4     | 23.9    | 19.6
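The percentages in a weighted frequency table like Fig. 2 can be sketched as each response's share of the summed weights within a group. The Python example below is a minimal illustration under assumptions: the record fields ("response", "gender", "weight"), the function name weighted_percentages, and the sample records are invented for the example and are not GYTS data.

```python
from collections import defaultdict


def weighted_percentages(records, group_key=None):
    """Return {group: {response: percent}} using survey weights.

    records: iterable of dicts with "response", "weight", and any group fields.
    group_key: field to tabulate by (e.g., "gender"); None gives a single "Total" group.
    Records with a missing response (None) are left out of the denominator.
    """
    totals = defaultdict(float)
    cells = defaultdict(lambda: defaultdict(float))
    for rec in records:
        if rec["response"] is None:
            continue
        group = rec[group_key] if group_key else "Total"
        totals[group] += rec["weight"]
        cells[group][rec["response"]] += rec["weight"]
    return {
        group: {resp: round(100.0 * w / totals[group], 1) for resp, w in by_resp.items()}
        for group, by_resp in cells.items()
    }


# Made-up records for illustration only.
records = [
    {"response": "At home", "gender": "Male", "weight": 10.0},
    {"response": "At school", "gender": "Male", "weight": 30.0},
    {"response": "At home", "gender": "Female", "weight": 20.0},
    {"response": "At home", "gender": "Female", "weight": 20.0},
    {"response": None, "gender": "Female", "weight": 5.0},
]

print(weighted_percentages(records))  # {'Total': {'At home': 62.5, 'At school': 37.5}}
print(weighted_percentages(records, "gender"))
```

This sketch gives point estimates only. Standard errors for a complex sample also depend on STRATA and PSU, which is why survey-aware software is used for the full analysis.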
Data Analysis
After the data have been collected and processed, the ROs and CDC, in collaboration
with associate partners, conduct data analysis workshops. Their purpose is to
provide country coordinators with hands-on training to enable in-depth analysis
of their data sets. Workshop participants are RCs who have implemented the
survey, including those who have completed it and received their data files.
Data analysis workshops provide training in using Epi Info (free software
that includes procedures for analyzing complex survey data) and in writing the
country's report.
Page last reviewed 03/12/2007
Page last modified 03/12/2007