README File for Census 2000 Summary File 1 Delivered via FTP Note: We are unable to provide one-on-one support for applications of the data to specific spreadsheets or data base software. However, we do have detailed instructions on loading an ASCII file into Access97 at www.census.gov/support/SF1ASCII.html. On some systems, there may be a request for a password. If this happens, select cancel. You should immediately go to the site. About the FTP Application This FTP (File Transfer Protocol) application is intended for experienced users of census data, compressed files, and spreadsheet/database software. It provides quick access to data users, such as State Data Centers and news media, who need to begin their analysis immediately upon data release. Due to the size of the files, the FTP user should have a fast file transfer capability. Each state directory provides all files available for the identified state. Once uncompressed, the data are in a flat ASCII format. The geographic file is in a fixed-field format; the two data files are in comma delimited format. No software is provided. Users of the FTP application need to unzip the compressed file after downloading, then import it into the spreadsheet/database software of their choice for data analysis and table presentation. Other Sources of the Data The Census Bureau releases most Census 2000 data on a state-by-state basis. Tables generally are available in American FactFinder (factfinder.census.gov) the day of the release of the designated state file. Within American FactFinder, individual tables can be downloaded in a text delimited or comma delimited format. For users without immediate need for the data, CD-ROMs containing the data and access software are scheduled for shipping shortly after the state file release. They can be ordered from the Census Bureau's Customer Services Center at 301- 457-4100. FTP File Transfer The FTP directory for Summary File 1 (SF1) is at ftp2.census.gov/census_2000/datasets/Summary_File_1 . When the SF1 data are added to the respective state directories, there will be 40 files for each state-- a geographic header file and thirty-nine data files. See the chart below for more information on the data segments. To facilitate transferring multiple files, we suggest using features commonly found in most vendor's FTP utility. In the UNIX environment, the "mget" subcommand allows transferring multiple files using a wildcard character. For example, once you have navigated into the SF1 directory for Nebraska, you can download all 40 SF1 files with the following two ftp subcommands: ftp>prompt off (to avoid being asked for verification of each file, optional) ftp>mget ne* When testing the download in a PC environment, we used the ws_ftp product. This product, and many other FTP products developed for the PC environment, allows individual multiple file selection using the control key or block multiple file selection using the shift key. File Naming Conventions File naming conventions have changed since the release of the Redistricting data. The new convention is ss000yy_uf1.zip where ss is the USPS state abbreviation and yy is the number (01-39) of the file segment. The geoheader file name is ssgeo_uf1.zip . File Information Once uncompressed, these files are in flat ASCII format. The geographic header file (see below) contains fixed fields while the data files (File01 through File39, see below), including the geographic link fields, are in comma-delimited format. These files have been constructed in a UNIX environment. They use an ASCII linefeed, chr(10), to indicate a new record. For successful use with many programs running in a Windows environment, these files need to be modified to use the ASCII carriage return/linefeed sequence, chr(13) + chr(10) as a record terminator. This is an easy step in the UnZIP process using any UnZIP software which offers the conversion option. We tested PKZIP for Windows, version 4.00 following the steps outlined below. This PKZIP shareware can be downloaded from www.pkware.com. After installing PKZIP, do the following: --Select the file --Select the Extract option on the tool bar --Select the options button at the bottom of the Extract page --Under the Miscellaneous section, select the "DOS - convert to CR/LF" The resulting file will meet the ANSI MS-DOS/Windows standard used by Access 97 and other MS Windows-based programs. If the data are being processed in a UNIX environment, they can be unzipped using any standard ZIP/UnZIP package. These FTP data are available as compressed files at the 90% (approximately) file compression ratio. If you are using a modem/telephone line link to the Internet, we do not recommend using the FTP option. Segmented Data The data in the redistricting files and other Census 2000 summary files are segmented. This is done so that individual files will not have more than 255 fields, facilitating exporting into spreadsheet or database software. In short, to get the complete data set for SF1 files, users must FTP all forty files in the state directory. These test files contain: File Name Number of Data Items Starting Matrix Ending Matrix 01 222 P1 P5 02 238 P6 P18 03 236 P19 P33 04 149 P34 P45 05 245 P12A P12E 06 241 P12F P16I 07 234 P17A P27C 08 247 P27D P28E 09 244 P28F P30H 10 229 P30I P34I 11 180 P35A P35I 12 235 PCT1 PCT9 13 45 PCT10 PCT11 14 209 PCT12 PCT12 15 203 PCT13 PCT17 16 209 PCT12A PCT12A 17 209 PCT12B PCT12B 18 209 PCT12C PCT12C 19 209 PCT12D PCT12D 20 209 PCT12E PCT12E 21 209 PCT12F PCT12F 22 209 PCT12G PCT12G 23 209 PCT12H PCT12H 24 209 PCT12I PCT12I 25 209 PCT12J PCT12J 26 209 PCT12K PCT12K 27 209 PCT12L PCT12L 28 209 PCT12M PCT12M 29 209 PCT12N PCT12N 30 209 PCT12O PCT12O 31 245 PCT13A PCT13E 32 235 PCT13F PCT15C 33 228 PCT15D PCT17B 34 225 PCT17C PCT17E 35 225 PCT17F PCT17H 36 75 PCT17I PCT17I 37 217 H1 H20 38 207 H11A H15I 39 171 H16A H16I It is easiest to think of the file set as a logical file. However, this logical file consists of forty physical files: the geographic header file and file01- file39. This structure is a change from previous decennial census files. The explanation below for linking the summary file 1 files requires specific location information for the geographic header. These are located in chapter 7 of the technical documentation www.census.gov/prod/cen2000/doc/sf1.pdf . A unique logical record number (LOGRECNO in the geographic header) is assigned to all files for a specific geographic entity; all records for that entity can be linked together across files. Additional identifying fields are also carried over from the geographic header file to the table files. These are file identification (FILEID), state/U.S. abbreviation (STUSAB), characteristic iteration (CHARITER), characteristic iteration file sequence number (CIFSN). The geographic header record layout is identical across all electronic data products from Census 2000. Since the SF1 files are relatively simple, some of the fields, including some geographic header fields that appear in all forty files (geographic header, file01-file39) are not used. For example, the character iteration (CHARITER) field is only used in SF2/SF4. In SF1, it is always coded as 000. File Record Layout For a layout of the individual tables for each file, see www.census.gov/prod/cen2000/doc/sf1.pdf . Select Chapter 6, Summary Table Outlines. Spreadsheet and Data Base Aids We are unable to provide one-on-one support for applications of the data to specific spreadsheets or data base software. However, we do have detailed instructions on loading an ASCII file into Access97 at www.census.gov/support/SF1ASCII.html Estimated File Sizes These size estimates are for the total file package for SF1. State SF1 GeoHeader and File01-File39 unzipped zipped Alabama 1.7G 87M Alaska .3G 13M Arizona 1.5G 75M Arkansas 1.5G 75M California 5.1G 260M Colorado 1.5G 75M Connecticut .6G 28M Delaware .2G 10M District of Columbia .6G 30M Florida 3.4G 170M Georgia 2.3G 110M Hawaii .2G 10M Idaho .9G 45M Illinois 4.1G 209M Indiana 2.1G 108M Iowa 1.7G 86M Kansas 1.7G 88M Kentucky 1.1G 58M Louisiana 1.5G 77M Maine .5G 25K Maryland .9G 44M Massachusetts 1.2G 58M Michigan 2.7G 136M Minnesota 2G 105M Mississippi 1.4G 70M Missouri 2.5G 125M Montana .9G 45M Nebraska 1.4G 70M Nevada .7G 34M New Hampshire .3G 16M New Jersey 1.7G 82M New Mexico 1.4G 68M New York 3.6G 180M North Carolina 2.5G 123M North Dakota .9G 42M Ohio 2.8G 138K Oklahoma 1.8G 90M Oregon 1.4G 71M Pennsylvania 3.5G 174M Rhode Island .24G 12M South Carolina 1.5G 75M South Dakota .8G 42M Tennessee 1.9G 95M Texas 6.8G 340M Utah .8G 41M Vermont .25G 12M Virginia 1.5G 77M Washington 2G 100M West Virginia .9G 44.2M Wisconsin 2G 100M Wyoming .7G 32M Puerto Rico .8G 36M 1 This is the number in field CIFSN, beginning in position 17.