General Searching Instructions

Boolean Operators | Quotation Marks | Complex Queries with Multiple Boolean Operators
Truncation | Stopwords | Maximum Responses | Relevance Ranking and Document Score
Identification Codes | Document Size | 1,000-Point Documents | Query Reports

The information on this page will help you to understand the basic concepts involved in searching for documents on GPO Access. It contains general instructions, covering topics such as how to construct a query and how to interpret a results list.

For specific instructions on how to use a particular database, as well as sample searches, please consult the Search Tips for that database. Search Tips are available from the main search page for each database and from the GPO Access Databases page.

For information about file formats, see GPO Access File Formats.

Boolean Operators

Boolean operators (AND, OR, NOT, and ADJ) establish logical relationships among concepts expressed in a query. In other words, they are used to make searches more specific. The more specific your search is, the fewer extraneous hits you will receive.

Note: You do not have to capitalize the Boolean operators when you are using GPO Access via the World Wide Web. However, Boolean operators are capitalized on this and other help pages to distinguish their special function within a query.

AND: AND restricts a search to documents that contain every term it joins. For instance, the query weather AND navigation returns only the documents that contain both words.
OR: OR is the default operator for the WAIS server. Therefore, if no Boolean operators or quotation marks are used in a query consisting of more than one word, the query is treated as if OR were inserted between the words. The query transportation OR highway (or, alternatively, transportation highway) returns documents containing the word "transportation," the word "highway," or both. A higher relevance ranking is generally assigned to documents that contain both words.
NOT: NOT rejects documents that contain specified words. For example, the query education NOT secondary returns documents that contain the word "education" but do not contain the word "secondary."
ADJ: ADJ ensures that one word is followed by another in a document. It retrieves documents in which the query terms are immediately adjacent to one another, as well as those in which the second query term follows the first within 20 characters. For example, the query lead ADJ paint returns documents that contain the phrase "lead paint" or the word "paint" within 20 characters after the word "lead." The WAIS server will not change word order; therefore, lead ADJ paint is not the same as paint ADJ lead. A higher relevance ranking is generally assigned to documents that contain exact phrasing (i.e., documents in which the query terms are immediately adjacent to one another).
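
The four operators can be illustrated with a small Python sketch that tests a single document string. This is not the WAIS server's implementation: the function names are hypothetical, and the way the 20-character ADJ window is measured (from the end of the first word to the start of the second) is an assumption made for illustration.

```python
import re

def words(doc):
    """Return the set of lowercased words found in a document."""
    return set(re.findall(r"[a-z0-9']+", doc.lower()))

def match_and(doc, a, b):
    """AND: both terms must appear."""
    w = words(doc)
    return a.lower() in w and b.lower() in w

def match_or(doc, a, b):
    """OR: at least one term must appear."""
    w = words(doc)
    return a.lower() in w or b.lower() in w

def match_not(doc, a, b):
    """NOT: the first term must appear and the second must not."""
    w = words(doc)
    return a.lower() in w and b.lower() not in w

def match_adj(doc, first, second, window=20):
    """ADJ: `second` must follow `first`, in that order, with at most
    `window` characters between them (assumed measurement)."""
    pattern = (
        r"\b" + re.escape(first) + r"\b"
        + r".{0," + str(window) + r"}?"
        + r"\b" + re.escape(second) + r"\b"
    )
    return re.search(pattern, doc, re.IGNORECASE | re.DOTALL) is not None

print(match_adj("Remove lead-based type paint", "lead", "paint"))
print(match_adj("paint that contains lead", "lead", "paint"))
```

Because match_adj only looks for the second word after the first, swapping the arguments changes the result, which mirrors the point that lead ADJ paint is not the same as paint ADJ lead.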

Quotation Marks

Quotation marks have a function equivalent to that of the ADJ Boolean operator within a search query. Thus, the queries "Government Printing Office" and Government ADJ Printing ADJ Office return the same results. You may use quotation marks in combination with Boolean operators.

Complex Queries with Multiple Boolean Operators

Complex queries may be constructed with multiple Boolean operators. For clarity, parentheses should be used to group sections of the query and to ensure that the WAIS server parses the query as intended.

"Department of Education" AND ("bilingual education" OR "foreign language") AND (grants OR "cooperative agreements")

If the above example lacked parentheses, the WAIS server would process the phrases first, the AND operators next, and the OR operators last. The resulting query would be read by the server in the following manner:

("Department of Education" AND "bilingual education") OR ("foreign language" AND grants) OR "cooperative agreements"

A test of these searches in the 1998 Federal Register database retrieved 22 documents for the first query and 679 for the second, which demonstrates the importance of using parentheses for complex queries.
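
The grouping described above (phrases first, AND next, OR last) can be reproduced with a short Python sketch. It is an illustration only, not the server's parser: the function names are hypothetical, and it assumes a flat query in which every pair of terms is separated by an explicit AND or OR, with no parentheses, NOT, or ADJ.

```python
import re

def tokenize(query):
    """Split a query into quoted phrases, bare words, and operators."""
    return re.findall(r'"[^"]*"|\S+', query)

def show_grouping(query):
    """Add parentheses that show AND binding more tightly than OR."""
    or_groups = [[]]
    for token in tokenize(query):
        if token.upper() == "OR":
            or_groups.append([])          # OR starts a new top-level group
        elif token.upper() != "AND":
            or_groups[-1].append(token)   # terms joined by AND stay together
    parts = []
    for terms in or_groups:
        joined = " AND ".join(terms)
        parts.append(f"({joined})" if len(terms) > 1 else joined)
    return " OR ".join(parts)

query = ('"Department of Education" AND "bilingual education" OR '
         '"foreign language" AND grants OR "cooperative agreements"')
print(show_grouping(query))
```

Running the sketch on the example query without parentheses prints the same grouping shown above, which makes it easier to see why the two searches retrieve such different numbers of documents.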

Truncation

The asterisk (*) may be used to truncate words in a query in order to expand a search within a specified range. For example, a search for librar* will return documents that contain the words "library," "library's," "libraries," "librarian," etc. Using truncation saves you time by eliminating the need to perform separate searches for variations on a single word that differ only in their endings, or suffixes. When constructing a truncated query, include as many characters from the desired words (or phrases) as possible in order to reduce the number of irrelevant documents returned.

Note: Truncation may not be used for prefixes.
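
As a rough illustration of how a truncated term behaves, the Python sketch below treats a trailing asterisk as "match any word that starts with these characters." The function name is hypothetical, and rejecting an asterisk anywhere other than the end of the term is an assumption based on the note above.

```python
def matches_truncated(pattern, word):
    """Return True if `word` satisfies `pattern`; a trailing * truncates."""
    if "*" in pattern[:-1]:
        # Assumption: only suffix truncation is supported.
        raise ValueError("the asterisk is only supported at the end of a term")
    if pattern.endswith("*"):
        return word.lower().startswith(pattern[:-1].lower())
    return word.lower() == pattern.lower()

candidates = ["library", "library's", "libraries", "librarian", "liberty"]
hits = [w for w in candidates if matches_truncated("librar*", w)]
print(hits)
```

In this sketch librar* matches the four library-related words but not "liberty," whereas the shorter stem lib* would match all five. That is the reason for including as many characters as possible in a truncated query.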

Stopwords

Stopwords, such as "the" and "it," are words that occur so frequently in documents that they are not useful for distinguishing one document from another. Since they are not indexed, they cannot be used in searches. Therefore, stopwords that are included in queries are ignored by the system. For example, the query "National Council on Disability" returns the same documents as the queries "National Council Disability" and National ADJ Council ADJ Disability.

A comprehensive list of GPO Access stopwords follows:

all      been    have      i.e.  really  then       thus  when
also     could   having    into  she     there      to    where
an       did     hereby    is    should  thereby    too   whereby
and      do      herein    it    so      therefore  unto  which
any      does    hereof    its   some    therein    us    who
are      e.g.    hereon    me    such    thereof    very  whom
as       ever    hereto    nor   than    therewith  viz.  whose
at       from    herewith  not   that    these      was   why
be       had     him       on    the     they       we    would
because  hardly  his       or    their   this       were  you
been     has     however   our   them    those      what

Note: Occurrences of the words "and," "or," and "not" are processed by the WAIS server as Boolean operators. While they do operate as search terms, they are listed here as stopwords because of their special function.
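
The effect of stopwords on a phrase query can be sketched in Python as follows. The set below is only a small subset of the list above, the function names are hypothetical, and the sketch handles a single quoted phrase rather than a full query with Boolean operators.

```python
# A small subset of the stopword list, for illustration only.
STOPWORDS = {"the", "it", "on", "to", "is", "was", "this", "that"}

def strip_stopwords(phrase):
    """Drop stopwords from a phrase, keeping the remaining words in order."""
    return [w for w in phrase.split() if w.lower() not in STOPWORDS]

def as_adj_query(phrase):
    """Rewrite a quoted phrase as the equivalent chain of ADJ operators."""
    return " ADJ ".join(strip_stopwords(phrase))

print(as_adj_query("National Council on Disability"))
```

Here the stopword "on" is dropped, so the phrase reduces to National ADJ Council ADJ Disability, matching the equivalence described above.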

Maximum Responses

The maximum number of responses returned by a query is set to 50 by default. To retrieve a larger number of documents, you must change this setting. Every GPO Access search page provides a pull-down box in which you may raise the maximum number of returned documents, up to a limit of 200.

Generally, 50 responses should be adequate to retrieve the document for which you are searching. If you cannot find a desired document with the default 50 responses, you may want to try making your query more specific before expanding the number of returned documents. Keep in mind that, by increasing the number of documents to be retrieved, you are also increasing the time that it takes to return your search results.

Relevance Ranking and Document Score

Search results are displayed in an order that is determined by a system called relevance ranking. The most "relevant" document appears at the top of your results list with a score of 1,000; the least "relevant" appears at the bottom of the list with a score of one. As a general rule, document scores should decrease gradually from the top to the bottom of your results list. Typically, documents with a score of less than 500 are not very "relevant" to your search and are not worth retrieving, unless you are fairly certain of their contents and their significance to your topic.

"Relevance" is computed based on the following five factors:

Word weight is based on where your query term is located within a document. A word receives the highest rating if it appears in a headline or title. Within the text of a document, a word that appears in all capital letters or with its first letter capitalized receives a higher rating than one that appears in all lowercase letters.

Word density is based on a query term's frequency of occurrence within a document in relation to the size of the document. If two documents contain the same number of occurrences of a particular query term, the smaller document will receive the higher rating. For this reason, it is important to take file size into account when you compare the relevance ranking of documents.

Term weight is based on a query term's frequency of occurrence throughout all documents in a database. Words that occur infrequently throughout a database receive a higher rating than words that appear frequently. Very common words are either ignored or devalued in the scoring.

Phrase matching is based on the similarity between a query phrase and the corresponding phrase in a document. A document that contains a phrase that is identical to the query phrase receives the highest rating. For instance, if you were searching for "foreign import," documents with the phrase "foreign import" would generally have a higher relevance ranking than documents with the phrase "foreign trade import."

Proximity relationship is based on the proximity of query terms to one another within a document. Query terms that are located close together in a document receive a higher rating than those that are located farther apart. Remember that the use of quotation marks and/or the Boolean operator ADJ in your query will retrieve documents that contain the query terms within 20 characters of one another. Thus, a search for "lead paint" will retrieve documents that contain the phrases "lead paint," "lead-based paint," and "lead-based type paint."

Note: A separate score for each of these factors is not available for any given document.
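
The exact scoring formula is not given here, but the word density factor can be illustrated with a simple ratio. The Python sketch below is an assumption-laden example, not the server's calculation: the function name is hypothetical, document size is measured in words rather than bytes, and the other four factors are not modeled.

```python
def word_density(text, term):
    """Occurrences of `term` divided by the number of words in `text`."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return tokens.count(term.lower()) / len(tokens)

short_doc = "lead paint hazards"
long_doc = "lead paint hazards " + "and other unrelated text " * 10

# Both documents contain "lead" once, but the shorter one is denser.
print(word_density(short_doc, "lead") > word_density(long_doc, "lead"))
```

Both sample documents contain the term once, yet the smaller document has the higher density, which is the behavior described for the word density factor.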

Identification Codes

The WAIS server generates an identification code that is unique to each database on GPO Access. The sole purpose of this identification code is to identify the database from which a particular document is retrieved. You can find a database identification code to the left of a document's title in your search results list.

Although identification codes usually are not terms residing in the text of a document and, as a result, are not searchable, they are useful for differentiating among dates, sections of a single database, and, in the case of more complex searches, documents from multiple databases.

To learn more about the identification codes for a particular database, consult the Search Tips for that database.

Document Size

A document's size (in bytes) is listed below its title in your search results list. The size applies to the ASCII text file of that document. If another type of file, such as a PDF file, is available for the same document, it will typically be larger. Please keep this generalization in mind when you are attempting to download large documents.

1,000-Point Documents

In addition to identifying the document most "relevant" to your search, 1,000-point documents are used occasionally to alert you to structural database enhancements. Whenever two 1,000-point documents appear in your results list, the first is an online message from GPO and the second is the 1,000-point document that applies to your query. These messages, in the form of a returned document, may give the status of periodic database upgrades and enhancements; announce new databases, applications, and features; state when a database is expected to be back online; or supply other information deemed important for users of that database. They do not interfere with the results of your search.

Query Reports

A query report always appears as the final document with a score of one in your search results list. This document contains information on how your query was parsed, the fields and number of documents in the database you searched, the number of words in the database that conformed to your search request, the total number of relevant documents identified, and the speed of retrieval.
