FTP Weekly Bibliographic Raw Data from Patent Grants and Patent Application Publications

Go to FTP Server

Patent Grant Red Book Files (XML International Common Elements (ICE) v4.0 - Production) Beginning April 19, 2005

Beginning April 19, 2005 - This product contains the full text, including tables, sequence data and "in-line" mathematical expressions, of each patent issued. The file is a concatenation of Extensible Markup Language (XML) documents in accordance with the ST.36 US Patent Grant Document Type Definition (DTD) V4.0 (us-patent-grant-v40-2004-12-02.dtd). Sequence data XML text in accordance with the ICE SEQLST V1.2 DTD (us-sequence-listing-2004-03-09.dtd) is concatenated next to the containing grant XML text. The text references the following external files, but the files themselves are not included: mega sequence listing data files; Mathematica Notebook (NB) files; CS ChemDraw (CDX) and MDL Information Systems (MOL) files; and drawings, mathematical expressions, and chemical structures image (TIFF) files.

Patent Application Red Book Files (XML International Common Elements (ICE) v4.0 - Production) Beginning April 21, 2005, to date

Beginning April 21, 2005, to date - This product contains the full text, including tables, sequence data and "in-line" mathematical expressions, of each application published. The file is a concatenation of patent application Extensible Markup Language (XML) documents in accordance with the US Patent Application Document Type Definition (DTD) v4.0 (ICE) (us-patent-application-v40-2004-12-02.dtd). Sequence data XML text in accordance with the ICE SEQLST V1.2 DTD (us-sequence-listing-2004-03-09.dtd) is concatenated next to the containing application XML text. The text references the following external files, but the files themselves are not included: sequence listing data files; Mathematica Notebook (NB) files; CS ChemDraw (CDX) and MDL Information Systems (MOL) files; and drawings, mathematical expressions, and chemical structures image (TIFF) files.
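Because both ICE v4.0 products are described as concatenations of XML documents, a standard XML parser will typically reject the file as a whole and the documents need to be separated first. The following Python is a minimal sketch of one way to do that. It assumes that every document begins with an "<?xml" declaration at the start of a line, which should be checked against the actual files. The function name and the sample element names are hypothetical and are not taken from the DTDs.

```python
def split_concatenated_xml(lines):
    """Yield each XML document from an iterable of text lines.

    Assumption (not confirmed by the product description): every
    document starts with an '<?xml' declaration at the start of a line.
    """
    current = []
    for line in lines:
        # A new declaration closes the document collected so far.
        if line.startswith("<?xml") and current:
            yield "".join(current)
            current = []
        current.append(line)
    if current:
        yield "".join(current)


# Invented two-document sample; element names are placeholders only.
sample = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<doc-a>\n<item>1</item>\n</doc-a>\n"
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<doc-b>\n</doc-b>\n"
)
docs = list(split_concatenated_xml(sample.splitlines(keepends=True)))
print(len(docs))
```

For a real weekly file, the same function can be fed an open text file object, so that only one document is held in memory at a time.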

Patent Grant Red Book Files (XML v2.5) 2002 to April 12, 2005

January 1, 2002, to April 12, 2005 - Patent Grant Red Book XML is available and compliant with the V2.5 Grant Red Book DTD. See the Red Book Information page for documentation, DTDs, entity files, sample data, and a description of the changes between SGML V2.4 and XML V2.5.

To assist customers in migrating to Grant Red Book XML, two conversion utilities are provided.

The utility V25xml2V24sgml.pl (updated on February 2, 2002) converts Grant Red Book V2.5 (XML) back to Grant Red Book V2.4 (SGML). It is distributed "as-is" and is available as a free download from the Grant Red Book Conversion Tools and Sample Data page.

The utility RBxml2GB.pl converts Grant Red Book V2.5 (XML) to Green Book. It is distributed "as-is" and is available as a free download from the 2002 FTP directory. The conversion utility is a Perl script which can be modified (with some effort) to convert Grant Red Book to formats other than Green Book.

Patent Application Red Book Files (XML v1.6) March 15, 2001 to April 14, 2005

March 15, 2001, to April 14, 2005 - Application Red Book v1.6 (XML) bibliographic data is available on the FTP server. The data is located in the "pgpub" sub-directory of the FTP server folder for each respective year (2001 - 2005).

Patent Grant Red Book Files (SGML) 2001

Patent Grant Red Book files are SGML files based on the Grant Red Book DTD v2.4 which appears on the Red Book Information page. Grant Red Book, although it is SGML, avoids all constructs forbidden in XML. The files available for download from the FTP site consist of a single zip file for each week's issue. The file contains the concatenated *.SGM files for each patent in the issue, except that the following elements have been removed: BRFSUM, SDOCL (except for Design patents), DETDESC, RELAPP, DRWDESC, SDODR, SDOCR. The resulting file contains the so-called "front page" information only. Although there are references to external entities in the DOCTYPE declaration at the start of each document in the file, none of those entities are available via FTP. To obtain the complete Grant Red Book file, including all external entities, you must subscribe to Patent Data/SGML. Standard character entities referenced in the files are available from public sites on the Internet, from ISO, and, in the future, at the Red Book Information page.

To assist customers in migrating to Grant Red Book, a Red Book to Green Book conversion utility, RB2GB, is distributed "as-is", with no support, and is available as a free download from the 2001 FTP directory. The conversion utility is a Perl script which can be modified (with some effort) to convert Grant Red Book to formats other than Green Book. Written documentation is included.

Patent Grant Green Book Files 1996 to 2000

The data content of Green Book is identical to the patent bibliographic magnetic tapes sold by USPTO, in a format known as the "Patent Full-Text/APS File" format, or "USPTO Green Book." The data is available as one zipped file for each weekly issue, beginning with week 36 of 1996 and ending with the last week of 2000. Within each zip file, the data appears in USPTO Green Book format as either fixed-length (blank-padded) records or variable-length ASCII records terminated by a linefeed or a carriage return/linefeed. Each file is approximately 2 to 4 MB zipped, and unzips to a single 20 to 50 MB ASCII file.

For Your Information

The data is provided "as is." Neither the United States Government, nor any agency thereof, nor any of their contractors, subcontractors or employees makes any warranty, expressed or implied, of this data. The USPTO is only the data provider and will provide limited technical assistance concerning data content of file(s). The USPTO does not "debug" processing software developed by the user of these files.


Patent Data FTP Directory:

This directory contains raw patent data for each weekly issue in the current calendar year.

The data types are as follows ["nn" is a two-digit, fixed-length number (i.e., with leading zero), which represents the sequentially-numbered week of issue]:

  • 99weeknn.rpt -- ASCII text file listing unused sequential patent numbers and summarizing weekly contents by patent type.


  • 99weeknn.txt -- ASCII text file containing a list of all patent numbers in the issue, one per line. (A UNIX "wc" of this file should yield a line count equal to the total number of patents reported in the .rpt file.)


  • 99weeknn.zip -- ASCII text file, zipped, USPTO Green Book tagged data format (1996--2000), containing variable-length, linefeed-terminated records. A UNIX grep for "^WKU" piped to "wc" (grep "^WKU"|wc) should yield a line count equal to the total number of patents reported in the .rpt file.


  • 01weeknn.zip -- ASCII text file, zipped, USPTO Grant Red Book SGML data format (2001-- ), containing a single *.SGML file. Within the file, each document consists of a DOCTYPE declaration followed by the start tag <PATDOC> followed by additional markup and content followed by the end tag </PATDOC> which terminates the document. The number of occurrences of PATDOC indicates the number of documents in the file.

  • The USPTO has published the following file formats in parallel since October 2004:

    • Patent Grant Bibliographic Data/XML Version 2.5 (WIPO ST.32) with a previous format file name yyweeknnrb.rpt .txt .zip
    • Patent Grant Bibliographic Data/XML International Common Element (ICE) Version 4.0 (WIPO ST.36) with a current format file name ipgbyymmdd.rpt .txt .zip
    • Patent Application Bibliographic Data/XML Version 1.6 (WIPO ST.32) with a previous format file name pabyymmdd.rpt .txt .zip
    • Patent Application Bibliographic Data/XML International Common Element (ICE) Version 4.0 (WIPO ST.36) with a current format file name ipabyymmdd.rpt .txt .zip

    As of April 15, 2005, the previous formats, XML Versions 1.6 and 2.5 (WIPO ST.32), were discontinued. ICE Version 4.0 (WIPO ST.36) is now the production format.
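The count checks described above for the .txt list, the Green Book .zip and the SGML .zip can also be sketched in Python for systems without grep and wc. This is an illustrative sketch only: the function names are hypothetical, the sample records are invented, and the PATDOC check counts start tags rather than every occurrence of the string, on the assumption that each document has exactly one <PATDOC> start tag.

```python
import re


def count_listed_patents(text):
    """Count non-empty lines, as 'wc' would for the .txt patent list."""
    return sum(1 for line in text.splitlines() if line.strip())


def count_green_book_patents(text):
    """Count lines beginning with 'WKU', like grep "^WKU" piped to wc."""
    return sum(1 for line in text.splitlines() if line.startswith("WKU"))


def count_patdoc_documents(text):
    """Count <PATDOC> start tags (assumed to be one per document)."""
    # The pattern skips </PATDOC> end tags and the DOCTYPE declaration.
    return len(re.findall(r"<PATDOC[\s>]", text))


# Invented sample data, for illustration only.
txt_list = "0000001\n0000002\n"
green_book = "WKU  0000001\nTTL  Example one\nWKU  0000002\nTTL  Example two\n"
sgml = (
    '<!DOCTYPE PATDOC SYSTEM "example.dtd">\n<PATDOC>\n</PATDOC>\n'
    '<!DOCTYPE PATDOC SYSTEM "example.dtd">\n<PATDOC>\n</PATDOC>\n'
)

print(count_listed_patents(txt_list))
print(count_green_book_patents(green_book))
print(count_patdoc_documents(sgml))
```

Each result would then be compared by hand with the total reported in the corresponding .rpt file, whose layout is not described here.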

 
