Datasets: 6076
Charts: 1
Joined on May 02, 2011. Last logged in March 11, 2013.
No description provided
My Site http://www.data.gov
Name | Popularity | Type | |
---|---|---|---|
1. |
Patent Grant Data/XML (2001 - Present)
Business Enterprise
Patent Grant Full Text SGML XML with Embedded TIFF...
Contains the full text, images/drawings, and complex work units (tables, mathematical expressions, genetic sequence data, and chemical structures) of each patent grant issued weekly (Tuesdays) from January 2, 2001 to the present. The file formats are Standard Generalized Markup Language (SGML) in accordance with the U.S. Patent Grant Version 2.4 Document Type Definition (DTD) and eXtensible Markup Language (XML) in accordance with the U.S. Patent Grant Version 2.5; 4.0 International Common Element (ICE); 4.1 ICE; and 4.2 ICE Document Type Definitions (DTDs). Tables and sequence data are included using CALS markup. Mathematical expressions are included using MathML markup and external Mathematica Notebook (NB) files. Chemical structures are represented by external CambridgeSoft Corp. ChemDraw (CDX) files and MDL Information Systems (MOL) files. Drawings, mathematical expressions, and chemical structures are also included as external Tagged Image File Format (TIFF) Revision 6.0 image files with CCITT Group 4 compression. Each weekly file contains approximately 4,000 patent grants. There can be an optional weekly supplemental ZIP file that contains lengthy genetic sequence listings (anything over 300 pages) or lengthy tables (anything over 200 pages). The data totals approximately 836 MB per week.
|
270 views | |
2. |
Patent Application Publication Data/XML (2001 - Present)
Business Enterprise
MDL Information Systems (MOL) files, tables, ...
Contains the full text, images/drawings, and complex work units (tables, mathematical expressions, genetic sequence data, and chemical structures) of each patent application publication (non-provisional utility and plant) published weekly (Thursdays) from March 15, 2001 to the present. The file formats are eXtensible Markup Language (XML) in accordance with the U.S. Patent Application Version 1.5; 1.6; 4.0 International Common Element (ICE); 4.1 ICE; and 4.2 ICE Document Type Definitions (DTDs). Tables and sequence data are included using CALS markup. Mathematical expressions are included using MathML markup and external Mathematica Notebook (NB) files. Chemical structures are represented by external CambridgeSoft Corp. ChemDraw (CDX) files and MDL Information Systems (MOL) files. Drawings, mathematical expressions, and chemical structures are also included as external Tagged Image File Format (TIFF) Revision 6.0 image files with CCITT Group 4 compression. Each weekly file contains approximately 5,000 published patent applications. There can be an optional weekly supplemental ZIP file that contains lengthy genetic sequence listings (anything over 300 pages) or lengthy tables (anything over 200 pages). The data totals approximately 1.5 GB per week.
|
232 views |
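Both datasets are described as weekly releases with an approximate size per week, so a rough download budget can be worked out from the start dates and sizes quoted in the descriptions. The following is a minimal sketch, not an official tool: the names `FEEDS`, `issue_dates`, and `estimated_gb` are hypothetical, 1.5 GB is treated as 1500 MB, and the code assumes that no week is skipped or shifted and that the quoted weekly size applies to every week in the range, which may not hold for older files.

```python
from datetime import date, timedelta

# Start dates and approximate weekly sizes quoted in the dataset descriptions.
# Applying one size to every week is an assumption made for illustration.
FEEDS = {
    "grants": {"first_issue": date(2001, 1, 2), "weekly_mb": 836},
    "applications": {"first_issue": date(2001, 3, 15), "weekly_mb": 1500},
}


def issue_dates(first_issue, through):
    """Yield weekly issue dates from first_issue up to and including through."""
    current = first_issue
    while current <= through:
        yield current
        current += timedelta(days=7)


def estimated_gb(feed, start, through):
    """Rough download size in GB (1 GB = 1000 MB) for issues in [start, through]."""
    info = FEEDS[feed]
    count = sum(
        1 for d in issue_dates(info["first_issue"], through) if d >= start
    )
    return count * info["weekly_mb"] / 1000


if __name__ == "__main__":
    start, end = date(2013, 1, 1), date(2013, 1, 31)
    for name in FEEDS:
        print(name, estimated_gb(name, start, end), "GB (estimate)")
```

Stepping by seven days from the first issue date keeps grants on Tuesdays and application publications on Thursdays, matching the schedule given in the descriptions. The optional supplemental ZIP files are not counted.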