Announcements to NLM Data Licensees: Year 2009
(02/02/09) Catfile, CatfilePlus, and Serfile Licensees
(12/19/08) DOCTYPE Line in 2009 MEDLINE/PubMed Baseline Files
(12/15/08) 2009 MEDLINE/PubMed Baseline Data
2008 Announcements
Catfile, CatfilePlus, and Serfile Licensees
February 2, 2009
This notice is being sent separately to both primary and secondary (if available) license representatives.
Please see http://www.nlm.nih.gov/bsd/licensee/access/ for links to
access instructions and information.
FOR MARC SUBSCRIBERS
The updated MARC base files for Catfile, CatfilePlus, and Serfile are now available on the NLM ftp server.
These base files are each complete in a single file. Loading the base files on an annual basis is optional
for MARC subscribers. If you have loaded each of the monthly updates, there is no need to reload the base
files.
The MARC file containing all the bibliographic records deleted by NLM between January 1, 2008 and December 31, 2008,
is also available. There are 5,665 records in this file. Licensees who are new recipients of NLM's MARC
bibliographic records in 2009, as well as ongoing licensees who are discarding their pre-2009 records and reloading
with the 2009 base files, do NOT need the delete file. The records in this file were removed from NLM's database prior
to the pull of the 2009 base files.
If loading new baseline files, you should then load the 2009 update files dated after the date of the base files.
FOR XML SUBSCRIBERS
The updated XML basefiles for CatfilePlus and Serfile have been available on the NLM ftp server since mid-December.
The baseline files should be used to completely replace all records previously distributed to continuing licensees.
After the new baseline files are loaded, you should then load the 2009 update files. See the XML Update
Charts at
http://www.nlm.nih.gov/bsd/licensee/catrecordxml_stats_2009.html.
DOCTYPE Line in 2009 MEDLINE/PubMed Baseline Files
December 19, 2008
This message is intended for licensees who downloaded the 2009 MEDLINE/PubMed baseline files before 7:37pm ET on December 17, 2008.
The 2009 MEDLINE/PubMed baseline files became available to licensees on Tuesday December 16, 2008
(see http://www.nlm.nih.gov/bsd/licensee/announce/2009.html#d12_15).
It was subsequently discovered that the DOCTYPE line in the baseline data files was incorrect. For those who validate
the XML data with the DTD from the DOCTYPE line, the correct DOCTYPE is:
<!DOCTYPE MedlineCitationSet PUBLIC "-//NLM//DTD Medline Citation, 1st January, 2009//EN"
"http://www.nlm.nih.gov/databases/dtd/nlmmedline_090101.dtd">
The original baseline files were replaced with new baseline files containing the corrected DOCTYPE line at 7:37pm ET
Wednesday December 17. Each new baseline file is slightly smaller because of the corrected DOCTYPE; the total
size of the files is now 68,645,424,728 bytes. Accordingly, the chart at
http://www.nlm.nih.gov/bsd/licensee/2009_stats/baseline_med_filecount.html
will be edited within several days.
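Licensees who are unsure which version of a baseline file they hold may find a quick check useful. The following is a minimal sketch, not an NLM tool: it compares the DOCTYPE declaration found at the start of a file with the corrected DOCTYPE given above. The function names are hypothetical, and the sketch assumes the caller has already read the first few kilobytes of the (decompressed) file into a string.

```python
import re
from typing import Optional

# Corrected DOCTYPE as given in this notice, written on one line.
EXPECTED_DOCTYPE = (
    '<!DOCTYPE MedlineCitationSet PUBLIC '
    '"-//NLM//DTD Medline Citation, 1st January, 2009//EN" '
    '"http://www.nlm.nih.gov/databases/dtd/nlmmedline_090101.dtd">'
)


def extract_doctype(head: str) -> Optional[str]:
    """Return the first DOCTYPE declaration in `head`, whitespace collapsed.

    Assumes the declaration has no internal subset (no '>' before its end),
    which holds for the DOCTYPE shown in this notice.
    """
    match = re.search(r"<!DOCTYPE[^>]*>", head)
    if match is None:
        return None
    return " ".join(match.group(0).split())


def has_corrected_doctype(head: str) -> bool:
    """True if the file head carries the corrected 2009 DOCTYPE line."""
    return extract_doctype(head) == EXPECTED_DOCTYPE
```

A file for which `has_corrected_doctype` returns False was probably downloaded before the replacement and could be downloaded again.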
2009 MEDLINE/PubMed Baseline Data
December 15, 2008
- AVAILABILITY OF 2009 MEDLINE/PUBMED BASELINE DATA
I am pleased to inform you that the 2009 MEDLINE/PubMed baseline files, which replace all previously distributed
MEDLINE/PubMed data, are now available via FTP. Licensees have been e-mailed the location of the FTP access instructions and additional information.
- 2009 UPDATE FILES
The first group of 2009 update files and the special PMID list text file (see ADDITIONAL PMID LIST FILE below) are also available.
Please be sure to read the _notes.txt file that is on the server accompanying the first update file medline09n0594.
Update files should be processed after the baseline files in ascending numeric order of file name (see ADDITIONAL PMID LIST FILE below
for an exception) to ensure that all new records are added and the most current and accurate version of each record is
retained. FTP access instructions with additional information are available at
http://www.nlm.nih.gov/bsd/licensee/access/medline_pubmed.html.
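The ordering requirement above can be sketched as follows. This is an illustration only: the file name pattern is inferred from names that appear in this notice (for example medline09n0594), the `.xml` extension in the usage example is an assumption, and the function names are hypothetical.

```python
import re

# Assumed pattern: "medline", two-digit year, "n", four-digit sequence number.
# Anything after the sequence number (an extension, for example) is ignored.
UPDATE_NAME = re.compile(r"^medline(\d{2})n(\d{4})")


def sequence_number(file_name: str) -> int:
    """Return the numeric sequence embedded in an update file name."""
    match = UPDATE_NAME.match(file_name)
    if match is None:
        raise ValueError(f"not a MEDLINE/PubMed file name: {file_name!r}")
    return int(match.group(2))


def in_processing_order(file_names):
    """Return the file names sorted in ascending numeric sequence."""
    return sorted(file_names, key=sequence_number)
```

For example, `in_processing_order(["medline09n0596.xml", "medline09n0594.xml", "medline09n0595.xml"])` places medline09n0594.xml first, so later versions of a record overwrite earlier ones as the files are loaded.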
- ADDITIONAL PMID LIST FILE
**NOTE: This file may not be available until Wednesday Dec. 17, 2008**
A text file is available containing the PMIDs of records in MedlineCitation Status = In-Process or MedlineCitation Status =
In-Data-Review that were retained in the 2009 version of PubMed when the 2009 baseline files were
loaded and that are not exported to licensees in the first batch of update files. These records will
eventually be exported in update files as completed records in MedlineCitation Status = MEDLINE or MedlineCitation
Status = PubMed-not-MEDLINE or as deleted PMIDs in DeleteCitationSet. Licensees who wish to create a database as
close as possible to the current record content in PubMed will want to include these records now.
The file, named SpecialPubMedPMIDList_2009.txt, resides in the update file directory. Licensees may use the
Entrez Utilities to download the records using the list of PMIDs.
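One way to prepare such a download is sketched below. It only builds EFetch request URLs and does not send them. Several points are assumptions and do not come from this notice: that the list file holds one PMID per line, the endpoint URL shown (the address at which the Entrez Utilities are commonly documented), the batch size of 200, and the function names.

```python
from urllib.parse import urlencode

# Assumed EFetch endpoint; check the current Entrez Utilities documentation.
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def read_pmids(text: str):
    """Parse PMIDs from the list file text, assuming one PMID per line."""
    return [line.strip() for line in text.splitlines() if line.strip().isdigit()]


def efetch_urls(pmids, batch_size: int = 200):
    """Build one EFetch URL per batch of PMIDs, requesting PubMed XML."""
    urls = []
    for start in range(0, len(pmids), batch_size):
        batch = pmids[start:start + batch_size]
        query = urlencode({"db": "pubmed", "id": ",".join(batch), "retmode": "xml"})
        urls.append(f"{EFETCH_URL}?{query}")
    return urls
```

Batching keeps each request to a modest size; the Entrez Utilities usage guidelines should be consulted for request rate limits before any URLs are fetched.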
*IMPORTANT*: If you elect to add these records to your version of MEDLINE/PubMed, they must be added to your 2009
MEDLINE/PubMed database either 1) immediately after the baseline files and before any update files, or 2) after update
files medline09n0594 through medline09n0626 to ensure retaining the most current version of those records as subsequent
update files are loaded. Do not add the records identified in SpecialPubMedPMIDList_2009.txt after you have processed
medline09n0627 as this may result in retention of an earlier and inaccurate version of the records.
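The timing rule above can be expressed as a small guard in a loading script. This is a sketch under a conservative reading of the rule: the special records are allowed only when no update file has been loaded yet, or when the most recently loaded update file is medline09n0626. It assumes update files are always loaded in ascending sequence; the function name and constant are hypothetical.

```python
from typing import Optional

# Sequence number of the last update file after which the special records
# may still be added, per this notice (medline09n0626).
LAST_SAFE_UPDATE = 626


def may_add_special_records(last_processed: Optional[int]) -> bool:
    """Decide whether the special PMID records may be added now.

    `last_processed` is the sequence number of the most recently loaded
    update file, or None if only the baseline files have been loaded.
    """
    if last_processed is None:
        # Option 1: immediately after the baseline, before any update files.
        return True
    # Option 2: after medline09n0594 through medline09n0626, and not later.
    return last_processed == LAST_SAFE_UPDATE
```

Treating the intermediate files (medline09n0594 through medline09n0625) as not allowed is a cautious interpretation, since the notice names only the two points above.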
- 2008 MEDLINE/PUBMED FILES TO MOVE TO NEW DIRECTORY
The last 2008 update file, medline08n0876, was placed on the server for licensees December 11, 2008. The 2008 update
files have moved to ftp://ftp.nlm.nih.gov/nlmdata/.medlease2008
where they will remain for several weeks for licensees who need access to them while working with the 2009 baseline
files.
- DOCUMENTATION
Documentation for the MEDLINE/PubMed baseline database is available from links in the Data Availability and Maintenance
section of NLM’s information page for MEDLINE/PubMed licensees at
http://www.nlm.nih.gov/bsd/licensee/medpmmenu.html.
The direct URLs to those pages are
http://www.nlm.nih.gov/bsd/licensee/2009_stats/baseline_doc.html and
http://www.nlm.nih.gov/bsd/licensee/2009_stats/baseline_med_filecount.html.
Also see the MEDLINE/PubMed Maintenance Overview at
http://www.nlm.nih.gov/bsd/licensee/medline_maintenance.html for
information about and points to consider for processing update files.
- MEDLINE/PUBMED BASELINE REPOSITORY (MBR)
The 2009 baseline data will be included at a later date in the MEDLINE/PubMed Baseline Repository (MBR) resources at
http://mbr.nlm.nih.gov/. If you wish to search the baseline data via the MBR Query Tool,
be sure to use the same IP address registered with NLM for access to MEDLINE/PubMed from NLM’s FTP server.
Please do not hesitate to contact me with questions as they may arise. I look
forward to working with you during 2009 and send best wishes for peace and good health during the New Year.