INTERGOVERNMENTAL OCEANOGRAPHIC COMMISSION (of Unesco)


IOC-WMO STEERING GROUP ON THE IGOSS / IODE
GLOBAL TEMPERATURE-SALINITY PILOT PROJECT
First Session, ORSTOM, Brest (France)
17-19 September 1990

SUMMARY REPORT

TABLE OF CONTENTS

1. Opening Remarks

2. Presentations

3. WOCE/TOGA GTSPP Interfaces

4. GTSPP Role in the WOCE Upper Ocean Thermal Data Assembly Centre

5. Status of BUFR

6. Finalization of Documents

6.1. QC Manual
6.2. GTSPP Project Plan
6.3. GTSPP Brochure

7. Methodology for Management of Duplicate Data and the CMD

8. Review of Data Exchange Formats to be used Between Centres

9. GTSPP Monitoring of Real Time Data Flow

10. Historical Aspects of the GTSPP

11. GTSPP Products

12. Communications

13. Approval of GTSPP Implementation Plan

14. Other Business

15. Approval of Summary Report

16. Time and Place of the Next Meeting

17. Closure of the Meeting

ANNEXES

A Agenda

B List of Participants

C Presentations

C.1. Australia
C.2. Canada
C.3. France
C.4. USA
C.5. USSR

D GTSPP Implementation Plan

E List of Documents

F Monitoring of GTSPP Real Time Data Flow

G Methodology for Managing Duplicate Data and the CMD

G.1. Canada
G.2. France
G.3. USA
G.4. USSR

H List of Products

I Comments on Quality Control Manual

J List of Oceanic Stations for Creating a Time-Series Database

K MEDS Sequential and Indexed Formats for Indexed Data

GENERAL SUMMARY OF THE WORK OF THE MEETING

1. Opening Remarks

The Meeting was opened on Monday, September 17, 1990 by Dr. François Conand, Directeur du Centre ORSTOM, who welcomed the participants (Annex B) from the international data centres and the oceanographic community. The Meeting was then turned over to Dr. Ron Wilson, Director of the Marine Environmental Data Service, Canada. Mr. Chris Noe of the US National Ocean Service agreed to serve as rapporteur. Dr. Wilson presented a brief history of GTSPP. His remarks emphasized the value of GTSPP as a cooperative project and described the potential benefits arising from the project to all IOC and WMO member states. The invitation for other countries to participate actively was reiterated, as this is a central theme within GTSPP.

The Meeting also discussed the user community that would benefit from the project. The Project Plan contains a diagram (section 7.1 figure 3) which shows the clients for GTSPP data. These clients include operational users (forecasting, operations, fishing), science and engineering users (national and international), and, as a particular example, the WOCE UOT Programme. Clearly all these users will benefit from the more timely, more complete and higher quality temperature and salinity database of GTSPP. In particular, GTSPP has responsibilities in the data management and flow for the WOCE UOT Programme.

The Meeting noted a concern that several of the unique aspects of GTSPP are not sufficiently understood in the GTSPP community. Because many of the current activities are related to putting in place the mechanisms necessary to capture the real-time data and stop losses of some of these data, the impression has developed that the major concern of GTSPP is the real-time data. The Meeting therefore reiterated the unique aspects of GTSPP.

1. The project is assembling a complete global data set.

2. Additional levels of QC are being developed, including a review of the data by scientists knowledgeable of the ocean areas.

3. The historical data sets going back many decades are being quality controlled to the level of the most recent data.

4. Improved and more sophisticated data flow monitoring is being implemented to capture more data for the databases and prevent loss of data.

5. Projects are being developed to identify, digitize, QC, and insert into the databases data that are presently available only in manuscript form.

6. Long term time-series important to a multitude of research programmes are being identified and made available.

Dr. Wilson then outlined the main purposes of the present Meeting as follows:

* Review of the project plan and brochure for publication,
* Review of the format for the interchange of data between the various centres in GTSPP,
* Review of the GTSPP Implementation Plan,
* Review and finalization of the QC Manual,
* Review of the Continuously Managed Database and duplicate identification algorithms,
* Data Flow Monitoring,
* Product lists.

2. Presentations

Five informational presentations were made by Australia, Canada, France, USA and the USSR. These are included in this document as Annex C.

Under this agenda item the Meeting discussed the possibility of sharing software and algorithms between centres. It was felt that this was certainly possible, particularly for systems adhering to standards such as Fortran 77. It was therefore decided that an exchange of information on software systems that could be made available within GTSPP in particular and the IGOSS/IODE community in general should be carried out. Representatives of the IODE and IGOSS centres present at the Meeting were asked to submit their information to MEDS by 15 November 1990, allowing MEDS to circulate the package to IODE centres and GTSPP Scientific QC Centres by the end of November.

3. WOCE/TOGA GTSPP Interfaces

Though there has been general agreement that the GTSPP will be of considerable help to both the TOGA and WOCE Programmes, some concerns were raised and comments were made, including:

- that it is imperative that the data flow be clearly defined so that confusion does not arise about where the data are to be submitted.

- that the present TOGA data flow should be maintained and if possible improved within the Centres participating in the GTSPP.

- that QC planned within the GTSPP and within WOCE at the Regional Centres should be coordinated to minimize duplication and to maximize mutual benefits.

- the concept of a history table (or processing history) attached to each profile, as proposed by MEDS, has been adopted in the new data management scheme of the TOGA/WOCE Centre. As soon as the WOCE Centre is operational, it is proposed that an experimental transmission of such a flagged data set be performed. A preliminary test of readability of such a format should be undertaken before the end of the year by the NODC and the TOGA/WOCE Centre.

- difficulties with the time scale of real-time data collection, qualification and transmission to the TOGA Centre and the operational constraints on such a Centre.

- the TOGA Centre will continue to collect the real-time data set at the French IGOSS centre for operational purposes and merge it, in due time, with the MEDS qualified real-time data set for completeness of its collection.

4. GTSPP Role in the WOCE Upper Ocean Thermal Data Assembly Centre

The Upper Ocean Thermal DAC activities are planned to begin in January 1991. In the meantime, the Regional Centres are discussing the quality control procedures to be employed and data transfer amongst the Regional Centres, GTSPP and the Global Centre (Brest). The specific procedures to be used and the flow of data are discussed in the report of the coordination meeting held in Tallahassee in February 1990. Data from 1990 will be QC'd by the Regional Centres in delayed mode rather than in the near-real-time mode that is to become the norm starting in 1991. A test of some parts of the system will be conducted during December 1990 using data from November 1990. WOCE will arrange for the cooperation of the science centres and define with them the goals of the test. GTSPP will use the opportunity to carry out some format and data exchange tests to be arranged between centres. Even though some of the funding for the DAC is still under negotiation, the Centres have sufficient resources to begin their operations.

STATUS OF CENTRES

Global Centre, Brest - Approved

Regional Centre, Miami - Approved

Regional Centre, La Jolla - Agreed in Principle, operational, outyear funding pending

Regional Centre, Hobart - Agreed in Principle, outyear funding proposed, will begin operations on schedule

GTSPP Centres (Canada, USA, Australia, USSR) - Cooperating with WOCE, funded through non-WOCE sources

Besides the usual concern that funds may not materialize, some concern has been raised about possible duplication of the QC being planned within the GTSPP and within WOCE at the Regional Centres. The GTSPP has immediate users who must be assured that the data they receive have been checked, but the GTSPP must also provide the WOCE Centres with flagged data that have not been filtered. In view of this, GTSPP may take some steps that WOCE regards as duplicative or unnecessary. The GTSPP QC Manual has been seen by the Regional Centre Directors. There are no immediate steps to be taken regarding this issue, but WOCE may suggest some system changes once operations are tested.

Some national practices may vary from the suggested data flow and QC. The only known documented variance is included in the USA WOCE plan issued in July 1990. Some data from the high density lines in the Pacific may be routed to a national Quality Review Panel prior to submission to the International Centres. Contrary to what the USA plan indicates, these data would not go onto the GTS, because they would already be two months old by the time they reached the international system. They would either be handled by the Regional Centres in a special run or be relegated to the delayed mode stream, which will be processed only on an annual basis. All variances must be addressed by both the WOCE VOS and DMC on a case-by-case basis to ensure the most efficient operations. Variances will have to be tolerated, but they should be identified and coordinated with the international system as early as possible.

The Meeting noted the concern about possible duplication of QC between GTSPP and the WOCE Regional Scientific QC Centres. The Meeting was of the opinion, however, that the scientific QC being carried out by the WOCE Regional Centres was designed to use the knowledge of these Centres to detect the more subtle errors which a data centre would not be qualified to judge. This additional level of QC is considered to be an important element of GTSPP and the Regional Centres should be assured that this is an enhancement to the databases and not a duplication.

5. Status of BUFR

BUFR is presently in use for collecting data from FNOC and NMC into the US RNODC for IGOSS, to be forwarded to MEDS. MEDS is copying these files on a daily basis and has implemented software to read BUFR and translate it to a MEDS internal format. There are some problems with the FNOC data because TESAC data are broken into separate BUFR messages for temperature and salinity. For purposes of data flow monitoring, MEDS is developing software to put these messages back together. This aspect of the processing will be operational shortly.
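The re-pairing logic amounts to keying each decoded message on its station identity and space-time coordinates and merging the temperature-only and salinity-only halves. The following minimal sketch assumes hypothetical field names for the decoded messages; the actual MEDS software and BUFR descriptors are not described in this report.

    # Sketch: re-pair split TESAC messages on (callsign, lat, lon, time).
    # Field names are illustrative; real decoded BUFR records will differ.
    def merge_split_tesac(messages):
        merged = {}
        for msg in messages:
            key = (msg["callsign"], msg["lat"], msg["lon"], msg["time"])
            slot = merged.setdefault(key, {"temperature": None, "salinity": None})
            for variable in ("temperature", "salinity"):
                if variable in msg["profiles"]:
                    slot[variable] = msg["profiles"][variable]
        return merged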

6. Finalization of Documents

6.1. QC Manual

The IOC Senior Assistant Secretary informed the Meeting of plans by the IOC to organize an editorial group meeting to discuss the existing draft of the IODE QC Manual prepared by the IODE Task Team on Quality Control, supplemented by other published documents on quality control of oceanographic data for particular marine scientific projects. The Meeting agreed on the need for such a meeting for the purpose of making existing quality control procedures consistent and avoiding controversy. It was suggested that this meeting be held no earlier than the spring of 1991, to allow the implementation of the GTSPP QC procedures that have been developed. Experience gained in the months after 1 January 1991 would allow GTSPP to make a greater contribution to such a meeting.

The Meeting recognized the valuable work undertaken by a number of groups, particularly MEDS (Bob Keeley) in developing the Quality Control Manual. A number of points arose from the discussions which address the draft manual and provide recommendations for future versions of the quality control system. In particular the Meeting agreed that the procedures described in the manual for inferring values for position, time and parameter data were more appropriate for use by the user during the analysis phase and that erroneous data should only be flagged with no attempt to infer a value.

Quality control has always been identified as a crucial element of GTSPP and a continuing dialogue between data managers and scientists (originators) will ensure the success of this phase of the project.

The Meeting recognized the urgency of ratifying the Quality Control Manual to prepare for the operational phase of GTSPP commencing in January 1991. It was also recognized that the Manual would undergo continuing review with new versions produced as required. The first version of the manual describes the mechanisms that will be used to implement changes. Version numbers will also be carried with the data.

The Meeting felt that the draft manual is directed more towards real-time data, which correctly reflects the urgency of establishing the real-time system by January, 1991. It was suggested that a manual oriented towards delayed mode data quality control also be developed. This manual would maintain the structure of the real-time manual and would reflect the greater scope available for error analysis and recovery with higher resolution delayed mode data. It was noted that much of the real-time data would eventually be replaced by higher quality delayed mode data and therefore extensive efforts in the quality control of real-time data may not be justified.

As a result of these discussions, the Working Group suggested that the existing draft manual should be identified as a real-time data quality control manual. The need to produce a delayed mode data quality control manual should be made widely known.

This would confirm GTSPP's commitment to both real-time and delayed mode data streams.

MEDS indicated that another centre should consider the development of the delayed mode manual.

In addition to the above discussions, a number of other recommendations were made. These addressed four main areas.

1. Specific modifications to the draft quality control manual. These are given in Annex I,

2. Re-orienting the existing draft manual to cover real-time data only,

3. General guidelines for development of a delayed mode quality control manual are given in Annex I,

4. Suggestions for the next version of the real-time quality control manual are given in Annex I.

6.2. GTSPP Project Plan

The Meeting reviewed the Project Plan. It was noted that the RNODCs for IGOSS were identified in the delayed mode flow. In fact, the RNODCs for IGOSS are part of the real-time data flow, and the diagrams in the plan should be modified to show this. The plan was approved for submission to the IOC for publication after these corrections and a final proof-reading by MEDS.

6.3. GTSPP Brochure

The Meeting reviewed a draft brochure produced by the US NODC. IODE-XIII Recommendation 4 was referenced as the source of the GTSPP objectives to be included in the brochure. The Meeting recommended that the IOC use the GTSPP Project Plan and brochure, along with a suitable letter, to urge member states to become active participants in the GTSPP and to support its activities with data and suitable analyses.

Comments and suggestions made by the group will be used by the US NODC to produce a revised version of the brochure before the end of September. The Meeting requested NODC to adjust the Brochure as necessary to include the unique aspects of GTSPP as re-iterated in section 1 of this report. This revised version will be distributed for review by the Chairmen of the IOC/WMO Committee on IGOSS, the IOC/IODE Committee, and the IOC Secretariat. The Meeting also requested that the US NODC consider publishing the brochure in a multi-color format on high quality paper by 31 January 1991. If this is not possible, the Meeting requested the IOC to consider the possibility of publishing the brochure with equal quality.

7. Methodology for Management of Duplicate Data and the CMD

The Meeting reviewed the various methods in use in the participating data centres for the identification of duplicates and operation of the continuously managed database. For the most part, the methods for identification of duplicates and operation of the CMD were quite similar for the phase of processing data into the databases. Descriptions of the algorithms used at VNIIGMI-WDC, TSDC, US NODC and MEDS are included in Annex G.

The TSDC and the US NODC, however, have incorporated additional checks on header information that are carried out on an annual or semi-annual basis. These checks are particularly effective in detecting problems with mis-identification of ships and major data errors.

The Meeting agreed that all necessary elements for duplicates management are being addressed by one or more of the centres. Specifically, duplicate checking for GTSPP must include:

* comparison of identifiers and space-time coordinates
* comparison of space-time coordinates
* comparison of sub-surface profiles

These tests for duplications are to be coupled in the most appropriate manner with the QC necessary to make them effective.
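As a rough illustration of how these three tests can combine (a sketch only; the tolerances and data structure are assumptions for the example, not values taken from the GTSPP manuals or Annex G), an identifier match with coincident coordinates is treated as a strong duplicate, while coincident coordinates alone are confirmed by comparison of the sub-surface profiles:

    import math
    from dataclasses import dataclass
    from datetime import datetime

    @dataclass
    class Station:
        ship_id: str
        lat: float
        lon: float
        time: datetime
        depths: list    # metres
        temps: list     # deg C, aligned with depths

    def _km(lat1, lon1, lat2, lon2):
        # Great-circle (haversine) distance, Earth radius 6371 km.
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dp, dl = p2 - p1, math.radians(lon2 - lon1)
        a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
        return 2 * 6371.0 * math.asin(math.sqrt(a))

    def possible_duplicate(a, b, max_km=5.0, max_h=1.0, max_rms=0.05):
        near = (_km(a.lat, a.lon, b.lat, b.lon) <= max_km
                and abs((a.time - b.time).total_seconds()) <= max_h * 3600)
        if not near:
            return False
        if a.ship_id == b.ship_id:
            return True    # identifiers plus space-time coordinates match
        # Space-time match only: confirm with the sub-surface profiles.
        common = [(ta, tb)
                  for da, ta in zip(a.depths, a.temps)
                  for db, tb in zip(b.depths, b.temps) if da == db]
        if not common:
            return False
        rms = math.sqrt(sum((ta - tb) ** 2 for ta, tb in common) / len(common))
        return rms <= max_rms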

During the implementation of GTSPP, the real-time data centre (MEDS), the CMD Centre (US NODC), and the TOGA/WOCE Centre (Brest) are to exchange information with all participants on progress with the implementation and performance of the duplicate management and CMD systems.

Since the CMD and TOGA/WOCE centres receive data produced by the previous centre in the data flow, each can evaluate the success of the processing in the previous centre and report back problems. In this manner the algorithms can be improved and standardized to the extent possible. Although this may seem to be an unsatisfactory solution in that immediate standardization is not achieved, the Meeting felt that this goal could only be achieved through operational experience.

As a final point, the TOGA/WOCE Centre and the CMD Centre (NODC) were encouraged to collaborate on operating the best possible annual or semi-annual review of header data, in order to eliminate duplicates and to resolve the replacement of real-time data as the corresponding delayed mode data arrive.

The performance of the duplicate detection/CMD operating systems is to be reviewed at the next GTSPP Meeting.

8. Review of Data Exchange Formats to be used Between Centres

The Meeting discussed the matter of formats for the exchange of data between the various data and scientific QC centres of GTSPP. It was recalled that the GTSPP had considered the use of the WMO BUFR format for the exchange of data within GTSPP and GF-3 for the exchange with others.

It was noted that BUFR is still evolving and that the operational nature and time scale of some aspects of GTSPP do not permit waiting for the finalization of BUFR. The present use of BUFR for input of data to GTSPP will be reviewed on a regular basis, and GTSPP will consider further implementation of this format for data exchange between centres and for data archival.

The Meeting also felt it appropriate that GTSPP inform the WMO of the features needed in an ocean format to be implemented in GTSPP. WMO can then decide whether to include such features as BUFR is developed.

* GTSPP needs a stable format that will not change in a fundamental manner for a number of years.

* The format must provide for the formatting and compression of oceanographic profile data, which are hierarchical logical structures comprising header parameters (space-time characteristics and general information) and a variable portion (aggregates of observations at depth levels).

* Only necessary sections of the format need be included. (For example, using only section 4 of BUFR for formatting of historical T/S data sets; with a fixed list of parameters, as a subset of BUFR.)

* Inclusion of code tables for oceanographic data.

In regard to the formats to be used between data centres and scientific centres in GTSPP, it was decided that there was not sufficient time to prepare a single agreed-upon format before operations begin in January 1991. Centres will therefore exchange data in a mutually convenient format which will carry the data and metadata provided for in the MEDS format. These mutually convenient formats are to be negotiated and finalized between the centres involved in the real-time flow before the end of October 1990.

Simultaneously, the development of a GF-3 subset for GTSPP data will be carried out. The USSR will develop, for consideration by GE-TADE at its meeting in November 1990, a draft subset based on the present subset for BATHY/TESAC data and the content of the MEDS format. This subset can then be considered for adoption by GTSPP as the exchange format between centres at the next GTSPP Meeting. Additionally, the subset can serve as the format for data delivered to users from various IOC and WDC centres. If the subset does not prove suitable for use in the operational flow, and if BUFR does not progress to the point where it can be implemented for this purpose, then GTSPP will have to pursue development of its own agreed format and supporting software package. The most likely reason the GF-3 subset would prove unsuitable for such exchange is its volume, since the data are intended to be telecommunicated.

9. GTSPP Monitoring of Real-Time Data Flow

MEDS presented a progress report and plans for comparing IGOSS data sets from several centres for the period of September, October and November 1989. Data have been received from the USSR, France, the USA and Canada. The RNODC for IGOSS in Japan and the AODC will also be asked to submit data for this time period. Because little is known about the availability of data in the Southern Hemisphere, MEDS will approach the RNODC-SOC Argentina to obtain information on data submissions to IGOSS. GTSPP data monitoring information should be passed to interested South American centres, as there is apparently a serious lack of knowledge of what BATHY/TESAC data are available.

After the completion of conversion software, the three reports outlined in Annex F will be prepared by MEDS before December 31, 1990 for review by member states. These reports will allow comparisons of BATHY and TESAC bulletins, provide counts of casts, and demonstrate success of GTS distribution of real time data.

The objectives of the monitoring programme cover three aspects:

1. To assist with the development of duplicate algorithms;

2. To establish areas within the communications system that need improvement;

3. To acquire the most complete data set.

The Meeting suggested that, if the initial monitoring programme proved successful, it should be continued on a regular basis as a valuable contribution to the project. MEDS has agreed to continue the monitoring for as long as needed.

10. Historical Aspects of the GTSPP

The US, USSR, and ICES will continue to develop the historical aspects of the GTSPP. They will prepare a strategy and plan for the location, acquisition, QC, and loading of historical temperature and salinity data, including suggested centres to undertake the various tasks, with the acquisition of long-term time-series data as a priority. The Meeting noted the value of this work and recommended that other countries become involved.

The USSR presented a proposal for joint USSR and USA work to build a file of temperature and salinity data collected at long-term time-series stations. The first stage is to compile lists of data at the catalog level to identify those data which can be included in the time-series. The Meeting recommended that GTSPP participants study the list of oceanic time-series stations submitted to the Meeting by the USSR and reproduced in Annex J. Comments are to be provided to the USSR by the end of December 1990. A meeting of representatives of the USSR and USA is planned in the USSR for June 1991 to define the final list of approximately 100 time-series stations and sections.

Criteria for including stations in the data set are that they were occupied for at least 10 to 15 years, that the frequency of observations is sufficient, and that continued data collection is expected.

Several GTSPP Meeting participants had attended the US Workshop on Ocean Data Archaeology, September 13-14, 1990, and reported informally on its results. Final recommendations of that workshop are not yet available, but the group welcomed the work being done to develop historical data sets and looks forward to the incorporation of the T/S data in the GTSPP databases.

The Meeting noted that many of the stations to be contributed to the GTSPP database (in delayed mode) would contain additional parameters, in particular oxygen and nutrients, but no action has been taken to preserve these parameters in the database. Furthermore, it was noted that the Archaeology workshop had agreed that, for a number of reasons, these additional parameters when available should be included in the historical data set. Since some sources of data important to the success of GTSPP (e.g., ICES and the USSR Sections Programme) would contain significant amounts of these additional parameters, it was agreed that methods of dealing with them should be discussed at the next GTSPP Meeting.

11. GTSPP Products

This item was introduced by the USA. The importance of GTSPP products was outlined and their value in an operational and scientific context was described.

The participants discussed the need to identify specific products that were unique to GTSPP. It was felt that there could be some confusion as to the identity of some products that were developed under the auspices of other projects such as WOCE. The USA was requested to develop a discussion document to address these problems. This document would describe:

1. Which immediately available products would be considered,

2. When these products would be produced,

3. How frequently these products would be distributed,

4. Which products are intended for the future.

Due to the limited time available before GTSPP commences in January, 1991, a list of products that would be available immediately was obtained from each representative. This list is included as Annex H.

The Meeting agreed that MEDS would provide a paper on GTSPP products for presentation at the IGOSS Products Workshop (15-19 April, 1991, Tokyo).

12. Communications

Under this agenda item the participants reported on progress and plans for implementing enhanced electronic mail and network connections for GTSPP.

AODC is currently on Telemail (Omnet) for electronic mail and is investigating a connection to the Australian Academic Research Network which is expected to have a node on SPAN. However, data exchanges with other GTSPP participants will proceed using magnetic tapes or floppy disks for at least another year.

Canada is currently on Telemail (Omnet) for electronic mail and has had success in exchanging telexes with the USSR in electronic form. MEDS has acquired an additional VAX 750 computer which will be used as a SPAN node. This facility should be operational by the end of October, 1990.

The TOGA Sub-surface Data Centre in France is currently on Telemail and BITNET. TSDC has a low speed connection to SPAN and is investigating a higher speed connection to SPAN.

The USA is operational on both Telemail (Omnet) and SPAN. The USA also can exchange telexes in electronic form with the USSR.

The USSR is operational on the national network IASNET. It has also successfully connected to the Oceanic System at the University of Delaware. Connections to the US and Canada for e-mail and data transfer are being explored through the San Francisco - Moscow Teleport (SFMT).

The USSR suggested that a single mailbox for all GTSPP participants would be appropriate and should be developed.

13. Approval of GTSPP Implementation Plan

The GTSPP Implementation Plan was reviewed and found to be generally acceptable in form and content. The chairman was asked to update the plan to include the additional action items decided on at this Meeting and to include new target dates, also as decided at the Meeting. The updated version of the Implementation Plan is included as Annex D.

14. Other Business

The representative of WOCE, Mr. Jim Crease, reported on two sources of error that are occurring with XBT probes. The first problem is that the fall rate for certain probes has been discovered to be in error, leading to significant discrepancies between actual and reported depths. The second is the so-called "bowing" of the XBT trace in the surface layer due to a design problem in the system. Discussions concluded that the GTSPP should include in its format and databases space for identifying the type of probe and type of shipboard receiver used, and the version of the model of each. This additional information will not be available in the real-time data streams, as the BATHY and TESAC code forms do not provide for it. If TOGA/WOCE or other programmes decide to gather probe details, GTSPP should be prepared to receive them and add them to the database.

As a final item of business, GTSPP will investigate the possibility of having the members bring copies of their documents to the next Meeting on diskettes. If feasible, this approach will greatly reduce the work involved in preparing the summary report and annexes.

15. Approval of the Summary Report

The report was reviewed page by page. The Meeting agreed on the contents generally, but instructed the chairman, rapporteur and IOC Secretariat to restructure certain sections to better present the results of the Meeting. It was agreed that this would be done by the end of September, 1990, and the report would be circulated to the participants for final comments. Final comments are to be provided to MEDS by the 15th of October, 1990, and the report will be passed to the IOC Secretariat by the end of October.

It was noted by the participants that the agenda for the Meeting was too large for a three day meeting and that the next meeting should be planned as a four day meeting.

16. Time and Place of the Next Meeting

The USSR representative offered to investigate the possibility of holding the next GTSPP Steering Committee Meeting in Obninsk in the USSR in July, 1991. The Meeting expressed its appreciation to the USSR for their offer.

17. Closure of the Meeting

The Meeting was closed at 1645 on September 19, 1990.

ANNEX A
AGENDA

1. Approval of Agenda and Workplan

2. Presentations by GTSPP Participants

3. WOCE/TOGA GTSPP Interface

4. GTSPP Role in the WOCE Upper Ocean Thermal Data Assembly Centre

5. Status of BUFR

6. Finalization of Documents

7. Methodology for Managing Duplicate Data and the CMD

8. Review of Data Exchange Formats to be Used Between Centres/Scientific Centres

9. GTSPP Monitoring of Real Time Data Flow

10. Historical Aspects of the GTSPP

11. GTSPP Products

12. Communications

13. Approval of GTSPP Implementation Plan

14. Other Business

15. Approval of Summary Report

16. Time and Place of Next Meeting

17. Closure of Meeting

ANNEX B
LIST OF PARTICIPANTS

Mr. Jim CREASE
WOCE International Project Office
Institute of Oceanographic Sciences
Deacon Laboratory
Wormley, Godalming
SURREY GU8 5UB
UNITED KINGDOM

Mrs. Valery DETEMMERMAN
ITPO, WMO
CP 2300
1211 Geneva 2
SWITZERLAND
Phone : 41-22-730-8242 or 8234
Fax : 41-22-734-2624 (2)
Telemail : V.LEE

Mr. Harry DOOLEY
ICES Secretariat
Palaegade 2-4
1261 Copenhagen
K. DENMARK
Phone : (45) 3154225
Telex : 22498
Fax : (45) 33934215
Telemail : ICES.DENMARK

Mr. Doug HAMILTON
DOC/NOAA/NESDIS
National Oceanographic Data Centre
1825 Connecticut Avenue N.W., Room 422
Washington, D.C.
20235 USA
Phone : 202-673-5636
Fax : 202-673-5586
Telemail : NODC.WDCA

Mrs. Melanie HAMILTON
DOC/NOAA/NESDIS
National Oceanographic Data Centre
1825 Connecticut Avenue N.W., Room 422
Washington, D.C.
20235 USA
Phone : 202-673-5636
Fax : 202-673-5586
Telemail : NODC.WDCA

Mr. Gary HOPWOOD
Australian Oceanographic Data Centre
P.O. Box 1332
North Sydney
NSW 2059
AUSTRALIA
Phone : (2) 925-4220
Fax : (2) 925-4835
Telemail : B.SEARLE

Dr. Catherine MAILLARD
IFREMER/SISMER
BP 70
29280 PLOUZANE
FRANCE
Phone : (33) 98-22-42-79
Fax : (33) 98 22 45 46
Telemail : IFREMER.BNDO

Dr. Vyacheslav I. MELNIKOV
Chief of Laboratory of Specialized
Project Programmes Technology
All-Union Research Institute of
Hydrometeorological Information
- World Data Centre B
USSR State Committee for Hydrometeorology
6, Korolev Street
Obninsk, Kaluga Region
249020 USSR
Phone : (08439) 259-09
(095) 546-39-50
Fax : 255-22-25
Telex : 412633 INFOR
E.mail IASNET : WDCBI

Mrs. Martine MICHOU
ITPO
WMO PO Box 2300
CH6121 Geneva 2
SWITZERLAND
Phone : 41-22-7308-430
Telemail (OMNET) : INTL.TOGA

Mr. Nikolai MIKHAILOV
Chief of Laboratory, National
Oceanographic Data Centre
All-Union Research Institute of
Hydrometeorological Information
- World Data Centre B
USSR State Committee for Hydrometeorology
6, Korolev Street
Obninsk, Kaluga Region
249020 USSR
Phone : (08439) 259-09
(095) 546-39-50
Fax : 255-22-25
Telex : 412633 INFOR
E. mail IASNET : WDCBI

Mr. Chris NOE
DOC/NOAA/NOS
Office of Ocean Services
9010 Executive Blvd.
Room 923
ROCKVILLE, MD.
20852 USA
Phone : (301) 443-8110
Fax : (301) 443-8208
Telemail : C.NOE

Mr. Iouri OLIOUNINE
IOC Secretariat
UNESCO
7, Place de Fontenoy
75700 Paris
FRANCE
Phone : (331) 45-68-39-63
Telex : 20-44-61 Paris
Fax : (331) 40-56-93-16
Telemail : IOC.SECRETARIAT

Mr. Jean-Paul REBERT
Centre ORSTOM de Brest
IFREMER BP 70
29280 Plouzane
FRANCE
Phone : (33) 98-22-45-13
Fax : (33) 98-22-45-45
Telemail : ORSTOM.BREST

Mr. Ben SEARLE
Australian Oceanographic Data Centre
P.O. Box 1332
North Sydney
NSW 2059
AUSTRALIA
Phone : (2) 925-4230
Fax : (2) 925-4835
Telemail : B.SEARLE

Dr. Yuri SYCHOV
Head
National Oceanographic Data Centre
All-Union Research Institute of
Hydrometeorological Information
- World Data Centre B
USSR State Committee for Hydrometeorology
6, Korolev Street
Obninsk, Kaluga Region
249020 USSR

Mr. Shin TANI
Chief, International Programmes Group
RNODC - 16055
Japan Oceanographic Data Centre (JODC)
5-3-1, Tsukiji, Chuo-ku
Tokyo 104
JAPAN
Phone : 81-3-541-3811 ext 132
Fax : 81-3-545-2885
Telex : 252-2452
Telemail : T.MORI/OMNET

Mr. Bert THOMPSON
WOCE International Project Office
Institute of Oceanographic Sciences
Deacon Laboratory
Wormley, Godalming
SURREY GU8 5UB
UNITED KINGDOM

Dr. J.R. WILSON
Director
Marine Environmental Data Service
Physical and Chemical Science
Fisheries and Oceans Canada
1202-200 Kent Street
Ottawa, Ontario
K1A 0E6 CANADA
Phone : 613-990-0264
Fax : 613-996-9055
Telex : 613-053-4228
Telemail : R.WILSON.MEDS

ANNEX C.1
AUSTRALIAN PROGRESS IN IMPLEMENTING GTSPP

A cooperative effort between the Australian Oceanographic Data Centre (AODC), the Bureau of Meteorology Research Centre and the CSIRO Division of Oceanography has been arranged through a series of meetings and discussions to provide a WOCE Thermal Data Assembly Centre for the Indian Ocean. This venture will provide the basic oceanographic data management, quality control and scientific validation necessary to meet research-quality data requirements for the region, specifically for the World Ocean Circulation Experiment (WOCE). The following structure and plan resulted from a meeting held in Hobart on 6 September 1990 between Rick BAILEY, Gary MEYERS (CSIRO), Neville SMITH (BMRC) and Ben SEARLE (AODC).

INTRODUCTION

The Australian WOCE DAC will assemble and provide scientific quality control for thermal data in the region from 15°E to the dateline and from 26°N to Antarctica. The activity will be maintained by collaboration among AODC, BMRC and CSIRO, following as much as possible the guidelines set by the WOCE International Project Office (1990). Some of the activity of a WOCE DAC is already carried out within these organizations. Additional funding to raise the level of activity has not yet been obtained, so CSIRO and BMRC do not have on-site people to do the work. In order to get started, the participants have agreed to begin in 1991, as outlined below, in a way which will partially fulfill the requirements of a WOCE DAC.

CLIMATOLOGY

It has been agreed that development of a new climatology of upper ocean thermal structure for the region of interest will be required. This is because existing climatologies are inadequate in that they do not include errors for the estimated values and they are biased relative to means of recently collected thermal measurements. The new climatology will be developed at BMRC using all the historical thermal data available at NODC.

REAL-TIME DATA STREAM

The real-time data stream will be the responsibility of BMRC and AODC. Processing will be automated as much as possible in order to economize on manpower. The WOCE IPO has recommended three QC steps. The Australian DAC will initially only undertake two steps.

1. Statistical testing which will generate flags 1, 3 and 4 as defined by the Quality Control Manual of the GTSPP (R. Wilson, personal communication, 1990) with the following rules:

* At each test level, data which pass a 3-sigma test will be given flag 1.

* Data which differ from the mean by 3 to 6 sigma will be given flag 3. Data which differ from the mean by more than 6 sigma will be given flag 4.

* Data which emerge from the statistical test with a flag of 1 or 3 will then be submitted to the Mapping Test.

2. The Mapping Test will determine by an objective method whether each observation is compatible with neighboring data. The mapping will be undertaken using the Oceanic Subsurface Thermal Analysis Scheme, a BMRC/CSIRO system developed by J. BLOMLEY, N. SMITH (BMRC) and G. MEYERS (CSIRO). The Mapping Test will be able to downgrade a profile to a flag of 2, indicating that it was not compatible with neighboring data.

These two levels of tests will be undertaken at least during the first year of operation or until additional funding can be obtained.
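A minimal sketch of the flagging rule just described (the climatological mean and standard deviation would come from the new climatology; here they are simply arguments):

    # Sketch of the sigma-based statistical test described above.
    def statistical_flag(value, clim_mean, clim_sigma):
        dev = abs(value - clim_mean)
        if dev <= 3 * clim_sigma:
            return 1    # passes the 3-sigma test
        if dev <= 6 * clim_sigma:
            return 3    # 3 to 6 sigma from the mean
        return 4        # more than 6 sigma from the mean

    # Only data flagged 1 or 3 proceed to the Mapping Test.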

OCEANIC SUBSURFACE THERMAL ANALYSIS SCHEME

This scheme arose from the need for timely analysis of oceanic data in the Australian region, particularly from the tropical Pacific and Indian Oceans. The method of univariate statistical interpolation is used to interpolate information from observations distributed irregularly in space and time onto a regular space-time grid. The scheme provides measures for quality control of data against climatology.
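The BMRC report cited under References describes the actual scheme. As a rough illustration of the idea only, univariate statistical interpolation computes each grid-point value as a covariance-weighted combination of observation anomalies added back to the background (climatological) field; the Gaussian covariance model and parameter values below are invented for the sketch:

    import numpy as np

    def oi_analysis(grid_pts, obs_pts, obs_anom, length_km=500.0, noise=0.1):
        # Univariate statistical interpolation of observation anomalies
        # (observation minus climatology) onto grid points. Points are
        # given as (n, 2) arrays of coordinates in km.
        def cov(p, q):
            d2 = ((p[:, None, :] - q[None, :, :]) ** 2).sum(-1)
            return np.exp(-d2 / (2.0 * length_km ** 2))
        b = cov(obs_pts, obs_pts) + noise * np.eye(len(obs_pts))   # obs-obs
        c = cov(grid_pts, obs_pts)                                 # grid-obs
        return c @ np.linalg.solve(b, obs_anom)   # analysed anomaly per grid point

    # Adding the analysed anomaly to climatology gives the gridded field.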

DELAYED MODE STREAM

The delayed mode stream will be primarily the responsibility of CSIRO and AODC. A large portion of the data in the Australian centre's region of interest is generated by CSIRO's TOGA VOS network, which will be given the statistical test and profile review as soon as possible after the data are collected. These data will then be submitted to the central WOCE thermal assembly centre and merged with other data for the region. The completed data set will be sent to Australia annually, where the statistical and mapping tests will be run on the whole data set and profile review given to the non-CSIRO data. This work will be carried out by AODC personnel under the supervision of CSIRO scientists and with the assistance of BMRC for mapping. While this is not an ideal arrangement, given the great distances between Sydney (AODC), Melbourne (BMRC) and Hobart (CSIRO), it is the only one feasible with present funding and is viewed as an interim measure.

Ultimately, we are aiming for high level scientific appraisal of the delayed mode stream. In the meantime, the following steps will be taken to build up oceanographic expertise at AODC and enhance supervision by oceanographers. AODC personnel will have an extended training period in Hobart to become familiar with the QC procedures, in particular the third step of the procedure recommended by the IPO, profile review. During this period, they will also set up the CSIRO QC software system for transfer to AODC. The delayed mode work load will arrive in a large lot, once per year. While preliminary work will be carried out in Sydney, supervision by CSIRO will be maintained by AODC personnel visiting Hobart during this part of the year and bringing material for discussion with CSIRO staff. A crucial part of this scheme is working out a mechanism which will allow CSIRO to guarantee the research quality of the data.

AODC'S ROLE

AODC will play a role in this system by providing two personnel to undertake the initial levels of quality control checking and the administrative aspects required for data management. AODC will log the data and keep track of how they are handled, including reference to quality control flags. An audit system will be put in place to assist this process.

BMRC AND CSIRO'S ROLE

Both BMRC and CSIRO will provide the essential scientific input into this system. Their expertise will provide the necessary level of confidence to the data from both real-time and delayed mode sources.

AREAS FOR FURTHER DISCUSSION

What should be done about the variation of quality-flags with depth if one overall flag is to be assigned to the profile?

What should be done with a profile that receives a 3 in the statistical test but passes the mapping test?

Several solutions were discussed; however, the problem seems to need further discussion with the DAC scientists for the other oceans.

SUMMARY

Australia can make a major contribution to WOCE by assisting with the quality control and management of data from our region of interest. Both CSIRO and BMRC have recognized expertise in their fields and, together with the data management experience and developing capabilities of the AODC, will be able to provide a well-balanced approach to this problem.

REFERENCES:

WOCE International Project Office (1990): WOCE VOS/DMC Ocean Thermal Data Assembly Centre Meeting. WOCE-IPO, Wormley, 27 pp.

BLOMLEY, J.E., N.R. SMITH and G. MEYERS (1989): An Oceanic Subsurface Thermal Analysis Scheme. BMRC Research Report No. 18.

ANNEX C.2
MEDS PROGRESS IN IMPLEMENTING GTSPP

During the interval since the Second ad-hoc Meeting on the GTSPP, held at MEDS in July 1989, MEDS has worked on several items related to the development of the systems that will be used for management of the global real-time BATHY and TESAC data set, and on documents related to the operation and structure of GTSPP.

The GTSPP QC Manual has been updated to incorporate comments from the previous GTSPP Meeting and from various reviewers in WOCE and TOGA. In particular, the diagrams that detail the operation of the tests have been added. MEDS also prepared diagrams for the GTSPP Project Plan and incorporated the diagrams into the text prepared by Ben SEARLE of AODC. MEDS also has prepared and submitted the current version of the Implementation Plan for consideration by the meeting.

In regard to systems development, a proposed format for use in the archival of the GTSPP data and for exchange between centres was developed, tested and circulated for comment. Further work included the development and testing of a duplicates management system for use as a QC tool for identification of duplicates, and as a fundamental piece of software for data flow monitoring and operation of the Continuously Managed Database (CMD). A description of the duplicates management algorithm and the CMD strategy has been prepared for consideration by the meeting.

ANNEX C.3
TSDC PROGRESS IN IMPLEMENTING GTSPP

Since the meeting held in Ottawa, the major task undertaken by the TSDC has been to analyse, design and test, on a breadboard model, the feasibility of a global upper layer data management model for WOCE. This new system will be operated on a workstation (SUN) using ORACLE as the DBMS. The TSDC programmer has been trained on this system, which proved to work efficiently if correctly sized. When implemented, this system will replace the one designed for TOGA data set management and will include the TOGA database.

A new data dictionary has been prepared, taking into account:

- the experience gained during the TOGA period concerning the omissions and deficiencies of the present format.

- the amount of new information to be transferred into the database, in particular information relating to the activity of the WOCE data analysis Centres in terms of data qualification.

- the necessary compatibility with the existing formats used by the other participants in the GTSPP, mainly the MEDS format, to guarantee a simple data exchange policy.

- the existence of the GF-3 format data exchange tables, which have been adopted to the largest possible extent.

The current NODC UBT format, the TSDC format and the MEDS provisional format (April version) have been loaded into a database for compatibility checking. The final format will be adopted after this meeting.

Meanwhile the TSDC continued its activities of data collection and exchange for TOGA and has documented its procedures of quality and duplicate controls in a report distributed in January at the GTSPP Workshop in New York.

The TSDC catalogs and inventories have been implemented on the OMNET mail service where they can be consulted through an access menu and are updated monthly.

The transfer of the real-time Atlantic data set to the experimental numerical model run at the LODYC laboratory in Paris has been operating on a monthly basis since January 1990. The feedback in terms of data qualification by this model has, however, not yet been implemented.

The historical 1979-1985 BNDO French cruise data set has been sent to NODC for inclusion in the historical data file.

ANNEX C.4
US PROGRESS IN IMPLEMENTING GTSPP

The U.S. National Oceanographic Data Center (NODC) is moving ahead in several areas to be prepared for full GTSPP implementation. In this pre-GTSPP environment, procedures and systems were set up to acquire near real-time BATHY messages and delayed mode data, create ocean basin files, and deliver data to Regional Science Centres once a month, beginning in April 1990. Data volumes for 1990 are outlined below.

Basin       April     May    June    July  August
Atlantic      708   2,086   1,798   1,902   1,766
Indian        478     507     257     138     312
Pacific     2,809   3,409   2,746   2,662   3,366

The flow of data to MEDS was started in June. Both BATHY and TESAC data flow from the U.S. National Meteorological Center (NMC) and the Navy Fleet Numerical Oceanography Center (FNOC) through NODC to MEDS in BUFR format.

Data formats for managing GTSPP data at NODC have been evaluated, and the MEDS format was chosen. This format meets all GTSPP requirements. In addition, NODC use of the MEDS format makes it possible to freely transfer data from MEDS to NODC with no data format transformations needed.

The Continuously Managed Database (CMD) is being developed at NODC in two stages. In the first stage, a modified TOGA data management system, using the MEDS format, will be used to manage data and remove duplicate records. In the second stage this will be managed with a database machine.

US scientists Warren White and Bob Molinari are working on a common set of QC procedures for scientific quality control to be applied to GTSPP data.

ANNEX C.5
USSR PROGRESS IN IMPLEMENTING GTSPP

A project to compile a global deep-sea data set for the world ocean (GLOBAL) is being implemented at the VNIIGMI-WDC Oceanographic Data Centre, USSR, in 1989-1990.

1. Data Sources Used

The set GLOBAL contains the following deep-sea observation types:

Hydrological (Nansen bottle or similar instruments): temperature, salinity, pH, O2, alkalinity, nitrites, nitrates and other hydrochemical characteristics;

Bathythermograph (MBT, XBT) - temperature;

Sounding sets (of CTD and STD type) - temperature and salinity.

Data sources:

BATHY and TESAC messages, via GTS;

Delayed data from the USSR r/v's, in the form of tables or on magnetic tape;

Historical data in the form of tables or on magnetic tape:

a. Data for the USSR which have been submitted to VNIIGMI-WDC;

b. Foreign data made available to VNIIGMI-WDC;

c. WDC-B1 data;

d. Data of other centres in the USSR.

The total historical data exchange as of 1990 amounts to about 2 million soundings, of which 0.7 to 0.8 million are BT soundings.

2. Data Set Composition and Structure

GLOBAL is based on cruise observations, which means that data from the above sources are first recorded on magnetic tapes organized by cruise. In essence, the set GLOBAL is a derived one, computed as a suite of sets produced in sequence:

Level I (hydrological and hydrochemical data arranged by Marsden squares);

Level II (temperature and salinity data for standard depths);

Level III (climatic characteristics of temperature and salinity in Marsden squares);

Level IV (climatic characteristics of temperature and salinity in grid point form).

For the Level I and Level II data sets, respective metadata sets are to be created.

At present, the preparation of GLOBAL Level I data with minimum quality control (mainly syntactic, owing to the great number of input data formats) is being finalized.

3. GLOBAL Data Recording Format

For recording GLOBAL I data, a format is used based on the Hydrometeorological Data Description Language (developed at VNIIGMI-WDC, USSR, in 1977), with the following characteristics:

a. Data organization and data types - Physical records of variable length are used (one sounding = one record). The data are binary coded and justified to the byte boundary. The records are ordered according to key parameters: region, number of the 10-, 5- or 1-degree Marsden square, and year, month, day and time of observation;

b. Record structure - The record header holds the spatial and temporal characteristics and general information (instrument code, data source and so on). The variable portion of the record consists of recurrent observation aggregates (cycles) for successive depths, with references to the observation parameter list. Each aggregate comprises: the aggregate length in bytes, the observation depth level, the water temperature value and its quality characteristic; then the parameter code (KP1) according to the list, the parameter value (P1) and its quality characteristic (Q1), then KP2, P2, Q2, etc.
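As a rough illustration of this record layout (names invented for the sketch; the actual format is byte-packed binary, not language-level objects):

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class ParameterAggregate:      # one (code, value, quality) triplet
        code: int                  # KPi, from the parameter list
        value: float               # Pi
        quality: int               # Qi, the quality characteristic

    @dataclass
    class DepthCycle:              # recurrent aggregate for one depth level
        length_bytes: int
        depth: float
        temperature: float
        temperature_quality: int
        extras: List[ParameterAggregate] = field(default_factory=list)

    @dataclass
    class SoundingRecord:          # one sounding = one variable-length record
        region: int
        marsden_square: int        # 10-, 5- or 1-degree square number
        year: int
        month: int
        day: int
        time: int
        instrument_code: int
        data_source: int
        cycles: List[DepthCycle] = field(default_factory=list)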

A record format based on BUFR - i.e., a BUFR subset with a fixed selection of parameters (without Section 3, Data Description) - is being developed for the sets GLOBAL II and III.

4. Data Quality Control

A set of quality control subroutines, based on the algorithms contained in the GTSPP Data Quality Control Manual (11 November 1989 version), is being finalized. It covers the following tests:

1.1. Impossible time
1.2. Impossible coordinates
1.3. Land - sea
1.4. Impossible depth
1.5. Instrument code
2.1. Impossible parameter values
2.2. Profile envelope
2.3. 10-degree square profile envelope
2.4. Regional profile envelope
2.5. Increasing depth
2.6. Constant profile
2.7. Spike (top, within the profile, bottom)
2.8. Kink
2.9. Density Inversion

The subroutines were developed in FORTRAN-77 in the form of a program package invoked by the CALL statement. The package does not provide an interactive mode in which an operator performs data control; instead, the control results are accumulated in error sets, which are the output parameters of the programs.

The major package functions are:

Data control process management through the vector (set) containing the ordinal numbers of control procedures to be carried out;

Detecting and eliminating duplicate oceanographic stations;

Data control proper;

Obtaining statistics of the quality control results and visualization of control results.

Duplication checks are made with special algorithms (see Annex G.4). The quality flags resulting from the control are similar to those used in GTSPP.
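As an illustration of two of the tests listed above (2.5 increasing depth and 2.7 spike), the sketch below follows the package convention of returning error sets rather than interacting with an operator. It is written in Python rather than FORTRAN-77, and the spike threshold is an invented value; the operational thresholds are those defined in the QC Manual.

    # Test 2.5: depths must be strictly increasing down the profile.
    # Returns the indices at which the check fails (the "error set").
    def check_increasing_depth(depths):
        return [i for i in range(1, len(depths)) if depths[i] <= depths[i - 1]]

    # Test 2.7: flag interior points that stand out from both neighbours.
    def check_spikes(values, threshold=2.0):
        errors = []
        for i in range(1, len(values) - 1):
            v1, v2, v3 = values[i - 1], values[i], values[i + 1]
            if abs(v2 - (v1 + v3) / 2.0) - abs((v1 - v3) / 2.0) > threshold:
                errors.append(i)
        return errors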

5. Data Base Management System

To computerize the maintenance of GLOBAL in the form of a database, as well as its management and use in solving applied tasks, a system (ODPS) is being developed on a relational DBMS.

The database consists of a set of tables connected with one another through common fields. Sixteen table types, containing metadata and observation data proper or their processing results, are to be maintained. The tables are oriented to solving certain tasks:

Computerized monitoring of database composition and contents, and extraction of database fragments;

Applied statistical processing of data to obtain climatic characteristics.

6. Statistical Analysis of Water Temperature and Salinity (methods and software)

Methods for obtaining point estimates of sea water temperature and salinity climatic characteristics on seasonal and interannual scales have been developed for statistical processing of data.

Long-term observation data, divided into squares of a set size, are used. The statistical analysis technique is based on the theory of periodic non-stationary random processes, as applied to the study of natural processes characterized by a certain rhythm. Because the sample sizes are small (usually 8-10 values), robust estimation is used. The methods are implemented as a program package (invoked via CALL) in FORTRAN-IV; translation of the package into FORTRAN-77 is planned in the near future. These programs and methods form the basis of the ODPS data processing subsystem.
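The report does not specify which robust estimators the package uses. As a simple illustration of the idea, the median and the median absolute deviation give location and scale estimates that remain stable even if one or two of the 8-10 values in a sample are outliers:

    import statistics

    def robust_estimate(values):
        # Median as a robust estimate of location.
        centre = statistics.median(values)
        # Median absolute deviation, scaled to be consistent with the
        # standard deviation for normally distributed data.
        mad = statistics.median([abs(v - centre) for v in values])
        return centre, 1.4826 * mad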

ANNEX D
THE GTSPP IMPLEMENTATION PLAN

INTRODUCTION

The implementation of GTSPP can be described in terms of the following four general tasks.

1. Develop improved real-time data capture to prevent loss of data; provide an awareness of the existence of data to assist with later acquisition of the data; and provide more data to operational programmes and managers of research programmes.

2. Implement documented and uniform QC for data being archived in the CMD, including a rational approach to the management of duplicates and the inclusion of the necessary metadata in the database and associated files.

3. Identify sources of delayed mode and historical temperature and salinity data; acquire and digitize them where necessary; and QC and input them to the CMD.

4. Develop, prepare and distribute data, data products, and data flow monitoring products to meet the needs of users at all time scales.

OVERVIEW OF PRIORITIES

A major cause for concern is the possible loss of temperature and salinity data because it is used for its primary purpose and is not archived or submitted to the IODE systems.

This is primarily a problem for operational data. Data submitted through IGOSS may not reach the RNODCs for IGOSS because of routing failures on the GTS. Therefore, the establishment of data flow monitoring and reporting for the real-time data flow is of highest priority to prevent data losses.

The second highest priority is to implement an agreed, extensive and uniform system of quality control, including duplicates management for data being input to the GTSPP temperature and salinity database. Included in this QC is the use of the data by competent scientific organizations to prepare analysed products, thus identifying the more subtle errors that cannot be identified by the data centre QC.

Once these two tasks are in hand and some simple products and data retrievals are in place, GTSPP can begin to serve clients in a useful manner. The first such clients include the WOCE and WCRP in general and the WOCE Upper Ocean Thermal (UOT) programme in particular. This implementation plan has been designed to begin to deliver services to the WOCE UOT beginning in January 1991 as agreed at the WOCE UOT meeting, Tallahassee, FL, USA, 15-16 February, 1990.

The implementation plan for GTSPP has been developed in terms of the eight elements of GTSPP in the following sections. Target dates are included. In general terms, the first two tasks described in the previous sections are moderately well developed and will be implemented to a great extent by January 1991. It now appears that scientific quality control for all areas for temperature and salinity will not be in place by that time and will have to follow later. It is assumed that once this element is implemented, all data back to January 1, 1990, the beginning of WOCE, will be processed through scientific QC.

IMPLEMENTATION OF THE ELEMENTS OF GTSPP

1. Near Real-Time (RT) Data Acquisition

- MEDS to implement processing of Canadian GTS, US NWS, and FNOC real-time data streams received from the US RNODC for IGOSS with QC according to the GTSPP QC Manual, and transfer of data on a weekly basis to NODC for inclusion in the CMD and for distribution to the Scientific QC Centres.

Target Date: January 1, 1991
Lead: MEDS

- The USSR RNODC for IGOSS will try to implement transfer of data to and from MEDS on a weekly basis by electronic communications in the manner of the data exchanges with the US RNODC for IGOSS. It will be necessary to investigate methods of implementing and funding the communications.

Target Date: January 1, 1992
Lead: USSR, MEDS, IOC Secretariat

- The Chairman of the GTSPP Steering Committee will explore with the Director of the RNODC for IGOSS (Japan), the possibility and means of receiving data from that Centre on a regular basis to augment and enhance the real time data set.

Target Date: November 30, 1991
Lead: Chairman, GTSPP Steering Committee

- MEDS and NODC to conduct a test of the data flow, formats, and QC system with the WOCE Scientific QC Centres and the TSDC during the month of December 1990 using the November 1990 dataset.

Target Date: December 1990
Lead: MEDS, NODC

- NODC to implement processing and distribution of monthly real-time BATHY and TESAC data files to the WOCE Scientific QC Centres (AOML, SIO, CSIRO), and the return of flags to the database.

Target Date: January 1, 1991
Lead: US NODC

- As a one-time special project, MEDS, USSR, France, Australia and USA to compile best real time data sets back to January 1, 1990, and produce comprehensive data flow monitoring reports.

Target Date: End 1991.
Lead: MEDS

- As a one-time, special project, NODC is to update the 1990 near real-time and delayed mode science QC data.

Target Date: End 1991.
Lead: US NODC

2. Delayed Mode Data Acquisition

- Participating Member States to contact countries in their region, as agreed at the First ad-hoc Meeting of the GTSPP, and encourage and arrange more complete and rapid submission of delayed mode data to the GTSPP or to the IODE system.

Target Date: March 1, 1991
Lead: All Participants

3. Communications Infrastructure

- US RNODC for IGOSS to establish links to provide the NWS and FNOC data streams to MEDS for implementation of the daily BATHY and TESAC data acquisition and QC.

Target Date: Complete
Lead: NODC

- MEDS to complete establishment of an efficient link to the US SPAN network for the transfer of real-time BATHY and TESAC data to and from the US RNODC for IGOSS.

Target Date: October 31, 1990
Lead: MEDS

- AODC to specify means of exchange of the real-time BATHY and TESAC data with other GTSPP participants and the WOCE Scientific QC Centre in CSIRO.

Target Date: October 31, 1990
Lead: AODC

- France and USSR to specify communications links that are available for data exchange with other participants in GTSPP and requirements for data exchange on each. (May be exchanged by tape or floppy disks if appropriate or necessary.)

Target Date: October 31, 1990
Lead: France and USSR

4. Quality Control

- GTSPP to complete development and review of QC Manual for Real-Time Data and submit to the IOC for publication.

Target Date: October 31, 1990
Lead: MEDS

- MEDS and NODC to implement the tests and flagging of data as described in the GTSPP QC Manual in the processing and archival of the temperature and salinity data in the CMD.

Target Date: April 1, 1991
Lead: MEDS and US NODC

5. Continuously Managed Database

- GTSPP to complete review of the initial strategy and algorithms for the operation of the continuously managed database.

Target Date: September 30, 1990
Lead: GTSPP III Meeting, Brest, France, 17-19 September 1990.

- MEDS and US NODC to implement the strategy and algorithms for the operation of the continuously managed database, evaluate its adequacy during operation of the project, and report and submit recommendations to the next meeting of the GTSPP Steering Committee.

Target Date: Implementation January 1, 1991, next GTSPP Steering Committee meeting for balance.
Lead: MEDS and US NODC

6. Project Products and Information

- GTSPP to complete development and review of the Project Plan and submit it to IOC for publication.

Target Date: October 31, 1990
Lead: MEDS.

- GTSPP to complete development and review of the Brochure and submit it to IOC for publication.

Target Date: December 31, 1990
Lead: U.S. NODC.

- GTSPP to review and update the Implementation Plan.

Target Date: September 30, 1990
Lead: GTSPP III Meeting, Brest, France, 17-19 September 1990 and subsequent Steering Committee meetings.

- GTSPP to prepare an initial list of GTSPP products to be prepared and circulated during the period from January 1, 1991 until a new list is available.

Target Date: September 30, 1990
Lead: GTSPP III Meeting, Brest, France, 17-19 September 1990.

- Review previous GTSPP meeting reports, and known activities of participants and others, and prepare a plan, schedule, and responsibilities for the development and preparation of further GTSPP products and information.

Target Date: Next GTSPP Steering Committee Meeting
Lead: U.S. NODC.

7. Data Flow Monitoring

- MEDS to complete data flow monitoring study for the September, October, and November 1989 data already submitted by France, the USA, and the USSR augmented by data to be requested from the RNODC for IGOSS (Japan), the AODC, and the RNODC-SOC (Argentina).

Target Date: December 31, 1990
Lead: MEDS

- MEDS to implement monthly analysis and reporting of all real-time data flow comparisons between Canadian GTS, NWS and FNOC streams and others as they become available. MEDS is to continue this function for as long as it is needed.

Target Date: January 1, 1991
Lead: MEDS

8. Historical Data Acquisition and Processing

- USSR, ICES, and USA NODCs to work together to prepare a strategy and plan for the location, acquisition, QC, and loading of historical temperature and salinity data, including suggested centres to undertake the various tasks, with the acquisition of long-term time series data as a priority.

Target Date: Available for circulation and review before next GTSPP meeting.
Lead: USSR NODC, ICES, USA NODC

- GTSPP participants should study the USSR document regarding time series stations and provide comments and suggestions to the USSR.

Target Date: October 31, 1990
Lead: All participants

- USSR to compile comments and prepare a new time series document recommending 100 time series stations and sections for consideration at the next GTSPP meeting.

Target Date: Next GTSPP meeting
Lead: USSR

9. Other

- GTSPP Centres and the WOCE Scientific QC Centres to consult and agree on the formats to be used for the first year of the project for the exchange of temperature and salinity data pending further study of format issues.

Target Date: October 31, 1990
Lead: MEDS, NODC, TSDC in consultation with all other centres.

- USSR to prepare a draft of a GF-3 Subset for GTSPP data that will provide for the content of the MEDS format and submit it to the GE-TADE Meeting in November 1990.

Target Date: GE-TADE Meeting, November 1990
Lead: USSR NODC

- MEDS to submit data monitoring reports to interested South American centres to indicate to them the data which are available through the IGOSS system.

Target Date: January 31, 1991.
Lead: MEDS

- GTSPP participants to prepare and forward to MEDS, information on software systems and algorithms that can be made available to GTSPP, IODE, and IGOSS centres. MEDS to circulate the information to all interested centres.

Target Date: November 15, 1990
Lead: All participants

ANNEX E
LIST OF DOCUMENTS

Report of an Ad-Hoc Consultative Meeting on the Global Temperature Salinity Pilot Project (A Proposed IGOSS-IODE Program) (Washington) - January 1989

Summary Report - Second Ad-Hoc Consultative Meeting on the Global Temperature-Salinity Pilot Project (A Proposed Cooperative IGOSS-IODE Project) (Canada) - July 1989

SUBMITTED BY AUSTRALIA

GTSPP The Global Temperature-Salinity Pilot Project Plan - August 1990

SUBMITTED BY TOGA

Duplicates Control at the TOGA Subsurface Data Centre - September 1990

SUBMITTED BY NODC

The Continuously Managed Database (CMD) at NODC - September 1990

GTSPP Brochure

SUBMITTED BY USSR

List of Oceanic Stations for Creating Data base "GTSPP TIME SERIES"

On The Question of Data Exchange Organization Through Telecommunications

On Incorporating Historical Data in the Baseline Dataset

First Session of Steering Group on GTSPP - GTSPP Project Implementation Plan (Section "Observation Time Series") - September 1990

The First Session of Steering Group on the Global Temperature-Salinity Pilot Project (GTSPP) - Comments on the Agenda Items - September 1990

SUBMITTED BY WOCE

WOCE VOS/DMC Upper Ocean Thermal Data Assembly Centre Meeting (Tallahassee) - February 1990

Upper Ocean Thermal DAC - September 1990

SUBMITTED BY MEDS

MEDS Sequential and Indexed Formats for Ocean Data - August 1990

Status of BUFR Processing in Canada - September 1990

The MEDS Continuously Managed Database (CMD)

GTSPP Implementation Plan - September 1990

Monitoring of GTSPP Real Time Data Flow - September 1990

Report on MEDS Activities in Regard to GTSPP Since Last Meeting

GTSPP QC Manual (Report) - September 1990

Draft Work Plan - September 1990

Quality Control Manual - September 1990

Draft Provisional Annotated Agenda - September 1990

Provisional List of Participants - September 1990

ANNEX F
MONITORING OF GTSPP DATA FLOW

1. BACKGROUND

At the Second ad-hoc Meeting on the GTSPP held in Ottawa in July 1989 it was decided that a study be conducted on the real time global temperature-salinity data flow. This study was to be carried out by having France, the USSR, the USA and Australia prepare magnetic tapes of the data they had received in real time via IGOSS, or any other means, and submit the data to the Marine Environmental Data Service in Canada. MEDS was then to add the Canadian data and produce a report on the total available data set for each month, with various statistics on what had been received in each country.

MEDS had originally intended to carry out an ad-hoc project and complete the study in two or three months. However, it soon became evident that the comparisons and data flow studies could best be effected by using the duplicates algorithms being developed as part of the GTSPP quality control procedures, which were also intended for use in operating the continuously managed database in the US NODC. For this reason, the study was delayed until a workable duplicates identification algorithm was available to MEDS.

2. STATUS OF MONITORING PROJECT

The status of the monitoring project is as follows:

- MEDS has received the data for the monitoring period from the USSR, France, and the USA as requested.

- Software has been developed and tested and the Canadian, French, and USSR data have been converted to the common format to be used by the monitoring software.

- The duplicates algorithm has been developed and tested in MEDS for use in the quality control of the real time data and in operating the MEDS real time database.

- Software for processing the file produced by the duplicates checking algorithm and preparing the monitoring report has been designed, coded, and is being tested. This software will produce the report described in section 3 below.

The work to be carried out to complete this monitoring study is described in section 4.

3. FORMAT AND CONTENT OF THE MONITORING REPORT

The monitoring report as it is now perceived is composed of three sections.

The first section consists of counts of BATHY and TESAC reports received each month under each GTS header as indicated below.

GTS Header     # BATHYs   # TESACs
SOVD01 KWBC    155        35
SOVD10 RUMS    162        43
etc.

This report allows the centres to identify any headers they are not receiving, as well as to verify that they are getting all the data that are reported under each header.

BATHY and TESAC reports are generally routed on the GTS according to the content of the header. Lack of knowledge of the headers under which the reports are being transmitted at any time seems to be the single largest cause of loss of data along the GTS. This report should enable the centres to know what GTS headers are not being passed along at the relevant hubs on the GTS.

The second section consists of counts and percentages of the total available BATHY and TESAC reports that are received at each participating GTSPP centre. The main feature of the procedures developed and described here for monitoring the GTSPP real time data flows is the identification of the total number of unique reports available in the system. Previous counts were based on simple counts of BATHYs and TESACs; two centres could have received the same number of BATHYs and TESACs, yet these could be two totally different sets of reports. This second report is designed to report in terms of unique reports, so that a centre knows accurately its success in acquiring the total available set of observations.

The second report is designed as follows:

STREAM_IDENT   No. BATHYs   % of Total   No. TESACs   % of Total
URBA           1500         96%          0            0
URTE           350          75%          0            0
FRBA           1550         98%          0            0
etc.



The STREAM_IDENTs refer to the centre (UR = USSR, FR = France) and the type of report (BA = BATHY, TE = TESAC).
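
As a rough illustration of how the unique-report percentages in this section could be derived, the sketch below (in Python, purely illustrative; the operational monitoring software is not reproduced in this report, and the group structure and stream names are assumptions for the example) counts, for each stream, how many of the unique message groups contain at least one copy from that stream.

    # Illustrative sketch, not the MEDS implementation: given groups of
    # duplicate messages (one group per unique report), count how many
    # unique reports each stream received.
    from collections import Counter

    def stream_coverage(unique_groups):
        """unique_groups: list of lists of STREAM_IDENTs, one list per unique report."""
        total = len(unique_groups)
        counts = Counter()
        for group in unique_groups:
            for stream in set(group):      # count a stream once per unique report
                counts[stream] += 1
        return {s: (n, 100.0 * n / total) for s, n in counts.items()}

    # Example: three unique BATHY reports; the USSR stream received all three.
    groups = [["URBA", "MDBA"], ["URBA"], ["URBA", "FRBA", "MDBA"]]
    print(stream_coverage(groups))   # e.g. URBA -> (3, 100.0)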

The third section, the long report, consists of several to many lines of text for each unique group of messages received, and is therefore very long. The purpose of the long report is to allow a reader to look at the groups of messages identified as duplicates and to understand in detail the workings of the duplicates algorithm on which the monitoring report is based. A sample of this report is given below. Please note that the sample reports given here are not based on real data but only illustrate the format and content of the reports to be prepared at the conclusion of the study.


******************** UNIQUE MESSAGE GROUP 1

GTS Header    Call Sign   Date/Time         Latitude   Longitude   Message Stream

SOVD1 RUMS    CGBV        1990/09/06/0615   45.69      136.58      URBA
SOVD1 RUMS    CGBV        1990/09/06/0615   45.69      136.58      MDBA

etc.

AUBA        AUTE        MDBA        MDTE        FRBA        FRTE        URBA        URTE        USBA        USTE
SOVD1 RUMS
SOVD1 RUMS
                        SOVD1 RUMS
                                                SOVD1 RUMS
                                                                        SOVD1 RUMS

The first series of lines in this report show the messages that were identified as a unique group of duplicates. By inspecting the identification and position information, one can determine why the messages were identified as duplicates.

The second set of lines in the report show who received the report and under which GTS header. In the example Australia, Canada, France, and the USSR received the report and the US did not. Australia, in fact, received the report twice.

This report is very useful in giving the user an accurate picture of GTS data flow and the root of problems which are causing loss of data. As noted, however, the report will be very long (of the order of 20000 lines per month of data). It is suggested that for the study it can be produced in hard copy (100 pages double sided, small print). If it is determined it is desirable to continue to produce it, perhaps it can be circulated as ASCII files on tape to be browsed with a computer text editor.

4. WORK TO BE COMPLETED

The following tasks must be accomplished to complete the monitoring study:

- Software must be completed to convert the US and Australian data to the common format to be used by the monitoring software.

- The Australian data must be obtained.

- Testing of the software for producing the monitoring report must be completed.

5. RECOMMENDATIONS

i) It is recommended that the study be completed as described here. Approximately two weeks of software development and testing remain before final processing and production of the report can be undertaken. At that time, MEDS will undertake to provide each participant with a complete set of data, in a format to be agreed on, along with the report. Participants are also welcome to receive copies of any or all software developed in the monitoring project.

ii) It is further recommended that if the results of the monitoring study indicate that there is significant additional data to be gained by combining the real time data sets from the five participants, then this should be done for all months beginning with December 1989 and continuing until such time as the GTS data flow can be improved so that all centres are receiving all the data. This would also be considered as a valuable contribution to the WOCE programme, as it would provide the best available real time data set beginning January 1, 1990.

ANNEX G.1
DATABASE AND DUPLICATES MANAGEMENT AT THE MEDS REAL TIME CENTRE

The growing requirement for the availability of ocean data at all time scales, from real time operations and forecasting to research into long term climate fluctuations, has led to the concept of the continuously managed database. Operational forecasting of ocean conditions makes more and more use of ocean variables. Many national and international oceanographic experiments need a "quick look" facility for the data, to verify instrument performance and adjust data collection plans in the light of early results. Finally, high quality research programmes need the most accurate databases, with the best available spatial and temporal coverage, for researching processes and change in the ocean.

The continuously managed database concept attempts to address all these needs by capturing data, in a documented and standardized form, as soon as they become available, and then allowing for the replacement of early, less accurate and reliable versions by later, fully processed and quality controlled versions.

As a result of these requirements MEDS archives are being constructed on this model. As data are received and processed they are put into the archive so that they will be available as soon as possible to the various users. As more carefully scrutinized data or higher resolution data are received, they will replace the earlier copies. For example, temperature profile data from an XBT may first be received in a BATHY message. These data are quality controlled and archived as soon as possible. Then, when the full XBT data are received, perhaps two years later, they "replace" the BATHY data in the archive. In practice, MEDS has adopted the policy that incoming data duplicating versions of lower quality will deactivate the lower quality data but not remove them.

In designing the newest MEDS processing/archival systems considerable effort has gone into the development of algorithms for operation of a CMD. These algorithms and the considerations that led to certain of the methodologies are described here for consideration in designing the GTSPP CMD.

The design of the MEDS CMD for ocean data is based on the following two concepts and the practical requirement of subsection c) below.

a) Each observation in the database and the input stream for new data must have attached to it a field which describes its origin. This field, which is referred to in the MEDS system as the STREAM_IDENT, is used to identify the source of the observation in terms of such items as the standard associated with its preparation, its transmission to the data centre, its level of quality control, etc.

As an example, an IGOSS TESAC message taken from the GTS by MEDS might have a STREAM_IDENT of MDTE. This would be defined to indicate the message was received and processed by MEDS after being prepared as a TESAC message to the format and standard of IOC/WMO Manuals and Guides No. 3. A delayed mode XBT observation from SIO might have a STREAM_IDENT of SIXB.

b) Associated with the database there must be a replacement priority list which defines which version of an observation replaces which other version in the CMD.

This list consists of a prioritized list of the STREAM_IDENTs that can occur in the database. If the database were only to contain MEDS BATHY and TESAC messages, and SIO delayed mode XBT observations, then the replacement priority list would be

SIXB
MDTE
MDBA

In this case, an SIO XBT delayed mode observation would replace a MEDS BATHY from the GTS. Similarly, if MEDS received both a BATHY and TESAC message for an observation, the TESAC would be chosen as the version to be kept in the database.

c) There must be a scheme to identify, in the databases and in the input stream, the occurrences of observations that must replace observations of a lower replacement priority. This not only involves finding that the input stream contains an observation that should "replace" one already in the database, but also finding that a "higher priority" observation is already in the database, so that the one in the input should not "replace" it.

This "finding and replacing" function of the system must be highly automated. Significant manual intervention in this operation would demand more person-time than most data centres could accept.

In general, the archiving strategy is to preserve all versions of an observation with different STREAM_IDENTs and to flag duplications. On the other hand, only one copy of the same data with the same STREAM_IDENT is kept. Thus, temperature and salinity profiles received in a TESAC and the same data received in delayed mode as a CTD station are both preserved, with the TESAC profiles being flagged as an inactive duplicate. There are several reasons for doing this. For example, the "higher quality" observations may in fact prove to be wrong and one may wish to re-activate the "lower quality" version. A second reason concerns being able to compute statistics on receipts of TESAC data. If the TESAC versions of the data were discarded when the fully processed CTD observations enter the database, then one could not, for example, count TESACs.
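
As a minimal sketch of how the replacement priority list in b) and the keep-and-flag strategy just described might interact, the fragment below (Python, illustrative only; the MEDS systems are FORTRAN, and the field names here are assumptions) keeps every version of an observation but marks as active only the one whose STREAM_IDENT ranks highest.

    # Minimal sketch, not the MEDS implementation: keep all versions of an
    # observation, activating only the one with the best replacement priority.
    REPLACEMENT_PRIORITY = ["SIXB", "MDTE", "MDBA"]   # highest priority first

    def activate_best(versions):
        """versions: list of dicts with a 'stream_ident' field; flags them in place."""
        rank = {s: i for i, s in enumerate(REPLACEMENT_PRIORITY)}
        best = min(versions, key=lambda v: rank[v["stream_ident"]])
        for v in versions:
            v["active"] = v is best       # losers stay archived, flagged inactive
        return versions

    obs = [{"stream_ident": "MDBA"}, {"stream_ident": "SIXB"}]
    print(activate_best(obs))   # the SIXB (delayed mode XBT) version becomes active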

Details of the Data Handling

This section describes the functions of the modules required within the processing system.

Figure 1 is a schematic diagram that documents the processing of data in the MEDS CMD system. The following paragraphs describe the various modules. Note that these modules are now under development and may change as development proceeds.

Stage 1 Quality Control Module (Step 1)

This software will apply the mandatory identification/position/date-time tests as outlined in the GTSPP QC Manual. These tests are designed to discover and resolve problems with the station identifiers, position and time information only. Such things as simple range checks and time/distance checks are included here, as are checks for positions on land and agreement between observed soundings and bathymetry files.

These tests are carried out at this stage because the duplicates checking algorithm uses the identification, position and date-time fields.

Duplicates Checking Preparation Module (Step 2 and 3)

This software scans the incoming data file to determine the time range of the data in question. It then extracts the data for that period from the appropriate archives and sort/merges the data from the databases into the input data file being processed.

Incoming data should be partitioned according to time spans so that the database retrieval does not qualify the whole database. A suggested criterion is to split the incoming file into separate files such that there is not a time gap of more than 6 months between stations.

The prescan of the file is essentially an examination of the date span covered by the data in the new input file. This information is passed to a retrieval program which retrieves the data for the date range from all the MEDS ocean ISAM databases. This retrieved data is merged with the input file to permit the duplicates checking program to "know" what is in the database already. There is a field in the processing format used at this stage to indicate whether the data being processed through the duplicates checking program came from the new input stream or from the database itself.
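
The gap-splitting criterion lends itself to a compact illustration. The sketch below (Python, purely illustrative; the operational code is not reproduced in this report) partitions a sorted batch of station dates at any gap exceeding six months, approximated here as 183 days.

    # Minimal sketch: prescan a time-sorted input batch and split it wherever
    # consecutive stations are more than six months apart, so the database
    # retrieval for duplicates checking never qualifies the whole archive.
    from datetime import date, timedelta

    SIX_MONTHS = timedelta(days=183)   # assumed approximation of "6 months"

    def partition_by_gap(station_dates):
        """Split a sorted list of observation dates at gaps over six months."""
        if not station_dates:
            return []
        batches, current = [], [station_dates[0]]
        for d in station_dates[1:]:
            if d - current[-1] > SIX_MONTHS:
                batches.append(current)
                current = []
            current.append(d)
        batches.append(current)
        return batches

    dates = [date(1990, 1, 5), date(1990, 2, 1), date(1990, 11, 20)]
    print(partition_by_gap(dates))   # two batches: January-February and November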

Duplicates Checking Module (Step 4)

The concepts of duplicates identification and a continuously managed database cannot be treated separately. In fact, the automatic identification of duplicates is the basis of finding and replacing earlier, less reliable observations in the database.

The following algorithm is used to identify duplicates in the input stream and between the input stream and the database. When duplicates are found, the STREAM_IDENT and the "Replacement Priority List" are used to determine which copy of the observation is retained in the database, and what actions are taken during the database update.

The identification of duplicates is based on two approaches similar to those adopted in the TOGA Subsurface Centre in Brest. There can be an "identification duplicate" and a "fuzzy area/fuzzy time" duplicate. Within the algorithm it is assumed that position and date/time are correct. It is therefore important that as much of the QC as is feasible be done on the identification, position and date/time fields before duplicates checking is carried out.

The program works through the prepared file, checking whether a given station has a duplicate in time and location within a tolerance of 15 minutes and 5 nautical miles. If a duplicate is not found, the station is written to the output file. If a duplication is found, different actions take place depending on the nature of the duplication. First, a comparison is made of the type of data and a disposition is assigned for the record if they are of different types. If they are of the same type and from the same instrument, the source of the data is used to determine the final disposition. As well, the unique key (cruise number and station number) must be examined to determine whether a record is replaced or removed from the archive.
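
The tolerance test above can be sketched as follows (Python, illustrative only; the station tuple layout and the use of a haversine distance are assumptions, not the MEDS code):

    # Illustrative sketch of the fuzzy duplicate test: two stations are
    # candidate duplicates when they fall within 15 minutes and 5 nautical
    # miles of each other.
    import math
    from datetime import datetime, timedelta

    EARTH_RADIUS_NM = 3440.065   # mean Earth radius in nautical miles

    def distance_nm(lat1, lon1, lat2, lon2):
        """Great-circle (haversine) distance in nautical miles."""
        p1, p2 = math.radians(lat1), math.radians(lat2)
        dp, dl = math.radians(lat2 - lat1), math.radians(lon2 - lon1)
        a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
        return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))

    def is_fuzzy_duplicate(stn_a, stn_b):
        """Each station is (datetime, latitude, longitude)."""
        close_in_time = abs(stn_a[0] - stn_b[0]) <= timedelta(minutes=15)
        close_in_space = distance_nm(stn_a[1], stn_a[2], stn_b[1], stn_b[2]) <= 5.0
        return close_in_time and close_in_space

    a = (datetime(1990, 9, 6, 6, 15), 45.69, 136.58)
    b = (datetime(1990, 9, 6, 6, 25), 45.70, 136.60)
    print(is_fuzzy_duplicate(a, b))   # True: about 1 nm and 10 minutes apart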

The module may be operated in two modes. In the first, manual mode, any duplications that are found are displayed at the user's terminal. The user must then decide whether to accept the computer's decision or enter commands to change it. This mode of operation is acceptable for small files such as the daily BATHY/TESAC data received at MEDS. In automatic mode, the program selects the first occurrence of the highest priority station and flags all of the other stations as duplications. An additional output file is created which can be used along with the Post Duplicate Processor to review the decisions the computer made in automatic mode. This mode of operation is used for dealing with large input files of several tens of thousands of stations, where it is impractical for an operator to sit at a terminal during processing to respond to occasional duplications.

Post Duplicate Processing Module (Step 4)

This software is used to interactively review the results of the duplicate checking software operating in automatic mode. The user must review all of the ambiguous results identified by the duplicates processor and resolve the actions to be taken in each case.

General Quality Control Module (Step 4)

There will be two options for the QC at this point. One option will be an automated QC applied only to data QC'd elsewhere (e.g., SIO, CSIRO). This QC will be for the purpose of identifying such major problems as format mismatches or other such glitches. The scientific QC from the scientific centres will not be second-guessed by the data centres.

The other option for the QC will be the full QC as documented in the GTSPP QC manual and including such man/machine interactions and AI systems as are appropriate and agreed in the QC manual as it is further developed during the course of GTSPP.

All quality control procedures described as Stage 2 or higher in the GTSPP Quality Control Manual are eligible for application at this stage. The suite of tests that is implemented is reflected in the Version Number stored with the observation.

Station Numbering and Key Checking Module (Step 5)

The processing that takes place in this module is primarily housekeeping in nature. It is concerned with putting the data in order by primary key for more efficient update operations against the suite of MEDS ocean databases, and with ensuring that the cruise number/station number key will be unique in the database.

This software operates in different ways depending on the data source. In preparation for this step, the incoming data are ordered by cruise number and then chronologically (for real-time data) or by station number (for delayed mode data). For delayed mode data of all kinds, the software reads the cruise and station numbers in the incoming file and determines whether these duplicate keys already in the appropriate archive. If they do, the duplicate stations are listed, a warning is issued, and processing continues with the next station. For data from real-time sources, the software finds the lowest station number in the correct archive and numbers the incoming stations downward from it, decreasing the station number for each station treated in a cruise. All of the data are passed on to the next processing stage.
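
A minimal sketch of the real-time numbering rule, assuming (as the text implies but does not state) that numbering simply continues downward from the lowest number already present for the cruise:

    # Illustrative sketch only: real-time stations are numbered downward from
    # the lowest station number already in the archive for the cruise, so
    # they never collide with delayed mode keys.
    def number_realtime_stations(existing_numbers, n_new, start_default=0):
        """Return station numbers for n_new incoming real-time stations."""
        lowest = min(existing_numbers) if existing_numbers else start_default
        return [lowest - i for i in range(1, n_new + 1)]

    print(number_realtime_stations([-3, 1, 2], 3))   # [-4, -5, -6]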

Update Software for Profile Data Archives (Step 5)

The data are sorted by one degree square to ready them for the update. The software takes each station in turn in the incoming file and checks the UPDATE_FLAG field for the disposition of the station. If the field is set to "S", no action is taken; that is, the record is skipped. If the field is set to "U", the station is updated into the appropriate archive. If the field is set to "D", the record is removed from the archive. To replace an observation in the archive, there must be two copies of the record in the input file at this stage: one with an update flag of "D" to remove the database copy, and the other with an update flag of "U" to update the replacement copy into the database.
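
A minimal sketch of these dispositions (Python, illustrative; the archive is modelled as a simple keyed dictionary):

    # Illustrative sketch of the UPDATE_FLAG dispositions: "S" skip,
    # "U" update (insert) into the archive, "D" delete from it. A replacement
    # appears as a "D" record followed by a "U" record for the same key.
    def apply_updates(archive, incoming):
        """archive: dict keyed by station key; incoming: list of (key, flag, record)."""
        for key, flag, record in incoming:
            if flag == "S":
                continue                   # skip: no action taken
            elif flag == "D":
                archive.pop(key, None)     # remove the database copy
            elif flag == "U":
                archive[key] = record      # write the (replacement) copy
        return archive

    db = {("CA900123", 7): "BATHY version"}
    updates = [(("CA900123", 7), "D", None), (("CA900123", 7), "U", "XBT version")]
    print(apply_updates(db, updates))   # the XBT version replaces the BATHY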

In no case will a history record be written by the update software. The UPDATE_DATE will always be set to the date of the last change in the record.

ANNEX G.2
DUPLICATE CONTROL AT THE TOGA SUBSURFACE DATA CENTRE

PRE-PROCESSING DUPLICATE CONTROLS

Given the large amount of data exchanged and the terms of commitment of the TOGA Centre (replace the real-time data by the delayed mode data), this represents the most complex part of the controls and the most time-consuming to achieve.

Basically, there is strictly no way to automatically detect all the duplicates (meaning inexact duplicates), since the sources of errors are random and unknown. The aim is therefore to minimize their number. The minimum achievable level of duplication is unknown; the only limit is the maximum acceptable time that can be devoted to this task. We have therefore adopted the following principles:

- The duplicates should be eliminated before entering the database.
- The loading time of a data set must not exceed one night.
- It is better to accept a duplicate than to reject a non-duplicate.
- A delayed mode profile replaces a real-time profile.
- A real-time profile doesn't replace a delayed mode profile.
- A profile does not replace a profile of the same type.
- The procedure is automatic.

Practical rules: To reduce the time spent searching for all profiles which could be duplicates, a first selection of possible duplicates is done on keys. The key contains the ocean abbreviation and the 10-degree latitude/longitude square containing the profile. The profiles contained in this key and the contiguous ones are selected.

The comparison is done on date, time, position and type of profile, not on the data.
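
As an illustration of the key-based preselection (Python, illustrative; the key encoding is assumed, since the report does not give the TOGA layout):

    # Illustrative sketch: a key built from the 10-degree square holding a
    # profile, plus the contiguous squares that must also be searched.
    def square10(lat, lon):
        """Index of the 10-degree latitude/longitude square holding a point."""
        return (int(lat // 10), int(lon // 10))

    def candidate_squares(lat, lon):
        """The profile's square and its eight neighbours."""
        i, j = square10(lat, lon)
        return [(i + di, j + dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)]

    print(square10(45.69, 136.58))                 # (4, 13)
    print(len(candidate_squares(45.69, 136.58)))   # 9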

1. If the profile comes from an NODC data set and is composed of XBT or SBT data:

- Transform the NODC vessel codes into call signs using a cross reference table.

- If the call sign is not found, place the profile in a temporary file where the call sign can be modified. If the call sign is not found after further investigations, it is replaced by a string composed of "NODC" and the NODC code.

- Begin the loading

2. If the call sign is not "SHIP"

If the call signs are identical
If year, month, day are identical
If the difference in time is less than one hour
THEN
Profiles are considered duplicates and an XBT replaces the real-time profile; otherwise the profile is rejected and placed in an auxiliary file.

3. If the call sign is "SHIP"

If year, month, day are identical
If the difference in time is less than one hour
If the difference between latitude and longitude is less than 0.5 degrees
THEN
Profiles are considered as duplicates.
The same rule applies and the call sign replaces "SHIP" if the data base profile was labelled "SHIP".
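
Rules 2 and 3 can be sketched as follows (Python, illustrative; the field names and the decimal-hours representation of time are assumptions):

    # Illustrative sketch of rules 2 and 3: profiles with known call signs
    # match on call sign and a one-hour window; "SHIP" profiles additionally
    # require positions within 0.5 degrees.
    def is_toga_duplicate(p, q):
        """p, q: dicts with call_sign, year, month, day, hours (float), lat, lon."""
        same_day = (p["year"], p["month"], p["day"]) == (q["year"], q["month"], q["day"])
        within_hour = abs(p["hours"] - q["hours"]) < 1.0
        if p["call_sign"] != "SHIP" and p["call_sign"] == q["call_sign"]:
            return same_day and within_hour
        if p["call_sign"] == "SHIP" or q["call_sign"] == "SHIP":
            close = abs(p["lat"] - q["lat"]) < 0.5 and abs(p["lon"] - q["lon"]) < 0.5
            return same_day and within_hour and close
        return False

    rt = {"call_sign": "SHIP", "year": 1990, "month": 9, "day": 6,
          "hours": 6.25, "lat": 45.69, "lon": 136.58}
    dm = dict(rt, call_sign="CGBV", hours=6.5)
    print(is_toga_duplicate(rt, dm))   # True: the XBT would replace the real-time profile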

Main deficiencies of the present procedure:

This procedure eliminates more than 90% of the duplicates but it is inadequate to detect the following discrepancies:

- Differences in call signs

- Large position errors (more than 10 degrees in latitude/longitude), quadrant errors

- Differences in year, month or day; in particular, detection of duplicates is poor for data collected around midnight.

Furthermore, this procedure rejects some non-duplicate data which must then be reloaded without control: particularly XBTs sent in time-sequential form, where frequently the first measurement is bad and is repeated immediately after.

It is therefore safer to load small data sets, so that the auxiliary files containing rejected XBTs are not too large and can be carefully inspected. In any case, during the loading, messages on the causes of rejection are delivered for each rejection.

POST-PROCESSING DUPLICATE CONTROLS

As the pre-processing of duplicates leaves some redundant profiles and unknown call signs in the database, we implemented additional duplicate controls which are run after large data sets have been loaded. This second set of controls must differ from the first, so its principle is based not on indexing but on sorting.

These controls are currently done off-line on a microcomputer in a two-step procedure, after selecting a yearly headers data set and transferring it to a microcomputer DBMS.

1. First step
-sort the data set on call sign and time
-scan the data set
-apply a speed test

If two consecutive profiles for the same vessel differ by:

-less than 20 minutes in time
-less than 3 miles in distance
-or if speed exceeds 25 knots
-and not sent by the same Institution
THEN
-eliminate the real time profile if met by an XBT
-stop and wait for the operator's decision for profiles of the same type.
-put all the deleted headers in a file for transfer to the main frame where the profile will be canceled.

2. Second step
-sort the new header data set on time irrespective of the call sign
-scan with the following rules:
if two consecutive profiles are separated by
-less than 15 minutes
-less than 5 miles in latitude and longitude
THEN
-same as before
-print all duplicating headers

3. Third step

-If repeated duplications occur for two different vessels
THEN
-Put a filter on these vessel names and repeat the operation. If two different vessels have been found to be fully duplicated for some cruises, check and correct the headers in the database and, if needed, the cross reference table NODC code/call sign, and inform the NODC of the decision taken and of possible erroneous identifiers in their data.
-Transfer the deleted headers file to the main frame and cancel all the profiles.

Remark: all the constants used above have been determined by experience, as representing the best compromise between speed, the number of duplicates detected, and the number of non-duplicates erroneously detected.
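
Reading the three conditions of the first step as alternatives, the speed test might be sketched as follows (Python, illustrative only):

    # Illustrative sketch of the first-step speed test: consecutive profiles
    # from the same vessel are suspect when they are closer than 20 minutes
    # or 3 miles, or when the implied speed exceeds 25 knots.
    def implied_speed_knots(dist_nm, dt_hours):
        return dist_nm / dt_hours if dt_hours > 0 else float("inf")

    def speed_test(dt_minutes, dist_nm):
        """Return True when the pair should be examined as a possible duplicate."""
        too_close_in_time = dt_minutes < 20
        too_close_in_space = dist_nm < 3
        too_fast = implied_speed_knots(dist_nm, dt_minutes / 60.0) > 25
        return too_close_in_time or too_close_in_space or too_fast

    print(speed_test(30, 20))   # True: 40 knots implied, physically implausible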

Advantages of this procedure:

* This procedure is very powerful; it regularly detects and suppresses duplicates, amounting to 4% of the database, which were not detected by the entrance procedure.

* It also allows detection of unknown vessels and corrects erroneous vessel identifiers (call sign).

* We found that 2% of the identifiers do not match in a merged real-time delayed mode database (more than 100 000 profiles inspected).

Drawbacks of this procedure:

* Highly interactive and therefore slow.

* It does not look at the data, and is therefore unable to choose the correct profile in case of doubt.

* It requires from the operator a good knowledge of oceanographic data exchange problems and of the origin of the errors that might have occurred. Otherwise this procedure can be dangerous.

* Needs additional information about the history of vessel call signs.

ANNEX G.3
THE CONTINUOUSLY MANAGED DATABASE (CMD) AT THE US NODC

The CMD at NODC is being developed in two stages. In the first stage, procedures and computer systems used for the TOGA project are being expanded to include data from the entire globe. In addition, salinity as well as temperature data will be incorporated into the database. The database is maintained on VAX disk files and managed by FORTRAN programs and computer utilities.

In the second stage, the CMD will reside on a Sharebase database machine, which manages the data through relational tables. NODC is currently developing this capability for many data types through the prototype project POSEIDON. This on-line database will greatly simplify the process of updating, checking for duplicates, and retrieving data for science centers. It also will make the CMD truly a continuously managed database.

The first stage CMD procedures are outlined in the attached figure. There are six steps to the process, which are carried out once a month at this time. The steps are:

1. Accumulate near real-time data from MEDS and merge with delayed mode data to produce a monthly file. The file is sorted by WMO quadrant, latitude, longitude, date, time, and data source quality.

2. Eliminate exact duplicate records (see below), retaining records with the highest data source quality, in the following order:
a. Delayed mode data
b. Near real-time data from NMC
c. Near real-time data from FNOC
d. All other near real-time data.

3. Merge data from the monthly file with data from the CMD, producing a preliminary update and records for the off-line tracking system.

4. Review tracking system records for exact duplicates and near duplicates (see below); resolve position/date/time and call sign problems; create update control records.

5. Apply update control records to the CMD, producing the updated CMD and tracking system update records (which are applied later).

6. Retrieve and send records from each ocean basin (Atlantic, Indian, Pacific) to respective Regional Science Centers.

When the CMD resides on a database machine, many steps in the above process will be greatly simplified. In step one (above), it will not be necessary to create monthly files; data will be merged into the CMD as they are received. Also, rather than sorting the monthly file in order to check for duplicates, data in the CMD can be quickly sorted whenever needed.

Duplicate checking (step two) will be done daily as data are received and merged into the database. There will be no need for a separate tracking system (steps three, four and five). The database itself will serve as a tracking system.

DUPLICATES MANAGEMENT AT NODC FOR THE CMD

Duplicate records are managed through two processes; the first identifies exact duplicates and the second identifies near or inexact duplicate records. In the first process, computer program ELIMDUPS examines data from near real-time and delayed mode sources. It tags exact duplicates when two or more records have exactly the same data in year, month, day, time, latitude, longitude, and platform identifier fields. The program keeps the record which has the highest quality indicator and deletes all others. At this time the data quality indicators are set for TOGA needs, but they can be changed for GTSPP. Program ELIMDUPS also rejects records for later review if there are problems in certain fields.
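
A minimal sketch of this exact-duplicate pass follows (Python; ELIMDUPS itself is a FORTRAN program, and the field names and quality ordering shown are assumptions based on the list of steps above):

    # Illustrative sketch: records agreeing in all seven key fields are
    # collapsed to the one whose source has the highest data quality.
    SOURCE_QUALITY = {"delayed": 0, "NMC": 1, "FNOC": 2, "other": 3}  # 0 = best

    def eliminate_exact_duplicates(records):
        """records: dicts with year, month, day, time, lat, lon, platform, source."""
        best = {}
        for r in records:
            key = (r["year"], r["month"], r["day"], r["time"],
                   r["lat"], r["lon"], r["platform"])
            if key not in best or SOURCE_QUALITY[r["source"]] < SOURCE_QUALITY[best[key]["source"]]:
                best[key] = r
        return list(best.values())

    obs = [
        {"year": 1990, "month": 9, "day": 6, "time": 615, "lat": 45.69,
         "lon": 136.58, "platform": "CGBV", "source": "FNOC"},
        {"year": 1990, "month": 9, "day": 6, "time": 615, "lat": 45.69,
         "lon": 136.58, "platform": "CGBV", "source": "delayed"},
    ]
    print(eliminate_exact_duplicates(obs))   # only the delayed mode record survives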

In the second process, records which are near duplicates are examined manually and decisions are made about which records to keep and which to reject. Using the tracking system, near real-time observations which match the ship and time period of delayed mode data are printed with the delayed mode data tracking records. The report is sorted by platform, date, position, and data quality code. Near duplicate records are most often present when delayed mode data, which may have slightly different date, time, and position information, are added to the CMD. A review of these records helps identify near real-time data which are near duplicates of, and therefore should be replaced by, delayed mode data. Ship speed between stations is also reviewed, which helps in this manual process; it also identifies problems in position or date fields.

Duplicates management will be a much simpler process when the CMD resides on a database machine. Then, duplicate record management can be a weekly or even daily process. The database will serve as the tracking system, which will make the process easier and more accurate. Also, it will not be necessary to wait for a monthly update cycle to check for duplicate records.

ANNEX G.4
DUPLICATE CHECKING AT VNIIGMI

Two approaches can be specified in the methods of data duplicate detection and elimination:

* Checking and further elimination of duplicates using the r/v cruise characteristics;

* Checking and eliminating duplicates of sounding observation data.

1. Checking Cruise Duplicates

Methods of checking cruise duplicates are based on the assumption that coincidence of cruise identifiers (country code, ship code, data source code) and of their space-time characteristics (latitude, longitude, year, month, day and time of the first and last soundings) is an indication that they are duplicates. Even in this case, differences in the values of parameters can occur owing to the large number of data sources. The data used are:

a. Information on the r/v cruises - identifiers and space-time characteristics;

b. Data source code tables.

Checking procedures:

a. Two cruises are compared as to their identifiers and space-time characteristics. If the test is a success, one of the cruises is to be eliminated.

b. Otherwise, a sequential comparison of space-time characteristics is carried out, allowing a possible difference, up to a preset value D, for each characteristic.

c. If differences between cruise characteristics exceed the corresponding D, the cruises are not considered to be duplicates. Otherwise, checking of duplicate soundings is started. The allowed values of D are assigned taking into account the specific character of the data; the values are determined empirically.

2. Checking Sounding Duplicates

The method assumes that the coincidence of water temperature sounding profiles in a small geographical area within a short time interval is highly improbable. The data used are sounding observation data, ordered with respect to such characteristics as geographical area (for instance, 1-degree Marsden square), year, month, day and time of observation.

Checking procedures:

a. Two adjacent soundings are compared with respect to their space-time characteristics. If the test is a success, one of the soundings is to be eliminated (depending on the data source priority). Otherwise, the comparison of soundings is carried out allowing a permissible departure DP for each characteristic.

b. If the differences between sounding characteristic values lie within the permissible limits then:

- check the equality Ti1 = Ti2 at the same depths;

- determine the number of cases with Ti1 = Ti2 (KN) and with Ti1 ≠ Ti2 (KM). If KN/KM < PN (PN is given with respect to the number of common depths in the temperature profiles), then one of the soundings is to be eliminated. Otherwise, the soundings are not considered to be duplicates.
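
The KN/KM comparison can be sketched as follows (Python, illustrative only; the inequality as printed above appears garbled, so the sketch uses the natural reading in which a high proportion of matching temperatures marks a duplicate, with PN the empirically determined threshold):

    # Illustrative sketch: count matching (KN) and non-matching (KM)
    # temperatures at common depths; a high match fraction marks a duplicate.
    def sounding_duplicate(profile1, profile2, pn):
        """profile1, profile2: dicts mapping depth to temperature; pn in (0, 1)."""
        common = set(profile1) & set(profile2)
        if not common:
            return False                   # nothing to compare
        kn = sum(1 for d in common if profile1[d] == profile2[d])
        km = len(common) - kn
        return kn / (kn + km) >= pn

    p1 = {0: 20.1, 10: 19.8, 20: 18.5}
    p2 = {0: 20.1, 10: 19.8, 20: 18.4}
    print(sounding_duplicate(p1, p2, pn=0.6))   # True: 2 of 3 depths match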

3. General Scheme of Duplicate Checking and Elimination

The scheme consists of the following stages:

a. Duplicate cruise checking

b. Duplicate sounding checking

c. Making a decision on duplicate elimination, by visual comparison or by using special algorithms to process the checking results.

In principle, separate use of the above stages is possible:
stage a ------> stage c
or
stage b ------> stage c.
Experience with the scheme "stage b ------> stage c" has shown that performing 2-3 cycles of the scheme is effective.

ANNEX H
LIST OF PRODUCTS

COUNTRY     PRODUCT                              AVAILABLE

Australia   Indian Ocean                         Jan. 1, 1991
            SST, T150, T400

Canada      Data Monitoring Report               Jan. 1, 1991
            West Coast Analysis
            Real Time Database

France      Ship of Opportunity                  Jan. 1, 1991
            TOGA Products
            Level III Data set
            Isotherm depth charts
            T at various depths
            Heat content in tropics

USA         Global T-S data                      Jan. 1, 1991
            Ocean-basin T-S data
            Data distribution charts
            (Monthly)

USSR        Time series file                     June 1992
            Means and Variances Fields
            for Historical T-S Data

ANNEX I
COMMENTS ON QUALITY CONTROL MANUAL

Removal of inferences, i.e., linear interpolation of position from adjacent stations: position must only be interpolated in conjunction with the use of a cruise track plot that indicates a predictable track. In the case of delayed mode data, the data collector must be consulted.

Climatology test flagging is inconsistent.

Tests should be conducted effectively in parallel, in a bank of related tests (i.e., time / date / position / speed / land / track plot), with "alerts" rather than flags set automatically. The tests are iterated, with changes being only temporary attempts to correct the data, until the "alerts" are satisfied or the data accepted, at which time the changes are committed and flags explicitly set.

Test criteria should be made less stringent and/or require operator viewing and action rather than automatic flagging. This is necessary to prevent an excess of flagged data.

Graphical techniques should be incorporated (e.g., track plot) and illustrated (waterfall test), both of which should be conducted before any other test.

1. Specifics for First Version of the Real-Time QC Manual

Threshold for density inversions should be non-zero (e.g., allowance should be made for measurement resolution and adiabatic warming).

Depth test should be made against a bathymetry database plus a tolerance.

For data that had been previously flagged = 5, the original value database and/or history database must be interrogated before allowing another change.

If the platform code cannot be determined, the observation should remain in the data set.

2. Format Related Matters (Convention with Respect to Sign of the Longitude)

Longitude from 0 degrees going east is +ve; from 0 degrees going west is -ve.
Latitude from 0 degrees going north is +ve; from 0 degrees going south is -ve.
Longitude 180 degrees is -ve.

Although some centres have systems that handle -180 and +180 in retrievals, etc., standardization will be beneficial to the wider user community.
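
A one-line normalization consistent with this convention might look as follows (Python, illustrative only):

    # Illustrative sketch of the sign convention: east longitude is positive,
    # and longitude 180 degrees is carried as -180.
    def normalize_longitude(lon):
        """Map any longitude into the interval [-180, 180)."""
        return (lon + 180.0) % 360.0 - 180.0

    print(normalize_longitude(180.0))    # -180.0
    print(normalize_longitude(-190.0))   # 170.0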

3. Guidance for Future Real-Time and Delayed Mode QC Manuals

Exchange format should allow for nutrient flags to be carried with data that will not strictly be used by GTSPP. However, as US NODC will maintain the CMD, the format should be able to hold nutrient data to encourage the use of the GTSPP exchange format. The modifications required are very minor and would have little effect on pure GTSPP data.

Current tests are orientated towards spikes in the data. Techniques for detecting systematic, correctable errors should also be developed.

Flow diagrams and code should be amalgamated into pseudo-code. The current description is suited to FORTRAN-IV or AI languages such as Prolog, but it is a burden to FORTRAN-77, C, and PASCAL programmers, whose languages have flow control built in.

Flag = 8 means QC was done by a centre other than the originating centre and no flags were attached. It will not ultimately be used in GTSPP.

Add a water mass test as suggested by D. KOHNKE.

Add a Kolmogorov-Smirnov test as suggested by N. MICHAILOV.

Add a climatology test (an example being the ROBINSON/BAUER Atlas) as suggested by D. HAMILTON.

Use an objective interpolation technique to calculate internal consistency, as suggested by R. KEELEY.

Actual data and statistics of data with similar time/spatial characteristics should be used rather than standard climatology. The climatology set described is for the deep ocean, not the continental shelf.

Dialogue with the originator should be used to suggest values to correct the data rather than just flagging them. Telemail may be used to speed the response to queries.

If a problem is found in data in the CMD, the originating data centre should be requested to investigate and correct the data and have the CMD updated. Updates must then be promulgated by NODC (US).

The flags are limited in the number of values, which is insufficient for nutrient bottle data. As (US) NODC is managing the CMD and the data archaeology group will be reading/providing nutrient data, the flagging scheme probably requires a definition of nutrient flag size to allow nutrient data to be carried in the format.

Reasons:

1. If the nutrient data is stripped from the observations in GTSPP format, data centres will request the data in a format other than the GTSPP exchange format to facilitate a complete data set.

2. The GTSPP format will become obsolete when the project is expanded beyond T/S.

3. Data management would require duplication, with increased cost and update problems in the maintenance of QC'ed data. This would ultimately delay implementation of GTSPP because of the extra burden on the Data Centres.

4. US NODC will be storing nutrient data and will no doubt provide it outside of GTSPP.

WOCE/CSIRO, etc., flagging schemes are included in the exchange format; therefore descriptions of their methods should be an annex to the delayed mode manual.

A suggested list of minimum meta data should be supplied with data submissions, as this meta data can influence the tests.

There needs to be some scope for value judgements, because climatology can cause the rejection of El Niño-affected data. Flagging must be at the discretion of an operator with proper scientific training.

ANNEX J
LIST OF OCEAN STATIONS FOR CREATING A TIME SERIES DATABASE

Information Sources

The following major materials have been used to prepare the list:

1. Reference materials "Catalogue of Observation Time Series" (WDC-A and USA NODC, 1989);

2. Catalogue of Oceanic Station Data (IOC Guide No. 2; 1963, 1975, 1986);

3. Information on Oceanic Station Data, prepared at the USSR Oceanographic Data Centre;

4. Information from the WOCE Observation Programme (WOCE Implementation Plan, Vol. 1, WCRP-11, July 1988, WMO/TD-242).

Choice of Observation Location Requirements

In selecting the observation points to be included in the list, analysis has been carried out of information on the observation series guided by the following requirements:

1. Geographical area in which fixed observation points are located - Because the database "Observation Time Series" is oriented to the analysis of large-scale space-time variability, and taking into account the national interests of certain countries in studying adjacent marine areas, points of standard sections off the Korean and Chinese coasts, as well as other similar points, have not been included.

2. Length of observation record - Only points (or sets of points in the form of sections) whose observation series are at least ten years long have been included.

3. Current continuation of observations - When selecting points, preference was given to fixed observation points at which cruise observations are still carried out, or were carried out until recently, under national or international programmes. Where information was not available (for example, the number of stations in a section), the respective fields of the table are either left blank or filled in with text. Thus, the coordinates of the oceanic stations occupied in the course of the USSR national programme SECTIONS are shown in Fig. 1, while the locations of the WOCE sections (ARS1-ARS10) are described in the form of text (e.g., Azores).

Future Work on the List

The following seems to be useful to complete the preparation of the list:

1. Filling in information gaps in the oceanic data stations description.

2. Expanding the description of standard sections by introducing latitude and longitude values for each section station.

3. Expanding the station list to cover the areas of the South Atlantic, the Indian Ocean and the South Pacific (making use of the USA NODC/WDC-A Catalogue of Time Series for these areas is highly desirable).

LIST OF OCEANIC STATIONS FOR CREATING A TIME SERIES DATABASE

NOTES

Reference :

    1 - Draft Manual on Long Oceanographic Time Series. IOC/IODE-XII/14, Suppl. 1. Moscow, 1986. 44 pp.
    2 - World Ocean Circulation Experiment Implementation Plan. Vol. 1: Detailed Requirements. WCRP-11, WMO/TD-242, 1988.
    3 - North Atlantic Time-Series Data Inventory. NODC of USA and WDC-A (Oceanography).
    4 - North Pacific Time-Series Data Inventory. NODC of USA and WDC-A (Oceanography).
    5 - News of Science and Technics: Atmosphere, Ocean and Cosmos - Programme "Razrezy". Vol. 1. Moscow, 1983. 59 pp.
    6 - Likova V.V. Climatic Time Series of Oceanography Data. Trudi of VNIIGMI-MCD, 1989, Vol. 152, pp. 9-22.
    7 - Inventory of Oceanographic Investigations at North Atlantic Ocean Weather Stations in 1977. ICES Oceanographic Data Lists and Inventories, No. 47. 37 pp.
    8 - Likova V.V. Catalog of Deep Water Observations in Regions of Pacific OWS. Obninsk, 1987. 39 pp.
    9 - Toba Y., Hanawa K., Kurosawa Y. Examination of Time Series Data at OWS TANGO. In: Time Series of Ocean Measurements. Tokyo, 1983, pp. 37-38.

Elements : T - temperature at 0 m
S - salinity at 0 m
Temp - temperature at standard depths
Sal - salinity at standard depths
BT - bathythermograph observations
Nutr - nutrients
Oceanogr. - standard hydrological observations

ANNEX K
MEDS SEQUENTIAL AND INDEXED FORMATS FOR OCEAN DATA

I. INTRODUCTION

The format described here is used for archiving all ocean station data managed by MEDS. There are two record types used for the data from a station. The first, the station record, contains information pertaining to the station as a whole, including position-time coordinates, surface met. and ocean parameters, and a history of processing and editing for the record. There is one of these records for each station. The second record type, the profile record, contains the information about the profile observations made at the station. These are split into separate files based on the data type. The key of the station record is used as the key into the profile file. Because of record size restrictions, there is also the capability to split a profile into segments.

For processing purposes, a sequential version of this archiving format has been created. The sequential version is almost identical to the archive version with the addition of four variables to allow sorting of the sequential version as required by the processing systems, and as required to carry some additional information needed by the duplicates processing algorithms. Annex 1 contains a definition of the additional fields carried in the sequential version of the format. Annex 2 contains a sample subroutine for reading and writing the sequential format from FORTRAN NAMED COMMON areas as discussed below.

The records are structured using the basic VAX FORTRAN elements of INTEGER*2, INTEGER*4, REAL*4, and CHARACTER*n. A record is read or written to or from a named common storage area with a single unformatted FORTRAN read or write. This is generally accomplished in a subroutine which reads or writes the data to or from NAMED COMMON storage areas and includes TYPE declarations for the GTSPP temperature and salinity application. These common storage areas and type declarations are copied from subroutine to subroutine and can be thought of as an object or structure.

The input processing systems read and write the data in sequential access mode. The database uses the indexed sequential access mode (ISAM). For retrievals simple area and cruise-date index keys have been constructed. The primary key is a simplified one degree square number (not COTED or MARSDEN squares) that is easily computed from the latitude and longitude. The secondary key is split into two parts. The first is the CRUISE information, the second contains the DATE information. The first part is completed by the TIME-DATA_TYPE combination that can qualify on a traditional oceanographic cruise or a portion of it or can qualify on a CALL SIGN/MONTH or portion of it in the case of IGOSS radio data.

This format is intended to have sufficient flexibility to cope with all types of station data held in the MEDS databases, including the BATHY/TESAC data, and doppler current profiler data as well.

II. DESCRIPTION OF THE STATION RECORD FORMAT

The format is made up of five sections. The sections can be described as two-dimensional tables, or relations if one prefers to think in terms of relational data models. The first section or table contains information that occurs only once per station, including the identification and location information; thus section 1 always consists of a table with a single row. The other sections contain variable numbers of rows depending on the data that have been observed at the station.

In the following sections the record format is described in terms of the structures and elements defined in VAX FORTRAN and in the VAX Common Data Dictionary (CDD).

Section 1. Mandatory Key and "Header" Fields

The first section contains identification, position and time information for the station. It also contains information on the source and time of receipt of the data, flags that have been provided during the quality control process, and a stream identification code that is used to determine whether the current version of the data will replace one already existing in the database during an update. Finally, this section contains an availability flag that controls distribution of the data and a unique record identifier that is used to determine an exact match if the record is sent to someone else for QC and is then returned.

Section 1 consists of four structures. The first structure (KEY0) is the primary key (position) for indexing in the database ISAM. The second and third structures (KEY1 and KEY2) are the secondary keys (identification) for indexing in the database ISAM. The fourth structure (FXD) contains the remainder of the "header" information for the record or ocean station.

In retrieving data it has been found quite adequate and efficient to retrieve by simple position and identification keys and then further select records by examining the record content in the retrieval program.

KEY0 STRUCTURE. (mandatory)
	ONE_DEG_SQ	DATATYPE IS SIGNED LONGWORD.
END KEY0 STRUCTURE.
The ONE_DEG_SQ is a one degree square number that is computed directly from the latitude and longitude and which is used for retrieval of the data by area. The method of assignment of the number has been optimized to simplify calculation and to avoid problems with changes of quadrants, the date line, the prime meridian, etc.
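
As an illustration only (the precise MEDS numbering scheme is not reproduced in this document), a number of this kind can be computed with simple integer arithmetic. The sketch below assumes a hypothetical scheme that shifts latitude and longitude to non-negative ranges so that no special cases arise at the equator, the prime meridian or the date line.

C     Illustrative sketch only; the actual MEDS one degree square
C     numbering may differ.  Latitude is shifted to 0..179 and
C     longitude to 0..359 before the two are combined.
      INTEGER FUNCTION ONEDSQ (XLAT, XLON)
      REAL*4 XLAT, XLON
      INTEGER*4 ILAT, ILON
      ILAT = INT(XLAT + 90.0)
      IF (ILAT .GT. 179) ILAT = 179
      ILON = INT(XLON + 180.0)
      IF (ILON .GT. 359) ILON = 359
      ONEDSQ = ILAT*360 + ILON
      RETURN
      END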

KEY1 STRUCTURE. (mandatory)
	CR_NUMBER	DATATYPE IS TEXT
		        SIZE IS 10 CHARACTERS
		        PICTURE FOR DATATRIEVE IS "X(10)".
The CR_NUMBER is a MEDS assigned cruise number. It is composed as CCiiYYnnn where CC = 2 digit country code, ii = 2 character ship code for countries other than Canada, or a 2 character institute code for Canadian cruises, YY = last 2 digits of year of the cruise and nnn = a MEDS assigned sequence number that is often the same as the originator's cruise number. It is blank filled on the right.

For IGOSS data the form is SSSSSSSSYY where SSSSSSSS is the call sign, and YY is the last 2 digits of the year. Note that if a platform call sign is less than 8 characters long it is right filled by blanks.

The CR_NUMBER is also used as a key in an information file containing metadata about the cruise.
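
As an illustration, the delayed mode form of the CR_NUMBER can be assembled with a single internal WRITE. The subroutine below is a sketch only (MAKCRN and its argument names are invented for this example), assuming numeric year and sequence number inputs.

C     Hypothetical sketch: assembles a delayed mode CR_NUMBER of
C     the form CCiiYYnnn; the tenth character is blank filled by
C     the internal WRITE
      SUBROUTINE MAKCRN (CC, II, IYEAR, NSEQ, CRNUM)
      CHARACTER*2 CC, II
      INTEGER*4 IYEAR, NSEQ
      CHARACTER*10 CRNUM
C     Last 2 digits of the year and a 3 digit sequence number,
C     both zero filled on the left
      WRITE (CRNUM, 1001) CC, II, MOD(IYEAR,100), NSEQ
1001  FORMAT (2A2,I2.2,I3.3)
      RETURN
      END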

	STN_NUMBER	DATATYPE IS SIGNED WORD.
END KEY1 STRUCTURE.
The MEDS assigned consecutive station number. The station number is not used for the real time IGOSS data (set to zero).

KEY2 STRUCTURE. (mandatory)
	OBS_YEAR	DATATYPE IS TEXT
		        SIZE IS 4 CHARACTERS
		        PICTURE FOR DATATRIEVE IS "X(4)".
The 4 character year of the observations (e.g. 1990, 2001).

	OBS_MONTH	DATATYPE IS TEXT
		        SIZE IS 2 CHARACTERS
		        PICTURE FOR DATATRIEVE IS "X(2)".
The month of the observations represented as 2 characters. Any leading zero must be encoded into the field using the format I2.2.

	OBS_DAY		DATATYPE IS TEXT
		        SIZE IS 2 CHARACTERS
		        PICTURE FOR DATATRIEVE IS "X(2)".
The 2 character day on which observations were made. Any leading zero must be encoded into the field using the format I2.2.

	OBS_TIME	DATATYPE IS TEXT
		        SIZE IS 4 CHARACTERS
		        PICTURE FOR DATATRIEVE IS "X(4)".
END KEY2 STRUCTURE.

The time of observation as a 2 digit hour followed by a 2 digit minute. Any leading zeros must be encoded into the field using the format I2.2.
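
For example, the key fields above can be encoded from integer values with internal WRITE statements; the fragment below is a brief sketch of this.

C     Sketch: encoding the date-time key fields with zero fill
      PROGRAM ENCKEY
      CHARACTER*4 OBS_YEAR, OBS_TIME
      CHARACTER*2 OBS_MONTH, OBS_DAY
      WRITE (OBS_YEAR, '(I4.4)') 1990
      WRITE (OBS_MONTH,'(I2.2)') 9
      WRITE (OBS_DAY,  '(I2.2)') 17
C     Hour and minute packed together as hhmm
      WRITE (OBS_TIME, '(2I2.2)') 8, 5
      WRITE (6,*) OBS_YEAR, OBS_MONTH, OBS_DAY, OBS_TIME
      END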

FXD STRUCTURE.

	DATA_TYPE	DATATYPE IS TEXT
		        SIZE IS 2 CHARACTERS
		        PICTURE FOR DATATRIEVE IS "X(2)".
A 2 character code to indicate the type of instrument/trace used in the data collection or the type of IGOSS radio message used to report the data. The list of codes follows. Note that this list is expandable as necessary.

BO = Bottle
CD = CTD down trace
CU = CTD up trace
XB = XBT
DT = Digital BT
MB = MBT
BA = BATHY message
TE = TESAC message

	LATITUDE	DATATYPE IS F_FLOATING
		        MISSING_VALUE FOR DATATRIEVE IS 9999.9999
		        EDIT_STRING FOR DATATRIEVE IS "-----.9999".
The latitude (in decimal degrees) of the station. (Negative is south)

	LONGITUDE	DATATYPE IS F_FLOATING
		        MISSING_VALUE FOR DATATRIEVE IS 9999.9999
		        EDIT_STRING FOR DATATRIEVE IS "-----.9999".
The longitude (in decimal degrees) of the station. (Negative is east from Greenwich)

	Q_POS		DATATYPE IS TEXT
		        SIZE IS 1 CHARACTER
		        PICTURE FOR DATATRIEVE IS "X".
A one character QC flag for the position. The IGOSS quality flags given in IOC/WMO Manuals and Guides #3 are used.

	Q_DATE_TIME	DATATYPE IS TEXT
		        SIZE IS 1 CHARACTER
		        PICTURE FOR DATATRIEVE IS "X".
A one character QC flag for the date-time of the observation. The IGOSS quality flags given in IOC/WMO Manuals and Guides #3 are used.

	Q_RECORD	DATATYPE IS TEXT
		        SIZE IS 1 CHARACTER
		        PICTURE FOR DATATRIEVE IS "X".
A single character quick quality control flag to indicate the result of the data quality checks undergone by the data in the record. This flag also uses the codes contained in IOC/WMO Manuals and Guides #3. If the user wishes to review further the history of processing and quality control for the record, he must examine the history relation described below.

	UP_DATE  	DATATYPE IS TEXT
		        SIZE IS 8 CHARACTERS
		        PICTURE FOR DATATRIEVE IS "X(8)".
This field is used when providing regular updates to users. Data can thus be retrieved for a given set of retrieval criteria, further qualified by the date the data were added to the database or last modified in the database. This provides the ability to send a user all data entered or modified in the database after the date of his last shipment. The format of the date is 4 character year, 2 character month and 2 character day. As the data are reprocessed or edited, this date changes.
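
Note that because the date is coded from the most significant part (year) to the least (day), ordinary character comparison gives the correct date ordering; a retrieval program can thus qualify records with a single test, as in this sketch (the cutoff value is invented):

C     Sketch: a record qualifies for an incremental shipment if it
C     was added or modified after the user's last shipment date
      CHARACTER*8 UP_DATE, LAST
      LAST = '19900801'
      IF (UP_DATE .GT. LAST) THEN
C        ... include the record in the shipment ...
      ENDIF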

	BUL_TIME	DATATYPE IS TEXT
		        SIZE IS 12 CHARACTERS
		        PICTURE FOR DATATRIEVE IS "X(12)".
The time at which the bulletin was inserted on the GTS. This is recorded as a 12 character field of yyyyMMddhhmm with yyyy being the 4 character year, MM the 2 character month, dd the 2 character day of the month, hh the 2 character hour and mm the 2 character minute. This field is used in providing data for the WMO twice yearly GTS monitoring activity.

	BUL_HEADER	DATATYPE IS TEXT
		        SIZE IS 6 CHARACTERS
		        PICTURE FOR DATATRIEVE IS "X(6)".
For IGOSS data this field contains the GTS bulletin header indicating the bulletin type under which the data were reported. E.g. SOVD01

	SOURCE_ID	DATATYPE IS TEXT
		        SIZE IS 4 CHARACTERS
		        PICTURE FOR DATATRIEVE IS "X(4)".
For IGOSS data, this is the 4 character identifier of the GTS node inserting the data (E.g. KWBC). For delayed mode Canadian cruises, this field is the 2 character institute code left justified.

	STREAM_IDENT	DATATYPE IS TEXT
		        SIZE IS 4 CHARACTERS
		        PICTURE FOR DATATRIEVE IS "X(4)".
The stream identification parameter is used by the "Replacement Management Processors" for the continuously managed database to determine whether a message duplicating one already in the database should replace the database version. For example, a delayed mode record that has passed scientific quality control should replace a radio message version of a record.

The STREAM_IDENT variable is a four character field. The stream is identified as follows.

RM = IGOSS radio message
RQ = Radio message with scientific QC
DM = Delayed mode version from originator
DQ = Delayed mode version with additional scientific QC

MEHI = MEDS historical data file
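
One way to realize such replacement decisions (a sketch only; the actual Replacement Management Processors are not reproduced here) is to rank the stream codes and to replace the database copy whenever the rank of the incoming version exceeds that of the stored version. The ordering below is an assumption consistent with the example in the text, and ranking MEHI with the delayed mode data is likewise an assumption.

C     Hypothetical ranking of stream codes for replacement
C     decisions; a higher rank replaces a lower one
      INTEGER FUNCTION ISRANK (STREAM)
      CHARACTER*4 STREAM
      ISRANK = 0
      IF (STREAM(1:2) .EQ. 'RM') ISRANK = 1
      IF (STREAM(1:2) .EQ. 'RQ') ISRANK = 2
      IF (STREAM(1:2) .EQ. 'DM') ISRANK = 3
      IF (STREAM .EQ. 'MEHI') ISRANK = 3
      IF (STREAM(1:2) .EQ. 'DQ') ISRANK = 4
      RETURN
      END

A replacement processor would then replace the stored copy when ISRANK of the incoming stream exceeds ISRANK of the stream already in the database.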

	QC_VERSION	DATATYPE IS TEXT
		        SIZE IS 4 CHARACTERS
		        PICTURE FOR DATATRIEVE IS "X(4)".
This field contains a code identifying the last level of quality control that the record has passed. Note that the value of the quick quality control flag corresponds to this level of QC. If the user wishes to review further the history of processing and quality control for the record, he must examine the history relation described below.

	AVAIL	DATATYPE IS TEXT
		SIZE IS 1 CHARACTER
		PICTURE FOR DATATRIEVE IS "X(1)".
The data availability flag is used to restrict distribution of data if required by the person or organization that submitted the data to MEDS. The flag is set to "A" if the data are generally available or "P" (protected) if the data are not generally available.

	NO_PROF	DATATYPE IS SIGNED WORD.
The number of parameter profiles archived from the station and reported in the PROF STRUCTURE of section 2.

	NPARMS	DATATYPE IS SIGNED WORD.
The number of surface parameter variables reported in the SURFACE STRUCTURE of section 3.

	SPARMS	DATATYPE IS SIGNED WORD.
The number of surface code variables reported in the SURF_CODES STRUCTURE of section 4.

	NUM_HISTS	DATATYPE IS SIGNED WORD.
The number of history group entries to be found in the record.

END FXD STRUCTURE.

Section 2. Profile Information Structure

The profile information structure contains information about each of the profiles collected at the station. The parameter NO_PROF above contains the number of occurrences of the structure.

PROF STRUCTURE
	OCCURS 0 TO 20 TIMES DEPENDING ON STN.FXD.NO_PROF.

	NO_SEG	DATATYPE IS SIGNED WORD.
The number of segments into which a profile has been split. A single profile record can contain a maximum of 1500 depths.
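
For example, the number of segments required follows directly from this limit; in FORTRAN integer arithmetic:

C     Sketch: segments needed for a profile of NDEPTH levels,
C     at a maximum of 1500 depths per profile record
      NSEG = (NDEPTH + 1499) / 1500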

	PROF_TYPE	DATATYPE IS TEXT
		        SIZE IS 4 CHARACTERS
		        PICTURE FOR DATATRIEVE IS "X(4)".
A 4 character parameter code indicating the type of profile. Where possible, MEDS uses the first four characters of the GF3 parameter code.

	DUP_FLAG	DATATYPE IS TEXT
		        SIZE IS 1 CHARACTER
		        PICTURE FOR DATATRIEVE IS "X".
A single character to indicate if this profile duplicates the information contained in another. This is set to 0 if there is no duplication or 1 if another profile of higher quality duplicates this profile.

	DIGIT_CODE	DATATYPE IS TEXT
		        SIZE IS 1 CHARACTER
		        PICTURE FOR DATATRIEVE IS "X".
The code specifying how the data were digitized. The MEDS table of codes follows, with the IGOSS equivalents noted.

0 = Unknown
7 = Digitized at regular depth intervals (equivalent to k1=7 for BATHY/TESAC data)
8 = Digitized at inflexion points (equivalent to k1=8 for BATHY/TESAC data)
D = Digital data logger, unreduced

	STANDARD	DATATYPE IS TEXT
		        SIZE IS 1 CHARACTER
		        PICTURE FOR DATATRIEVE IS "X".
A code to indicate the standard to which the observations were made. Available codes for salinity are:
0 = No salinity measured
1 = In situ sensor, accuracy better than 0.02 (PSU assumed)
2 = In situ sensor, accuracy less than 0.02 (PSU assumed)
3 = Sample analysis (PSU assumed)
S = pre 1982 salinity units (PPT)
P = practical salinity units (PSU)
U = Unknown salinity units

	DEEP_DEPTH	DATATYPE IS F_FLOATING
		        MISSING_VALUE FOR DATATRIEVE IS 99999.9999
		        EDIT_STRING FOR DATATRIEVE IS "------.9999".
The depth in metres of the deepest observation in the profile.

Section 3. Surface Parameter Structure

The surface parameter structure contains all data that are observed at the surface including both oceanographic and meteorological variables. There is one occurrence of the structure for each variable observed. The parameter NPARMS above contains the number of occurrences of the structure.

The optional parameters would include such things as sounding depth, met. parameters, surface temperature (carried in addition to the surface observation of the profile, to allow an intake temperature or a separate observation of SST), surface current speed and direction (IGOSS), BT surface reference temperature, etc.

The 4 character parameter code is used to differentiate between the variables reported in this structure. Where possible MEDS uses the first four characters of the GF3 parameter code. All variables are represented by real numbers and a one digit QC flag is provided.

SURFACE STRUCTURE
	OCCURS 0 TO 20 TIMES DEPENDING ON STN.FXD.NPARMS.
	PCODE	DATATYPE IS TEXT
		SIZE IS 4 CHARACTERS
		PICTURE FOR DATATRIEVE IS "X(4)".
The 4 character GF3 parameter code, or a user assigned code. This repeats with PARM and Q_PARM, a total of NPARMS times.

	PARM	DATATYPE IS F_FLOATING
		EDIT_STRING FOR DATATRIEVE IS "-----.999".
A generic name representing the value of a parameter which has been specified using a parameter code. This repeats with PARM and Q_PARM, a total of NPARMS times.

	Q_PARM	DATATYPE IS TEXT
		SIZE IS 1 CHARACTER
		PICTURE FOR DATATRIEVE IS "X".
A single character code to indicate the level of data quality for all surface observations without specific data dictionary entries. This field uses the IGOSS flag codes contained in IOC/WMO Manuals and Guides #3 and repeats with PCODE and PARM, a total of NPARMS times.
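
For example, a retrieval program can locate a particular surface variable by scanning the triplets for its parameter code. The function below is a sketch only (its name and the FOUND argument are invented), using the array names of the sample subroutine in Annex 2:

C     Sketch: locate a surface parameter by its 4 character code
C     in the PCODE/PARM/Q_PARM triplets
      REAL FUNCTION SRFVAL (WANTED, PCODE, PARM, NPARMS, FOUND)
      CHARACTER*4 WANTED, PCODE(20)
      REAL*4 PARM(20)
      INTEGER*2 NPARMS
      LOGICAL FOUND
      INTEGER I
      FOUND = .FALSE.
      SRFVAL = 9999.999
      DO 10 I = 1, NPARMS
         IF (PCODE(I) .EQ. WANTED) THEN
            SRFVAL = PARM(I)
            FOUND = .TRUE.
            RETURN
         ENDIF
10    CONTINUE
      RETURN
      END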

Section 4. Surface Codes Structure

The surface codes structure contains all of the information collected at the surface but represented in alphanumeric form. There is one occurrence of the structure for each item of information. The parameter SPARMS above contains the number of occurrences of the structure. An example of its use would be to record a special name for the station, such as OWS PAPA.

SURF_CODES STRUCTURE
	OCCURS 0 TO 20 TIMES DEPENDING ON STN.FXD.SPARMS.
	PCODE	DATATYPE IS TEXT
		SIZE IS 4 CHARACTERS
		PICTURE FOR DATATRIEVE IS "X(4)".
The 4 character GF3 parameter code, or a user assigned code. This repeats with CPARM and Q_PARM, a total of SPARMS times.

	CPARM	DATATYPE IS TEXT
		SIZE IS 10 CHARACTERS
		PICTURE FOR DATATRIEVE IS "X(10)".
The 10 character field used to record alphanumeric information about the station. This repeats with PCODE and Q_PARM, a total of SPARMS times.

	Q_PARM	DATATYPE IS TEXT
		SIZE IS 1 CHARACTER
		PICTURE FOR DATATRIEVE IS "X".
A single character code to indicate the level of data quality for all surface observations without specific data dictionary entries. This field uses the IGOSS flag codes contained in IOC/WMO Manuals and Guides #3 and repeats with PCODE and CPARM, a total of SPARMS times.

Section 5. Processing History Structure or Relation

HISTORY STRUCTURE
	OCCURS 0 TO 100 TIMES DEPENDING ON STN.FXD.NUM_HISTS.
	IDENT_CODE	DATATYPE IS TEXT
		        SIZE IS 2 CHARACTERS
		        PICTURE FOR DATATRIEVE IS "X(2)".
A 2 character code to indicate the organization responsible for creating the history record.

	PRC_CODE	DATATYPE IS TEXT
		        SIZE IS 4 CHARACTERS
		        PICTURE FOR DATATRIEVE IS "X(4)".
A 4 character code to indicate the program through which the data in the history group passed and which resulted in the creation of the history record.

	VERSION 	DATATYPE IS TEXT
		        SIZE IS 5 CHARACTERS
		        PICTURE FOR DATATRIEVE IS "X(5)".
A 5 character code to indicate the version of the program indicated by PRC_CODE which resulted in the creation of the history record. This repeats with PRC_CODE, PRC_DATE, ACT_CODE, ACT_PARM, AUX_ID and O_VALUE a total of NUM_HISTS number of times.

	PRC_DATE	DATATYPE IS SIGNED LONGWORD.
The date, as YYYYMMDD, on which the history entry was made. This repeats with PRC_CODE, VERSION, ACT_CODE, ACT_PARM, AUX_ID and O_VALUE a total of NUM_HISTS number of times.

	ACT_CODE	DATATYPE IS TEXT
		        SIZE IS 2 CHARACTERS
		        PICTURE FOR DATATRIEVE IS "X(2)".
A 2 character code to indicate what action was taken, associated with O_VALUE. This repeats with PRC_CODE, VERSION, PRC_DATE, ACT_PARM, AUX_ID and O_VALUE a total of NUM_HISTS number of times.

	ACT_PARM	DATATYPE IS TEXT
		        SIZE IS 4 CHARACTERS
		        PICTURE FOR DATATRIEVE IS "X(4)".
The 4 character GF3 or MEDS-GF3 parameter code against which an action was taken. This repeats with PRC_CODE, VERSION, PRC_DATE, ACT_CODE, AUX_ID and O_VALUE a total of NUM_HISTS number of times.

	AUX_ID	    	DATATYPE IS F_FLOATING
		        MISSING_VALUE FOR DATATRIEVE IS 9999.999
		        EDIT_STRING FOR DATATRIEVE IS "-----.999".
A real number which may be required to further specify the data against which an action was taken. For tides and water level data, this is the time in hours of the data affected. This repeats with PRC_CODE, VERSION, PRC_DATE, ACT_CODE, ACT_PARM, and O_VALUE a total of NUM_HISTS number of times.

	O_VALUE	    	DATATYPE IS F_FLOATING
		        MISSING_VALUE FOR DATATRIEVE IS 9999.999
		        EDIT_STRING FOR DATATRIEVE IS "-----.999".
The original value (that is, after CHAN_MULT and CHAN_ADD have been applied) that was present and against which an action was taken. This repeats with PRC_CODE, VERSION, PRC_DATE, ACT_CODE, ACT_PARM, and AUX_ID a total of NUM_HISTS number of times.

III. DESCRIPTION OF THE PROFILE RECORD FORMAT

The format is made up of three sections. The first section or table contains information that occurs only once per station and represents the single key into the data profile. It includes the identification, time, data type and segment number. Thus Section 1 always consists of a table with a single row. Section 2 contains the number of depths for the profile and a code indicating whether depths or pressures were recorded. The third section contains a variable number of rows depending on the data that have been observed at the station.

In the following sections the record format is described in terms of the structures and elements defined in VAX FORTRAN and in the VAX Common Data Dictionary (CDD).

Section 1. Mandatory Key Fields

Section 1 consists of two structures. These are identical to the KEY0 and KEY1 structures in the station record, except that KEY1 carries the additional PROF_TYPE and PROFILE_SEG fields.

KEY0 STRUCTURE. (mandatory)
	ONE_DEG_SQ	DATATYPE IS SIGNED LONGWORD.
END KEY0 STRUCTURE.
The ONE_DEG_SQ is a one degree square number that is computed directly from the latitude and longitude and which is used for retrieval of the data by area. The method of assignment of the number has been optimized to simplify calculation and to avoid problems with changes of quadrants, the date line, the prime meridian, etc.

KEY1 STRUCTURE. (mandatory)

	CR_NUMBER	DATATYPE IS TEXT
		        SIZE IS 10 CHARACTERS
		        PICTURE FOR DATATRIEVE IS "X(10)".
The CR_NUMBER is a MEDS assigned cruise number. It is composed as CCiiYYnnn where CC = 2 digit country code, ii = 2 character ship code for countries other than Canada, or a 2 character institute code for Canadian cruises, YY = last 2 digits of year of the cruise, nnn = a MEDS assigned sequence number that is often the same as the originator's cruise number.

For IGOSS data the form is SSSSSSSSYY where SSSSSSSS is the call sign, and YY is the last two digits of the year. Note that if a platform call sign is less than 8 characters long it is right filled by blanks.

The CR_NUMBER is also used as a key in an information file containing metadata about the cruise.

	STN_NUMBER	DATATYPE IS SIGNED WORD.
The MEDS assigned consecutive station number. The station number is not used for the real time IGOSS data (set to zero).

	PROF_TYPE	DATATYPE IS TEXT
		        SIZE IS 4 CHARACTERS
		        PICTURE FOR DATATRIEVE IS "X(4)".
A 4 character parameter code indicating the type of profile. Where possible, MEDS uses the first four characters of the GF3 parameter code.

	PROFILE_SEG	DATATYPE IS TEXT
		        SIZE IS 2 CHARACTERS
		        PICTURE FOR DATATRIEVE IS "X(2)".
A two character number indicating the segment number of this profile. Any leading zero must be encoded into the field using the format I2.2.

END KEY1 STRUCTURE.

Section 2. Number of Depths

	NO_DEPTHS	DATATYPE IS SIGNED WORD.
The number of depths at which subsurface observations are to be found in the PROF STRUCTURE of Section 3.

	D_P_CODE	DATATYPE IS TEXT
		        SIZE IS 1 CHARACTER
		        PICTURE FOR DATATRIEVE IS "X".
A single character code to indicate whether the DEPTH_PRESS field contains depth or pressure observations. The code "D" is used to indicate depth and "P" to indicate pressure. The code applies to all of the DEPTH_PRESS values in the profile.

Section 3. Profile Structure

PROF STRUCTURE

	OCCURS 0 TO 1500 TIMES DEPENDING ON PROFILE.NO_DEPTHS.
	DEPTH_PRESS	DATATYPE IS F_FLOATING
		        MISSING_VALUE FOR DATATRIEVE IS 99999.999
		        EDIT_STRING FOR DATATRIEVE IS "------.999".
A field used to store either the observed depth or pressure of the observation. The content of D_P_CODE determines which variable is stored in this field. Values are in metres for depth or decibars for pressure. This repeats with PARM and Q_PARM a total of NO_DEPTHS times.

	PARM	    DATATYPE IS F_FLOATING
		    EDIT_STRING FOR DATATRIEVE IS "-----.999".
A generic name representing the value of a parameter which has been specified using a parameter code. This repeats with DEPTH_PRESS and Q_PARM a total of NO_DEPTHS times.

	Q_PARM	    DATATYPE IS TEXT
		    SIZE IS 1 CHARACTER
		    PICTURE FOR DATATRIEVE IS "X".
A single character code to indicate the level of data quality for all subsurface observations without specific data dictionary entries. This field uses the IGOSS flag codes contained in IOC/WMO Manuals and Guides #3. This repeats with DEPTH_PRESS and PARM a total of NO_DEPTHS times.

END PROF STRUCTURE.
ANNEX 1
Additional Parameters for the Sequential Version of the Ocean Format

The following four parameters have been added to the indexed sequential archive format for ocean data to provide a sortable sequential version of the format which has been called the processing format.

MKEY (message number)

This message number is assigned as the file is created. It serves as a convenient, simple unique number for sorting in the various stages of processing and as the key in a temporary ISAM file that is used by the duplicates identification software.

IUMSGNO (unique message number)

This is a unique message number that is assigned arbitrarily by the duplicates identification software. The duplicates identification software will recognize groups of two or more messages that have the same identification-date-time-subsurface information, or the same fuzzy-area, fuzzy-time, subsurface information. Each such group of duplicates is assigned the same unique message number. Using the unique message number, one can then sort the file and process the groups of duplicates for data flow monitoring and other purposes.
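
As a brief, self-contained sketch of how the groups are then handled (the message numbers below are invented), a single pass over a file sorted on the unique message number detects each group boundary:

      PROGRAM DUPGRP
C     Sketch: detect duplicate groups in a file sorted on the
C     unique message number; invented values stand in for the
C     sorted records
      INTEGER*4 MSGNOS(6), LASTNO, I
      DATA MSGNOS /101, 101, 205, 205, 205, 307/
      LASTNO = -1
      DO 10 I = 1, 6
         IF (MSGNOS(I) .NE. LASTNO) THEN
            WRITE (6,*) 'NEW DUPLICATE GROUP, IUMSGNO =', MSGNOS(I)
            LASTNO = MSGNOS(I)
         ENDIF
10    CONTINUE
      END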

STREAM_SOURCE (stream source)

The stream source variable is used by the duplicates identification/management system to determine whether the data originated in an existing MEDS database, or from an input stream of new data. This is required for the system to make decisions about whether an observation needs to be updated into the databases.

UFLAG (update flag)

The update flag is assigned by the duplicates identification/management software. This flag is then used by the update program to determine whether an observation is to be updated into a database, ignored in the input run (already in database or lower priority copy) or flagged inactive in a database (a higher quality copy of the observation is now available in the databases).
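
The flag values themselves are not listed in this document; the fragment below therefore uses invented codes ('U', 'I', 'F') purely to illustrate the three outcomes:

C     Hypothetical sketch only: the codes 'U', 'I' and 'F' are
C     invented for illustration of the three update outcomes
      CHARACTER*1 UFLAG
      IF (UFLAG .EQ. 'U') THEN
C        update the observation into the database
      ELSE IF (UFLAG .EQ. 'I') THEN
C        ignore: already in the database or a lower priority copy
      ELSE IF (UFLAG .EQ. 'F') THEN
C        flag the database copy inactive: a higher quality copy
C        is now available
      ENDIF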

ANNEX 2
Subroutine to Read the Sequential Version of the Ocean Format

      SUBROUTINE READ_OCEAN_STN_SEQ (MKEY,IUMSGNO,STREAM_SOURCE,UFLAG,
     + INUNIT3,NRTN3)
C
C     *******************************************************************
C
C     Common Storage areas for the Ocean Processing and Archive Formats
C
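C     Note: this sample dimensions the station vectors to 20
C     occurrences and the profile arrays to 3 profiles of 500
C     depths each; the archive format itself allows up to 1500
C     depths per profile record.
C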
      COMMON/OCEANSTN/ ONE_DEG_SQ,CR_NUMBER,OBS_YEAR,OBS_MONTH,OBS_DAY,
     + OBS_TIME,DATA_TYPE,STN_NUMBER,LATITUDE,LONGITUDE,Q_POS,
     + Q_DATE_TIME,Q_RECORD,UP_DATE,BUL_TIME,BUL_HEADER,SOURCE_ID,
     + STREAM_IDENT,QC_VERSION,DATA_AVAIL,NO_PROF,NPARMS,NSURFC,
     + NUM_HISTS,NO_SEG(20),PROF_TYPE(20),DUP_FLAG(20),DIGIT_CODE(20),
     + STANDARD(20),DEEP_DEPTH(20),PCODE(20),PARM(20),Q_PARM(20),
     + SRFC_CODE(20),SRFC_PARM(20),SRFC_Q_PARM(20),
     + IDENT_CODE(20),PRC_CODE(20),VERSION(20),PRC_DATE(20),
     + ACT_CODE(20),ACT_PARM(20),AUX_ID(20),ORIG_VAL(20)
C
      COMMON/OCEANPRF/PROFILE_SEG(3),NO_DEPTHS(3),D_P_CODE(3),
     + DEPTH_PRESS(3,500),PROF_PARM(3,500),PROF_Q_PARM(3,500)
C
      CHARACTER*10 CR_NUMBER,OBS_YEAR*4,OBS_MONTH*2,OBS_DAY*2,
     + OBS_TIME*4,DATA_TYPE*2,Q_POS*1,Q_DATE_TIME*1,Q_RECORD*1,
     + UP_DATE*8,BUL_TIME*12,BUL_HEADER*6,SOURCE_ID*4,
     + STREAM_IDENT*4,QC_VERSION*4,DATA_AVAIL*1,PROF_TYPE*4,DUP_FLAG*1,
     + DIGIT_CODE*1,STANDARD*1,PCODE*4,Q_PARM*1,SRFC_CODE*4,
     + SRFC_PARM*10,SRFC_Q_PARM*1,IDENT_CODE*2,
     + PRC_CODE*4,VERSION*5,ACT_CODE*2,ACT_PARM*4,
     + PROFILE_SEG*2,D_P_CODE*1,PROF_Q_PARM*1
C
      INTEGER*2 STN_NUMBER,NO_PROF,NPARMS,NUM_HISTS,NO_SEG,NSURFC,
     + NO_DEPTHS
C
      INTEGER*4 PRC_DATE,ONE_DEG_SQ
C
      REAL*4 LATITUDE,LONGITUDE,DEEP_DEPTH,PARM,AUX_ID,ORIG_VAL,
     + DEPTH_PRESS,PROF_PARM
C
C     *****************************************************************
C
      INTEGER*4 INUNIT1,OUNIT1,INUNIT2,OUNIT2,INUNIT,OUNIT,INUNIT3,
     + OUNIT3,IUMSGNO
      CHARACTER*28 RETKEY,CRN*10,YEAR*4,MONTH*2,DAY*2,HOUR*4,PTYPE*4
      CHARACTER*8 MKEY,MKEY2,STREAM_SOURCE*1,UFLAG*1,KEY6*6
C
      DATA NRECS/0/
C
C
C     Entry to read an ocean record in sequential format
C
      READ (INUNIT3,END=180,ERR=170) MKEY,ONE_DEG_SQ,CR_NUMBER,
     + OBS_YEAR,OBS_MONTH,OBS_DAY,OBS_TIME,DATA_TYPE,IUMSGNO,
     + STREAM_SOURCE,UFLAG,STN_NUMBER,LATITUDE,LONGITUDE,Q_POS,
     + Q_DATE_TIME,Q_RECORD,UP_DATE,BUL_TIME,BUL_HEADER,SOURCE_ID,
     + STREAM_IDENT,QC_VERSION,DATA_AVAIL,NO_PROF,NPARMS,NSURFC,
     + NUM_HISTS,(NO_SEG(I),PROF_TYPE(I),DUP_FLAG(I),DIGIT_CODE(I),
     + STANDARD(I),DEEP_DEPTH(I),I=1,NO_PROF),(PCODE(I),PARM(I),
     + Q_PARM(I),I=1,NPARMS),(SRFC_CODE(I),SRFC_PARM(I),SRFC_Q_PARM(I),
     + I=1,NSURFC),(IDENT_CODE(I),PRC_CODE(I),VERSION(I),PRC_DATE(I),
     + ACT_CODE(I),ACT_PARM(I),AUX_ID(I),ORIG_VAL(I),I=1,NUM_HISTS)
      NRECS = NRECS + 1
      DO 18 J=1,NO_PROF
18    READ(INUNIT3) MKEY2,ONE_DEG_SQ,CR_NUMBER,OBS_YEAR,OBS_MONTH,
     + OBS_DAY,OBS_TIME,DATA_TYPE,IUMSGNO,PROF_TYPE(J),
     + PROFILE_SEG(J),NO_DEPTHS(J),D_P_CODE(J),(DEPTH_PRESS(J,I),
     + PROF_PARM(J,I),PROF_Q_PARM(J,I),I=1,NO_DEPTHS(J))
C
      NRTN3 = 1
      RETURN
180   NRTN3 = 2
      RETURN
170   WRITE (6,6001) NRECS
6001  FORMAT (' ERROR IN READING SEQUENTIAL FORMAT AFTER RECORD',I6)
      STOP
C
C     Entry to write an ocean record in sequential format
C
      ENTRY WRITE_OCEAN_STN_SEQ (MKEY,IUMSGNO,STREAM_SOURCE,UFLAG,
     + OUNIT3)
C
      WRITE (OUNIT3) MKEY,ONE_DEG_SQ,CR_NUMBER,OBS_YEAR,OBS_MONTH,
     + OBS_DAY,OBS_TIME,DATA_TYPE,IUMSGNO,STREAM_SOURCE,UFLAG,
     + STN_NUMBER,LATITUDE,LONGITUDE,Q_POS,
     + Q_DATE_TIME,Q_RECORD,UP_DATE,BUL_TIME,BUL_HEADER,SOURCE_ID,
     + STREAM_IDENT,QC_VERSION,DATA_AVAIL,NO_PROF,NPARMS,NSURFC,
     + NUM_HISTS,(NO_SEG(I),PROF_TYPE(I),DUP_FLAG(I),DIGIT_CODE(I),
     + STANDARD(I),DEEP_DEPTH(I),I=1,NO_PROF),(PCODE(I),PARM(I),
     + Q_PARM(I),I=1,NPARMS),(SRFC_CODE(I),SRFC_PARM(I),SRFC_Q_PARM(I),
     + I=1,NSURFC),(IDENT_CODE(I),PRC_CODE(I),VERSION(I),PRC_DATE(I),
     + ACT_CODE(I),ACT_PARM(I),AUX_ID(I),ORIG_VAL(I),I=1,NUM_HISTS)
      DO 181 J=1,NO_PROF
      READ (MKEY,3001) KEY6,NREC
3001  FORMAT (A6,I2.2)
      NREC = NREC + 1
      WRITE (MKEY,3001) KEY6,NREC
181   WRITE (OUNIT3) MKEY,ONE_DEG_SQ,CR_NUMBER,OBS_YEAR,OBS_MONTH,
     + OBS_DAY,OBS_TIME,DATA_TYPE,IUMSGNO,PROF_TYPE(J),PROFILE_SEG(J),
     + NO_DEPTHS(J),D_P_CODE(J),(DEPTH_PRESS(J,I),PROF_PARM(J,I),
     + PROF_Q_PARM(J,I),I=1,NO_DEPTHS(J))
C
      RETURN
C
C     Entry to list the contents of an ocean station record
C
      ENTRY LIST_OCEAN_STN
C
      WRITE (6,4002)
4002  FORMAT (/'******************************')
C
      WRITE (6,4003) ONE_DEG_SQ,CR_NUMBER,OBS_YEAR,OBS_MONTH,OBS_DAY,
     + OBS_TIME
4003  FORMAT (' ONE_DEG_SQ',1X,I6,3X,'CR_NUMBER',1X,A14,2X,'OBS_DATE',
     + 1X,A4,2A2,3X,'OBS_TIME',1X,A4)
C
      WRITE (6,4004) DATA_TYPE,STN_NUMBER,LATITUDE,LONGITUDE
4004  FORMAT (' DATA_TYPE',1X,A2,3X,' STN_NUMBER ',I5,3X,' LATITUDE',
     + F10.4,3X,'LONGITUDE',F10.4)
C
      WRITE (6,4005) Q_POS,Q_DATE_TIME,Q_RECORD,UP_DATE
4005  FORMAT (' Q_POS ',A1,3X,'Q_DATE_TIME ',A1,3X,'Q_RECORD ',A1,3X,
     + 'UP_DATE ',A8)
C
      WRITE (6,4011) BUL_TIME,BUL_HEADER,SOURCE_ID,STREAM_IDENT,
     + QC_VERSION,DATA_AVAIL
4011  FORMAT (' BUL_TIME',1X,A12,2X,'BUL_HEADER',1X,A6,2X,
     + 'SOURCE_ID',1X,A4,2X,'STREAM_IDENT',1X,A4/' QC_VERSION ',A4,3X,
     + 'DATA_AVAIL ',A1)
C
      WRITE (6,4006) NO_PROF,(NO_SEG(I),PROF_TYPE(I),DUP_FLAG(I),
     + DIGIT_CODE(I),STANDARD(I),DEEP_DEPTH(I),I=1,NO_PROF)
4006  FORMAT (' VECTOR OF ',I2,' PROFILE DESCRIPTORS - NO_SEG,'
     + ' PROF_TYPE, DUP_FLAG,'/'   DIGIT_CODE, STANDARD, DEEP_DEPTH'/
     + (I5,1X,A4,1X,A1,1X,A1,1X,A1,F8.1,'.'))
C
      WRITE (6,4007)  NPARMS,(PCODE(I),PARM(I),Q_PARM(I),I=1,NPARMS)
4007  FORMAT (' VECTOR OF ',I2,' STATION LEVEL PARAMETERS - PCODE,'
     + ' PARM, Q_PARM'/3(3X,A4,F10.4,3X,A1))
C
      WRITE (6,4014) NSURFC,(SRFC_CODE(I),SRFC_PARM(I),SRFC_Q_PARM(I),
     + I=1,NSURFC)
4014  FORMAT (' VECTOR OF ',I2,' STATION CHARACTER FIELDS - ',
     + 'SRFC_CODE, SRFC_PARM, SRFC_Q_PARM'/3(3X,A4,1X,A10,1X,A1))
C
      WRITE (6,4010) NUM_HISTS,(IDENT_CODE(I),PRC_CODE(I),VERSION(I),
     + PRC_DATE(I),ACT_CODE(I),ACT_PARM(I),AUX_ID(I),ORIG_VAL(I),
     + I=1,NUM_HISTS)
4010  FORMAT (' VECTOR OF ',I3,' HISTORY RECORDS'/
     + ' - IDENT_CODE, PRC_CODE, VERSION, PRC_DATE, ACT_CODE, ',
     + 'ACT_PARM, AUX_ID, ORIG_VAL'/
     + (1X,A2,1X,A4,2X,A4,2X,I8,2X,A2,2X,A4,2X,F9.3,2X,F9.3))
      DO 50 J=1,NO_PROF
50    WRITE (6,4012) NO_SEG(J),PROF_TYPE(J),NO_DEPTHS(J),D_P_CODE(J),
     + (DEPTH_PRESS(J,I),PROF_PARM(J,I),PROF_Q_PARM(J,I),I=1,
     + NO_DEPTHS(J))
4012  FORMAT (' PROFILE: SEG =',I3,3X,'TYPE ',A4,3X,'NO_DEPTHS ',
     + I5,3X,'D-P-CODE ',A1/' - DEPTH_PRESS, PROF_PARM, PROF_Q_PARM'/
     + 3(6X,F8.1,F8.3,1X,A1))
      RETURN
      END