The HEASARC Database System
Accessing the HEASARC Database and Metadata
This section describes three mechanisms through which (primarily internal) users can query the HEASARC database interactively beyond standard Browse user table queries. Users may wish to examine the HEASARC metadata, perform queries that cannot be made by Browse directly, investigate problems with table queries, or have other reasons for such access. APIs for software access are discussed elsewhere.
The three mechanisms are the HDBmc command-line interface, the Browse Keyword Search interface to query HEASARC metadata, and vendor-supplied tools for database access. After discussing these access methods, a few examples of the kinds of queries a user might attempt using either HDBmc or other access methods supporting SQL queries are given.
The HDBmc command
The HDBmc command is available for querying the HEASARC database from most HEASARC servers in the /heasarc/bin/ directory. This command provides a tool for doing a single simple query of the database or metadata.
The HDBmc command may be used in the following manner:
HDBmc observatory=AnObservatoryName HDBmc table=ATableName HDBmc parameter=ATableName HDBmc sql="Any sql query"
The optional command-line parameter -host=hostname[:port] may be used to specify the database host name and socket port. The default port is 40975. This argument can be used to connect to the backup database if desired, but it is mostly used when debugging databases or software. There are four commands available within HDBmc.
HDBmc: observatory
The observatory command lists tables available for a given observatory with each table giving a pair of lines of the form
table_name=sometable observatory_name=someobservatory
The observatory may include a '*' as a wildcard character. This query uses the observatory name as specified in the ZZEXT observatory_name virtual parameter.
Example:
HDBmc observatory="GENERAL CATALOG"
Response:
table_name=B/2mass observatory_name=GENERAL CATALOG table_name=I/252 observatory_name=GENERAL CATALOG table_name=I/271 observatory_name=GENERAL CATALOG table_name=heasarc_class observatory_name=GENERAL CATALOG table_name=heasarc_messier observatory_name=GENERAL CATALOG table_name=heasarc_ngc2000 observatory_name=GENERAL CATALOG table_name=heasarc_pg observatory_name=GENERAL CATALOG
HDBmc: table
The table request returns the ZZGEN and ZZEXT metadata associated with a given table as a list of the form
field=value
where field gives the values in the ZZGEN table and then any virtual parameters for this table from the ZZEXT table. The input table name can be specified as a wildcard so that the output can include many tables. Output for each table begins with a line of the form
table_name=theName
Example:
HDBmc table=heasarc_messier
Response:
table_name=heasarc_messier table_location=dbms1.gsfc.nasa.gov table_description=Messier Nebulae table_document_url=http://heasarc.gsfc.nasa.gov/W3Browse/general-catalog/messier.html create_date=2003-05-05 18:39:01 modify_date=2003-05-05 18:39:01 table_rows=109 equinox=2000 unique_key=name declination=@dec table_priority=3 table_type=Object target_name=@name right_ascension=@ra default_search_radius=60 frequency_regime=Optical observatory_name=GENERAL CATALOG
HDBmc: parameter
The parameter request gives the details for the parameters of the table. A new parameter is signaled by a pair of lines of the form
table_name=aTableName parameter_name=aParameterNAme
Example:
HDBmc parameter=heasarc_messier
Response:
table_name=heasarc_messier parameter_name=object_type parameter_description=Object Category parameter_format=char2 parameter_unit= parameter_is_index=Y parameter_minval=DI parameter_maxval=S parameter_comment= parameter_default=0 table_name=heasarc_messier parameter_name=bii parameter_description=Galactic Latitude parameter_format=float8 parameter_unit=degree parameter_is_index=Y parameter_minval=-51.932565 parameter_maxval=84.424982 parameter_comment= parameter_default=0 table_name=heasarc_messier parameter_name=lii parameter_description=Galactic Longitude parameter_format=float8 parameter_unit=degree parameter_is_index=Y parameter_minval=1.722578 parameter_maxval=356.865940 parameter_comment= parameter_default=0 ...
HDBmc: sql
The SQL command lets the user send an arbitrary SQL query to the database. The result will be pipe delimited table. Note that the SQL statement needs to be enclosed in quotes to ensure that it is seen as a single token. Double quotes are recommended since SQL uses single quotes as the delimiter for strings. E.g.,
HDBmc sql="select * from zzext where table_name='zzdp'"
would return all of the ZZEXT entries for the ZZDP table (the shortcuts discussed in the previous chapter).
Example:
HDBmc sql="select * from heasarc_messier"
Response:
+--------------+----------------------+-----------+--------------+----------------------+... |alt_name |bii |class |constell |dec | +--------------+----------------------+-----------+--------------+----------------------+ |NGC 6475 | -4.5212902| 3600|SCO | -34.816664907176| |NGC 6637 | -10.2731749| 3080|SGR | -32.350009145856| |NGC 6681 | -12.5107943| 3080|SGR | -32.300012468954| |NGC 6405 | -0.7155348| 3600|SCO | -32.21666091568| |NGC 6809 | -23.2733634| 3080|SGR | -30.966694770854| |NGC 6715 | -14.0974428| 3080|SGR | -30.483349176839| |NGC 6266 | 7.3185708| 3080|OPH | -30.116649736484| ... |NGC 3034 | 40.5596785| 6600|UMA | 69.683390351975| +--------------+----------------------+-----------+--------------+----------------------+ (109 rows)
(Note that only the first five fields of the output were shown.)
Note: The HDBfilter utility is extremely useful for formatting the results of a SQL query made using HDBmc.
A more detailed discussion (suitable for programmers) pertaining to the HDBmc command and its inner workings is in another document.
Querying HEASARC Metadata Using Browse Keyword Search
The Browse Keyword Search interface (help) provides a way that users can query Browse metadata through the Web. If users enter the name of a Browse metadata table, then that table will be queried. If there are other keyword parameters, then the specified metadata table will be queried for values for the Browse user tables whose metadata matches the other keywords.
E.g., to simply dump the first few values in the ZZGEN table the user can enter
zzgen
But if the user wants only the XMMMASTER entry for ZZGEN they could try
zzgen xmmmaster
Or to get the ZZGEN entries for all ASCA tables
zzgen asca
Keywords are searched against all metadata so one can get more results than one expects. E.g.,
zzgen xtemaster
will return results for both XTEMASTER and XTEINDEX since XTEINDEX mentions XTEMASTER in its metadata. If the user is searching a table that does not have a table_name field, then the additional keywords should not be specified. The table name is matched against the list of tables that the user has specified.
Users can put qualifications on the query themselves. E.g.,
zzext table_name=zzdp
will list the data product shortcuts which are stored in the ZZEXT table with the table_name set to zzdp. Or try
zzext parameter_name=observatory_name
to get the mission/observatory associated with each table.
The ZZGEN, ZZPAR, ZZEXT, ZZDP, ZZDPSETS, ZZMASTER, ZZWORDS, ZZLINK, and ZZBIB tables can all be queried using this interface. One note regarding querying the ZZBIB table: It is possible for Browse Keyword Search interface to get confused with the bibcode strings. These often have multiple consecutive periods which can be interpreted as the Browse parameter value range operator ("..").
The row and table limits of the keyword search interface apply to these metdata queries. Only the first 200 matching rows will be shown by default if keywords are specified only the top 10 matching tables will be included. The rows and tables qualifiers can be used to change these settings. E.g.,
zzpar x-ray r:all t:all
will describe all parameters of all rows for any table that includes x-ray in its metadata.
Vendor-Supplied Methods of Accessing the HEASARC
Database
All major database vendors provide tools to access their respective database systems. These tools usually need to run on the machines on which the database software is installed, but they provide full access to the capabilities of the database. With these tools users can create, view, modify and query tables. The specific vendor-supplied tools available at the HEASARC are discussed in the appendix. Generally, they provide a mechanism for executing an arbitrary SQL command and displaying the results to a terminal. More sophisticated GUI tools may also be available.
Typically, database configuration and performance tuning are possible only through these vendor-supplied interfaces.
Common Database Tasks
While the Browse tools simplify access to the database, they provide only read access to the database and do not give the users access to all the query capibilities availble through SQL. In this section we describe a few queries that cannot be executed in Browse but might be of interest to a user. This is not intended as an SQL primer. SQL documentation is supplied with the vendor-supplied database system and is also widely available on the Web.
List all of the tables in the database
SQL: select table_name from zzgen
Note that this will include tables which are not displayed in Browse either because they are metadata tables, they are missing required ZZEXT metadata (e.g., a mission to be associated with), or they have been explicitly turned off using the w3browse_display=N virtual parameter mechanism. An equivalent query can be done through the Browse Keyword Search interface.
List all of the tables with data products
SQL: select distinct table_name from zzdpsets
Note the usage of the distinct attribute since we don't want to display a table multiple times when if it has multiple data products.
List all of the missions that have data products
SQL: select distinct parameter_value from zzext where table_name in (select distinct table_name from zzdpsets) and parameter_name='observatory_name'
This involves a join of tables. We put on a requirement that the tables we're interested have data products and then display the missions.
Fix a particular value in a table where it is wrong in only a single location
SQL: update heasarc_table set theparameter=newValue where _unique_id=12345
This uses the SQL update command. Normally, updates should be done by modifying the appropriate TDAT file and re-ingesting the table (see the next section), but this can be used under some circumstances. Note the use of the '_unique_id'. Almost all tables have this field which increments for each row so that it is guranteed to be unique. This field should be used in preference to other fields if you want to be absolutely certain you are modifying only a single row.
Change a field value whereever it has a certain value
Suppose a table uses the string 'STORED' to indicate the data is in our archive, but you wish to change that string to 'ARCHIVED' to be more compatible with the rest of our tables.
SQL: update heasarc_table set status='ARCHIVED' where status='STORED'
The system will normally give an indication of the number of rows updated. Once such a change has been made, you may wish to immediately exgest the table to a TDAT file after making such changes.
Create a copy of a table
SQL select * into newtable from heasarc_table
This syntax is not standard SQL, but it is widely supported. Note that this does not copy the metadata.
Note: You can also create a copy of a table including the metadata easily using the HDBexgest tool to export the table as a TDAT. Then, after a simple modification of the TDAT file, you can use the HDBingest tool to ingest the new table. This is probably the preferred procedure under most circumstances and is descibed in another section.
Copy the ZZEXT metadata from an old table to a new table
E.g., the table created in the previous example.
SQL insert into zzext select "newtable", parameter_name, parameter_value from zzext where table_name='heasarc_table'
This is a little more tedious for the other tables, since they have more fields, but it is easy enough to write a script and then edit it when needed. Note how we do not just copy the old table name (heasarc_table) into zzext.
Once you have a copy of the table and metadata, you may modify the new table however you like. The HDBexgest tool can be used to save the table you end up with.
How many objects of each different class are there in the WGACAT catalog?
SQL: select distinct count(*),class from heasarc_wgacat group by class order by class
Using the group by clause is advanced SQL. The standard Browse interface does not expose such advanced querying methods.
Documentation prepared by the HEASARC Database Group
HEASARC Home | Observatories | Archive | Calibration | Software | Tools | Students/Teachers/Public
Last modified: Wednesday, 01-Feb-2006 11:14:05 EST