Federal CIO Council

XML Working Group

 

Wednesday, March 20, 2002 Meeting Minutes

 

GSA Headquarters

18th & F Streets, N.W., Room 5141

Washington DC 20405

 

Please send all comments or corrections to these minutes to Glenn Little at glittle@lmi.org.

 

Mr. Ambur opened the meeting by introducing himself, briefly explaining the day’s focus, and asking all participants to introduce themselves. He then turned the meeting over to Ms. Lisa Carnahan of the National Institute of Standards and Technology (NIST) for an account of NIST’s XML Registry pilot.

 

Ms. Carnahan mentioned that she did not have the presentation she had intended to display, and that it was not on the website. [Editor’s note: Ms. Carnahan’s presentation is now available at http://xml.gov/presentations/nist/index.html.]

 

Mr. Ambur mentioned that the Department of the Interior has been disconnected from the Internet for three months, and he has therefore been unable to receive PowerPoint presentations. Otherwise, he would have made the slides available to the group. Mr. Ambur now uses DSL at his temporary home workplace, so he can accommodate larger files electronically. [Editor's note:  Internet access has now been restored at Mr. Ambur’s office, where he can once again be reached via E-mail at Owen_Ambur@fws.gov.]

 

 

Ms. Carnahan:

 

I want to talk a little about where we are on this registry, which will be housed at xml.gov. It’s a proof-of-concept registry. We’ll talk about the specific goals I have in mind for the Registry, and hopefully we’ll achieve rough consensus. I’d also like to talk a little bit about the Registry Project Team. We’ll examine the plan in the big scheme of things, and walk through it. Right now, we’re on a dial-up line, so we’ll see how it goes. (From this point on, Ms. Carnahan displayed a series of web pages.) The NIST folks are actively working on this, so we’ll see what happens.

 

I’d like to talk a little about the mission of the Registry. The mission of the XML Working Group (WG) is to educate people on the use of XML. The WG wouldn’t be doing its job if we didn’t try to get people together to avoid having multiple efforts defining the same vocabularies and not collaborating, if possible. The mission of the Registry itself is the facilitation of that effort. We need to get people in those horizontal areas to work together. It’s not a mandatory registry. It’s at NIST. It’s not a policy making tool. You’re not mandated to put things here. We encourage you to put XML-related documents here. If you can get some contact between agencies, and establish some synergy, then that’s what it’s for.

I don’t think there’s consensus in the federal world of what form registries will take. I don’t think there’ll be one big centralized registry; that’s not the solution. This might be a registry of registries. In terms of proof-of-concept, it’s based on ebXML version 1. We want it to be standards-based and to look at the services those standards offer. We can determine whether we think the services make sense. I’ve been talking to Joel Munter at Intel to explore the possibility of connecting to the UDDI registries.

 

Mr. Michael Jacobs:  Are you making use of ebXML version 1.0 or 2.0 in your list of requirements?

 

We started with version 1. DLA had already paid for it so we got it for free. I’m familiar with the specifications. It’s OK. Some of the services aren’t there, and some don’t need to be—so I’m comfortable with what it’s providing.

 

Mr. Marion Royal:  The ebXML repository is more than just a holder of schemas. It’s also the holder of trading partner agreements and profiles, business processes, and sample business processes. It’s envisioned that it’s the thing that’ll allow you to dynamically establish relationships between partners. It has a long way to go, but it’s part of the vision of the UN/CEFACT effort. They have many aspects that still need work, but that’s the vision.

 

Ms. Carnahan:  Yes, I agree, and some of the services that aren’t there don’t need to be anyway.

 

Mr. Jacobs:  So it’s proof-of-concept…

 

Ms. Carnahan:  It’s proof of concept. I plan to have that blatant there. It’s important that users understand that.

 

Unidentified participant:  If there is material that people want to share, I wouldn’t make this their only tool.

 

Dr. Glenda Hayes:  I’d encourage you to do it as soon as possible, because the acquisition and logistics people are asking why it isn’t there.

 

Mr. Ambur:  While this is a proof-of-concept, it represents a real need for the government, so the sooner we can move it to an operational status among a distributed set of registries, the better.

 

(Mr. Bruce Bargmeyer joined by phone)

 

Ms. Carnahan:  We had two goals—one was the services, the second was support for the distributed registry model. We needed to identify the requirements and procedural policy issues. We’re hopefully going to be working with DoD because they have an operational registry. We want folks to know there are registered items in there that should be considered as well. We’d like to see interaction between the registries so that we can determine policy, procedural, and technical requirements. So the goals are the services and the distributed model. The project team is a subset of the XML WG. At this point it’s limited to government employees and contractors in a real XML support effort. By XML support effort I’m referring to people who can make use of or are planning a registry.

 

Mr. Jacobs:  You mean a government-sponsored XML effort?

 

Ms. Carnahan:  Yes. Last time we had about eight or nine people in the room. We’re meeting this afternoon.

 

Mr. Royal:  One of the reasons we’re focusing on government people is because we anticipate it’ll be a procurement team. One of the first things we want to do is have a business case for the registry done. We reckon it’s a federal proposition, on the scale of FedBizOpps or FirstGov. The procurement people just went through with a search engine. That’s millions of dollars, so we have to consider that. The first thing we’ll have to do is provide OMB with a business case showing it makes sense, so that’s one of the things Lisa is doing now.

 

Ms. Carnahan:  Yes—we’re trying to get the lessons learned. The combination of the lessons learned and the business case will help us make the appropriate model. We don’t assume that what we have now is necessarily the right thing.

 

Mr. Royal:  We’ve done that in the past. With this we want to do some analysis before we move forward.

 

Ms. Carnahan:  Do some analysis and then develop a plan for it (if appropriate), maybe a “registry of registries” functionality. That’s where we are. This is the registry as it currently looks [as displayed on NIST’s Web page]. It’s on the Web right now. We don’t advertise because there’s no meaningful data as yet. You can sign up for an account, but there are still some problems. Someone will have to manually set you up.

 

Ms. Carnahan engaged in a discussion of the Toolbar links on the web page.

 

Mr. Royal:  There’s no link on xml.gov to this?

 

Ms. Carnahan:  No link as yet. Since you hadn’t seen this we didn’t want to put it on there until you saw it. We’re working with NIST and EPA, and this way we might be able to get some additional synergy going.

 

You can view or submit objects. Right now you can do it either as an artifact or as a new package. You create a container and insert more objects. You can use keywords. This doesn’t support the ebXML classification scheme functionality. We’re trying to come up with meaningful keywords. I try to think of them in terms of what makes a good query because that’s what you use them for. There will be multiple lists of them.
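
[Editor's note: As a rough illustration of the package-and-keyword arrangement described above, the following Python sketch models a container of registered objects that can be queried by keyword. The class names, fields, and sample file names are assumptions made for illustration only; they are not taken from the registry software.]

```python
from dataclasses import dataclass, field


@dataclass
class RegistryObject:
    """A single registered artifact, tagged with free-form keywords."""
    name: str
    keywords: set = field(default_factory=set)


@dataclass
class Package:
    """A container that holds registered objects."""
    name: str
    members: list = field(default_factory=list)

    def find(self, keyword):
        """Return the names of members tagged with the keyword (case-insensitive)."""
        wanted = keyword.lower()
        return [m.name for m in self.members
                if wanted in {k.lower() for k in m.keywords}]


# Hypothetical usage: create a container, insert objects, then query by keyword.
pkg = Package("parks")
pkg.members.append(RegistryObject("location.xsd", {"Parks", "location"}))
pkg.members.append(RegistryObject("events.xsd", {"calendar"}))
```

Thinking of keywords "in terms of what makes a good query" corresponds here to choosing tags that a call such as pkg.find("parks") would be expected to match.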

 

One of the things we’ll do is have you see the namespaces from the DoD XML Registry automatically.

 

Mr. Jacobs:  What’s the process for updating the software functionality of the registry?

 

Ms. Carnahan:  We’ve decided we’re not taking new software from DNC. We’re taking what we have and going from there. EPA has its own software as well.

 

Mr. Ambur:  In terms of classification schemas I’m interested in exploring the extent to which it supports the Administration’s E-Government Strategy. The Strategy identifies four portfolios: Government-to-Citizen (G2C), Government-to-Government (G2G), Government-to-Business (G2B), and Internal Efficiency and Effectiveness (IEE). I’d like the four portfolio managers to view the registry as an indispensable tool they can use to conduct their business more effectively. Toward that end, it would be good if the registered XML elements and schemas were classified according to the portfolios under which they fall.

 

Ms. Carnahan:  I have the list of the business lines. My question is, does it make sense to the people building the systems? It’s not an either/or situation, but what I don’t know is what makes sense to those people. You have coders, etc. registering XML schemas and DTDs in there. I don’t know if the business lines are meaningful.

 

Mr. Royal:  It’s important to remind ourselves that it’s intended to be used by developers and not end-users. I’ve been looking at the United Kingdom government work. They go back to the Dublin Core as a classification scheme.

 

Ms. Carnahan:  Maybe there are some marketing opportunities in the “awareness” piece. I wouldn’t think of putting a keyword in anything I do as research.

Mr. Ambur:  The question is, what's a reasonable scope for what we should try to accomplish in the pilot with respect to the classification of registered elements and schemas?

Ms. Carnahan:  This is already out there. There are a lot of relatively trivial samples registered, but nothing of import. I tried to put something in last night and I realized that it validates submissions as correct XML. It’s a feature that initially should be removed or relaxed. Currently it is an all-or-nothing proposition; if the XML is not valid, the whole submission is denied.

 

Mr. Ambur:  That’s an issue.

 

Mr. Royal:  I don’t understand why I would register a schema that’s not valid and parseable.

Ms. Carnahan:  I don’t know what process they’re using.

 

Unidentified participant:  Maybe we need to find out what process they’re using.

Mr. Ambur:  While it is important to have the capability to validate, from a usability standpoint, I agree with Lisa that we should not make it difficult for people to register their elements and schemas.

Mr. Royal:  The registry should check for valid and well-formed documents, but not do error checking such as saying “Your schema used a capital s, etc.” But we do need to examine which validator or parser we’re using to make sure it’s a common one.
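
[Editor's note: The approach discussed above, rejecting only documents that fail a basic parse while treating stylistic findings as advisory, might be sketched in Python as follows. This is a minimal illustration, not the registry's actual process, which was not known at the meeting. It checks well-formedness only, using the parser in the Python standard library; validation against a schema would require a separate validating parser. The function name and the sample "style" rule are assumptions.]

```python
import xml.etree.ElementTree as ET


def check_submission(xml_text):
    """Return (accepted, notes) for a submitted document.

    Only a well-formedness failure rejects the submission; style
    observations are returned as notes and never block registration.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, ["not well-formed: %s" % err]

    notes = []
    # Hypothetical style observation: advisory only, never a rejection.
    if not root.tag.startswith("{"):
        notes.append("root element has no namespace")
    return True, notes
```

Under this arrangement a submitter with a mismatched tag is turned away, while a submitter whose document merely departs from a naming or namespace convention is registered and told about it.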

 

A discussion ensued regarding an xml.gov Registry Project Team meeting to be held later in the day at GSA.

 

Ms. Carnahan:  We’re going to start at 1:00 today. Bruce, do you have the address? You can go to the archive.

 

Mr. Walt Houser (on telephone):  Is there going to be a conference call?

 

Ms. Carnahan:  Yes, but the phone number is different from this one, Walt. I’ll look it up.

Mr. Ambur:  We met in AIA's boardroom last month. We've had bad experiences with the teleconferencing system there in the past, but we figured that a “listen only” arrangement would be better than nothing, so that's what we provided. For the rest of the fiscal year we have this room and better two-way telecon capability. When no one responded on the telecon when the meeting began, I wondered whether it was worth continuing the service for future meetings, but the fact that Bruce and Walt are now on the line may have saved the service for future telecon participants.

Next we have Glenda Hayes, who is involved with what used to be the DoD DISA Registry but has now been renamed as the DoD Registry.

(While waiting for Ms. Hayes' presentation to load, Mr. Ambur made the following announcements.)

NARA had to lead the "eRecords" project under the Administration's eGov Strategy. One of the milestones they've been assigned to accomplish by February 28 of next year is to complete the records management and archival XML schema. In an early draft of NARA's project plan, they proposed to develop their own registry.  However, that is no longer in the plan and, particularly in light of the short timelines for action under the eGov Strategy, it would certainly make sense for them to use the DoD Registry, at least initially, until the need for a separate registry can be established and justified.

Brand Niemann isn't here today because he's at FOSE where he's showing a VoiceXML-enabled application EPA has developed. They have 3000 local emergency response teams identified in a database they were making available on the Web. By VoiceXML-enabling the database, they're making the same information available by telephone, which could be critical in the event that access to the Internet is unavailable in an emergency.  [Editor's note: Mr. Niemann and his team received a special recognition award from the CIO Council for technology innovation.]

Mr. Ambur displayed the "Extending Digital Dividends" guide produced by GSA and noted that it has been rendered as a digital talking book.  [Editor's note: It will be presented at the April 17 meeting of the XML WG.]

Lee Holcomb, who chairs the Architecture and Infrastructure group to which this group reports, will be testifying on the Hill tomorrow and will be referencing the XML Workgroup. Next month we had hoped to have a Web services interoperability presentation, but it conflicts with their meeting, which means we have an open slot for next month, and for the May agenda as well.

We're working with the FTS folks at GSA to voice-enable the Blue Pages, which are the government listings in the telephone directories.  We're also getting them together with the Geospatial One-Stop folks at USGS to geospatially enable the Blue Pages, so that government functions will be associated not only with telephone numbers but also geographic service areas.  Doing so will mean that citizens can be connected to the appropriate office without having to know what the telephone number is or where the office is physically located.  The system will automatically determine that for them, based upon where they are calling from or where they need government services delivered.

 

Dr. Hayes:

 

Dr. Hayes:  I support both DoD and IRS.

 

Today’s talk is to provide a status update and a demonstration of the DoD XML Registry. There are two parts:

 

  1. The status of the XML Registry
  2. The status of a policy called the “Common Operating Environment” for DoD adoption.

 

Notice that we’re participating in NIST’s work.

Mr. Ambur:  GAO has been studying the Executive Branch's implementation of XML. I understand their report is due to be released at the end of this month. The draft that I reviewed took OMB and NIST mildly to task for not being more directive in providing guidance to agencies.  While some have suggested that lets this workgroup off the hook, we do have a responsibility to advise the CIO Council and OMB with respect to XML guidance that makes sense. [Editor's note:  GAO's report has been released and is available at http://www.gao.gov/new.items/d02327.pdf]

Dr. Hayes' presentation is available at http://xml.gov/presentations/mitre3/index.html

Dr. Hayes:  It’s on the NIPRNET with Internet access. It’ll be on the SIPRNET as well.

 

Mr. Ambur:  Does everyone know what they are?

 

Dr. Hayes:  It’s a military network. The intention is that everything in the Internet version would be on the internal version, with the addition of classified schemas. There are classified schemas in development that need to be registered. 

Mr. Ambur:  The national defense intelligence folks are working on a schema for sharing data for homeland security.

Dr. Hayes:  We’re tagging this data with secondary marking tags to understand who we can disseminate it to.  The super secret network needs to be able to pass the data down. We can’t necessarily do it right now.

 

Mr. Ambur:  It’s called the Intelligence Community Markup Language.

 

Dr. Hayes:  The first act they were involved with was the development of a schema for a general document. Rather than being proprietary, it would be XML. They’ve been pursuing that activity.

 

Slide 4:  The registry has now implemented added features. We can search by date range and valid value. We have online submission capability and password protection. We’re trying to start working on on-line status changes and reports for namespace managers.

 

Slide 5:  We’ve organized the registry by communities of interest. We’ve felt it necessary to have an organization that’s responsible for the configuration management of that community of interest. We’ve tried to share the responses as much as possible. We didn’t try to identify all of the communities of interest. This is part of our lessons learned from a prior activity. We have the ability for people to propose and bring in new communities of interest…people who share a common need.

 

In some cases, trying to reach agreement in one community of interest can be hard, so in those cases, we’ve divided them up into namespaces. We need a place to hold some that are in flux, with no home.

 

Mr. Jacobs:  Is that a change in nomenclature—that green boxes are communities of interest, and white boxes are namespaces?

 

Dr. Hayes:  We recognize that some of these groups that we’re calling namespaces need a further subdivision. The XML arrangement is flat. We went back to the original terminology and named it. That was our initial desire.

 

Mr. Jeffrey Smith:  Will there be overlap where two communities of interest would share namespaces?

 

Dr. Hayes:  No. We wanted to maintain responsibility, so there’s an easy button to push. There’s a lot of collaboration between these communities.

 

Ms. Carnahan:  There’s other agency participation.

 

Dr. Hayes:  The controlled export represents a consortium of not only DoD but the States. There’s interest in how we can support NARA.

 

Mr. Ambur:  DoD is the steward for DoD 5015.2 standard for records management, which is applicable to all Federal agencies.  NARA has endorsed the standard for use by all U.S. federal agencies.  With reference to the XML schema for records management due by February 28, 2003, it would make sense for NARA to use the DoD Registry, rather than trying to develop their own registry within the 18- to 24-month timeline established for results under the Administration's eGov Strategy.

 

Dr. Hayes:  A significant number of those communities of interest have involvement with standardization. We’re trying to create an awareness that there are competing consortia—trying to get some agreement.

 

Slide 6:  Here’s the inventory as of a couple of days ago. It’s spiky but we’ve been getting lots of participation. We have the ability to register the schema data type. There’s an interest in expanding the types of information resources, such as whether to represent the full trading partner agreements as well as the processing and work flow. We’re feeling our way through this. Considering the schema of the registry, it wouldn’t be difficult to expand to that set. We have a generic model in the Oracle database, so we’re able to expand.

 

The quantity alone isn’t a measure of success, but we’re starting to see a significant  amount of reuse. We also have targeting folders, global command and control, and time critical targeting.

 

Slide 7:  This represents the Data Emporium.

 

Mr. Jacobs:  You talk about the number not being a measure of success. Do you plan any metrics in the future, any automated metrics?

 

Dr. Hayes:  Yes. I thought we had a subscription capability. We don’t. The contractor tells us the next one will—so we can specify applications of schemas. We think that’ll be a measure of success. We’re trying to put this registry out with Internet access, but we have to be careful about specifying which applications use which schemas on the Internet. That’ll have to be behind a password and in some cases behind a “classified” firewall.

 

Unidentified participant:  It might not be important for people to know which applications are using it, but rather how many. Is that how it is?

 

Dr. Hayes:  In my organization it’s important to show who.

 

Mr. Smith:  You have to know who’s using it so you don’t blow someone’s application.

 

Mr. Royal:  That’s done by versioning.

 

Dr. Hayes:  [Unknown] would let them subscribe to it. They…

Mr. Ambur: The civilian agencies should have greater freedom to subscribe than the defense agencies.

Dr. Hayes:  Not with the IRS.

Mr. Ambur:  The commercial sector is having problems figuring out how to use digital book standards without risking the loss of intellectual property rights. In the public sector that's not an issue, so that's an instance in which we should be able to get ahead of the commercial sector in serving our stakeholders.

Mr. Jacobs:  Once you’ve identified a schema that lots of systems use, there’s the issue of ownership—now many systems are using it—any conclusions on the best way to handle that, as systems would like it changed?

 

Dr. Hayes:  Those are difficult challenges that have to be worked into acquisition. I don’t think we’ve done really well on that. This is a common operating environment. All the command and control programs are required to use it. It has implications on the services. Many agencies are moving to adopt it. NSA has said they plan to.

 

There are two rules of guidance:

  1. Reuse what’s in there.
  2. Process namespaces for new communities of interest on an as-needed basis.

 

When someone agrees to manage it, there’s a burden. Our hope is that you would piggyback on existing configuration management of data activity—not establish a new bureaucracy that would compete. This guidance that exists within the COE was…notice the date of August 2000.

 

Slide 9:  We received comments and turned them around quickly. For whatever reason, that hasn’t moved forward.

 

We effected the change in terminology from COE XML Registry to DoD XML Registry in “hours”. [This was stated in jest.]

 

Mr. Smith:  Version 8 of the Implementation Plan is in draft—the plan is to just reflect what the final policy looks like?

 

Dr. Hayes:  They don’t want a different XML policy. They want to fix the data plan. Right now they’re working on a draft of the XML Registry policy, which will be issued as a separate policy. One of the objectives was to identify overlaps and work toward convergence.

 

Slide 11:  Within the namespace managers format we established a convergence task force. The finance and accounting namespace suggested we find a victim—find a postal address for the DoD identifier. If we could reach convergence within DoD we’d have more influence in industry.

 

Ms. Carnahan:  Did you look at the article in the news about names and addresses?

 

Dr. Hayes:  We’re going to have a schema for delivery address and have transformations done to the common ones.

 

Mr. Jacobs:  There was a forum on how complex it is.

 

Dr. Hayes:  The more you reach convergence, the more you water down the schema.

 

Mr. Royal:  There’s a new technical committee at OASIS looking at contact information. 

 

Dr. Hayes:  Isn’t that part of the CIQ?

 

Ms. Carnahan:  Yes.

 

Dr. Hayes:  We’ve also been briefed by Jon Bosak. He’s been seeking a DoD liaison on the UBL committee. We don’t know whether we can swing that. We started the pursuit of common domains. Slide 12 has some presumptions.

 

Dr. Hayes read the presumptions to Mr. Bargmeyer.

 

Dr. Hayes:  Borrowing from our experience on these reference sets, we think people will use common domains if they’re easy to use. The next slide shows what a reference set is.

DISA and the Shared Data engineering group have harvested those terms from some of the good things that have happened in the data-sharing program and created over 800 of these sets. In one instance, there was a periodic download of these reference sets. We didn’t know what it was at first.  It turned out that there was someone from the secret network getting them—to put them on the SIPRNET.

 

It’s worthwhile to adopt a frame of reference from common data types. Elements provide the role or context for these data types. It’s similar to the IRS.

 

Slide 14:  I maintain that developers may reuse a data type, but not like the name. We can improve interoperability if we can get reuse of the domains.

 

Slide 15:  If you did create common domains, you can think of them in terms of Slide 15. For the common operating environment picture, it’s important to know the country codes, but there are also some codes that aren’t included. There’s also a need for exercise codes. XSchema provides an opportunity for union domains: the possibility of interoperability through growing domains that can be “unioned” together. A standard set of unit-of-measure codes and precision codes would be very useful.
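
[Editor's note: The idea of "unioned" domains can be illustrated with a small Python sketch in which a combined domain accepts any value belonging to one of its member code lists. The country codes shown are a small sample of two-letter ISO 3166 codes; the exercise codes and all names are invented for illustration and do not reproduce any DoD reference set.]

```python
# Hypothetical code lists; the real DoD reference sets are not reproduced here.
COUNTRY_CODES = frozenset({"US", "GB", "FR"})   # sample ISO 3166 alpha-2 codes
EXERCISE_CODES = frozenset({"X1", "X2"})        # made-up exercise-only codes

# A "unioned" domain accepts a value that belongs to any member domain.
OPERATING_PICTURE_CODES = COUNTRY_CODES | EXERCISE_CODES


def in_domain(value, domain):
    """Return True when value is a member of the given code domain."""
    return value in domain
```

A system that understands only the country domain can still interoperate on the shared subset, while a system that needs exercise codes uses the larger, unioned domain.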

Mr. Ambur:  With respect to unions in the sense of database operations, it is noteworthy that the opening line of the Constitution is "We the people ... in order to create a more perfect union ..."  In my view, that's a requirements statement for XML.

Dr. Hayes:  We hope to have these as the first items to go on the enterprise. There are two types:

  1. Normal types
  2. Simple types.

 

Slide 17:  (Country code schema). In many cases we’ve piggybacked on the Navy, and we’ve extended that in some cases. We’ve worked on enumeration. We’re trying to come up with a standard means of expressing that enumeration value, because the intelligence community deals with multiple languages.

 

Mr. Royal:  What’s the difference between the “document” and the “application?”

 

Dr. Hayes:  The document is meant to be human-readable text. Application information is for an application to parse and use.

 

Mr. Royal:  What is the meaning, though? What’s the difference between that and annotation?

 

Dr. Hayes:  Annotation doesn’t have a free form text field.

 

Mr. Royal:  The annotation documentation does.

 

Dr. Hayes:  The meaning would be a value—so that instead of seeing the code you’d see the value.
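
[Editor's note: In the W3C XML Schema language, an annotation element may carry both documentation (intended for human readers) and appinfo (intended for applications). The Python sketch below shows one way an enumerated code could carry both, so that software can display a value in place of the code. The schema fragment, the CountryCode type name, and the meaning element are hypothetical and are not taken from the schema shown at the meeting.]

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema fragment: one enumerated code carrying both kinds of annotation.
SCHEMA = """
<xs:simpleType xmlns:xs="http://www.w3.org/2001/XMLSchema" name="CountryCode">
  <xs:restriction base="xs:string">
    <xs:enumeration value="US">
      <xs:annotation>
        <xs:documentation xml:lang="en">United States</xs:documentation>
        <xs:appinfo><meaning>UNITED STATES</meaning></xs:appinfo>
      </xs:annotation>
    </xs:enumeration>
  </xs:restriction>
</xs:simpleType>
"""


def describe_codes(schema_text):
    """Map each enumerated code to its human-readable and machine-readable notes."""
    root = ET.fromstring(schema_text)
    result = {}
    for enum in root.iter("{%s}enumeration" % XS):
        doc = enum.find("{%s}annotation/{%s}documentation" % (XS, XS))
        meaning = enum.find("{%s}annotation/{%s}appinfo/meaning" % (XS, XS))
        result[enum.get("value")] = {
            "documentation": doc.text if doc is not None else None,
            "appinfo_meaning": meaning.text if meaning is not None else None,
        }
    return result
```

The xml:lang attribute on the documentation element is one possible way to carry the same meaning in several languages, which may be relevant to the multiple-language concern raised for Slide 17.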

 

Mr. Ambur:  Glenda, we’re over our time.

 

Slide 18:  I said ISO to FIPS isn’t trivial …here’s a graphic of the challenges involved.

 

Slide 19:  This shows our role in the NIST registry proof-of-concept effort.

 

Dr. Hayes displayed a portion of the DISA registry and explained how to conduct a search. She mentioned that she was going via telephone into the NIPRNET (behind the firewalls).

 

Dr. Hayes has created an Access database at Mitre as a helpful tool. DoD is not yet using it, but if the database works, there may be an easy translation to DoD use.

 

Mr. Ambur:  Are there any questions for Glenda?

 

Mr. Royal:  ebXML compatibility—are you still working on that?

 

Dr. Hayes:  Yes, that’s one of the objectives in working with the NIST proof-of-concept. The DoD registry has yet to break a million dollars.

 

Dr. Hayes:  We don’t want to abandon our environment before it’s proven. What DoD wants to pursue is the development of an API into our registry.

 

Ms. Carnahan:  I don’t imagine all registries will implement the information model as it’s specified—but as long as you can get information in and out using the services, no one cares what it looks like. It would be your responsibility to move to ebXML, which is as yet unproven, in all its glory?

 

Dr. Hayes:  Many registries won’t necessarily have the ebXML information model as specified, but they’ll be able to use its services.

 

Dr. Hayes:  We can create views without any problems.

 

Unidentified participant:  Is there a cost?

 

Dr. Hayes:  No.

 

Mr. Royal:  Have you looked at UDEF? 

 

Dr. Hayes:  It’s a taxonomy they use to break down the assignment of object classes and element names. We’re looking at it. At UBL they don’t care what you name the element, but you can always link it together with the UID.

 

Mr. Bargmeyer:  Question  [unable to hear it].

 

Mr. Royal:  I’ve looked at it and I’m not sure that’s a good long-term solution.

 

Dr. Hayes:  I didn’t realize it was associated with UDEF.

 

Mr. Royal:  We’re starting to get too many of those UIDs (Universal Identification Numbers)…

Mr. Ambur:  Glenda said they've spent less than a million dollars. The XML Working Group doesn't have that much, but based on the CIO Council's funding allocations, the registry is their top de facto priority.

Dr. Hayes:  DoD had offered the software and they’ve offered it to the CIO Council. It would not be a million.

 

Break

 

Mr. John Clark introduced Ms. Lynn Hadden from the county of Fairfax for the next presentation.

 

Ms. Hadden:

 

Ms. Hadden:  The purpose of Government Without Boundaries (GwoB) is to provide an intranet where governments from all levels can collaborate and share knowledge on service offerings to citizens. Actual Internet access for citizens to these collaborative services would be provided at the various levels (i.e., local and state homepages and FirstGov).

 

Slide 3:  Within this framework services are offered in channels or communities of interest. Within Fairfax County we’re a microcosm of the Federal Government. We deal with everything the Federal Government deals with, and will use a similar framework internally. Today I will discuss deliverables within a given channel—Parks and Recreation.

Mr. Ambur:  The Recreation One-Stop, at recreation.gov, is another of the projects prioritized for rapid enhancement under the Administration's eGov Strategy.

Slide 4:  We needed to determine our process. We met with state, federal, and county technical people, and identified the data that needed to be exchanged. We created a taxonomy, not a hierarchical one, which we refer to as an Information Glossary.

The first deliverable within the Parks Channel was the Information Glossary.

 

Slide 6:  At that meeting, we defined these items as the entities we needed to describe.

 

Slide 7:  For example, to exchange location data on Park, we created the Location Entity. It has the following attributes: a Name, a type, and a URL. We then defined each of those attributes. We originally wanted to develop XML objects associated with each entity, but in order to facilitate delivery, we created one large schema.
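
[Editor's note: The Location Entity described above, with a name, a type, and a URL, could be modelled and serialized roughly as in the Python sketch below. The element names and the sample values are assumptions; the actual schema is not reproduced in these minutes.]

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class Location:
    """The three attributes named in the minutes: a name, a type, and a URL."""
    name: str
    type: str
    url: str

    def to_xml(self):
        # Element names are assumptions; the actual schema is not shown in the minutes.
        root = ET.Element("Location")
        ET.SubElement(root, "Name").text = self.name
        ET.SubElement(root, "Type").text = self.type
        ET.SubElement(root, "URL").text = self.url
        return ET.tostring(root, encoding="unicode")


# Hypothetical sample record.
loc = Location("Example Park", "park", "http://example.org/park")
```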

 

Slide 8:  From a visual perspective, the data from New Jersey, Virginia, and Fairfax County can be described as three overlapping circles. The intersection of those circles defines the application schema.
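
[Editor's note: The "three overlapping circles" can be read as a set intersection: the shared application schema contains only the fields that every jurisdiction can supply. The Python sketch below illustrates this reading. The field names are invented; the jurisdictions' real data models are not described in these minutes.]

```python
# Hypothetical field names; the jurisdictions' real data models are not in the minutes.
NEW_JERSEY = {"name", "type", "url", "county", "fee"}
VIRGINIA = {"name", "type", "url", "region"}
FAIRFAX_COUNTY = {"name", "type", "url", "district", "fee"}


def shared_fields(*field_sets):
    """Return the fields present in every jurisdiction's data, sorted by name."""
    return sorted(set.intersection(*map(set, field_sets)))
```

With these invented inputs, a field such as "fee" is left out of the shared schema because one jurisdiction does not supply it, even though two of the three do.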

 

Slide 10:  For the Calendar of Events Applications, here is the schema. This is a complex type. This schema definition came from the shared information meeting.

 

Mr. Royal:  How do you save the schema? Who showed you how?

 

Ms. Hadden:  You select File-Save As from the File menu within the browser. We used the Microsoft Parser (the Microsoft MSXML XML Parser), which can be installed with IE 6.0. It validated “OK.”

 

You can comment on the schema. We got this on a tip from Owen. It’s called “QuickTopic.” It lets you post schemas and receive comments.

 

Ms. Hadden demonstrated the procedure to create a comment.

This aligns with Glenda’s concept of “enterprise domain.” It brings it up to a higher level.

Mr. Ambur:  Lynn used the free version of QuickTopic Document Review.  The developer makes it available for free in the hope that users will want to purchase the commercial product in order to take advantage of its added functionality.

Ms. Hadden drew the group’s attention to a comment by Mr. Royal, and then her response, on the QuickTopic page.

 

Mr. Ambur:  What I like about the QuickTopic document paradigm is that it treats comments as attachments to documents, rather than treating documents as attachments to messages, which is the E-mail paradigm. I'm looking forward to seeing XML used natively in such an application.

Slide 13:  Here’s a slide giving guidelines for use of the schema. It discusses how Fairfax County used the schema.

 

The Commonwealth of Virginia needs to show how they’re doing it. We are also waiting for the State of NJ to follow up.

 

A driving theme for GwoB should be sharing knowledge. We should be participants in groups like xml.gov. This schema should be in the xml.gov site. Our role is dealing with facilitation. This schema was developed as proof-of-concept.

Mr. Ambur:  The initial concept of the Administration's eGov Action Plan was to make government IT services citizen centered, rather than system centric or based upon bureaucratic boundaries. Citizens don't care what part of government provides the services they need. GSA's “Government Without Boundaries” effort is very important toward meeting the objectives of the Administration's eGov Strategy.

Ms. Hadden:  We try to approach it as if there’s no one approach. We want to make it so that any group of people can access the service from multiple sites.

 

The last deliverable within a channel is a registry of Web services. There are a lot of issues around UDDI. It’s basically a private sector effort. Government jurisdictions will need to collaboratively decide if they want to create a Government UDDI Registry. So for now, we have simply listed the Web Service, the originating jurisdiction, and a brief explanation of what the service offers.

We’d like NJ to dynamically use their information. I don’t think it’s happening yet.

 

Mr. Ambur:  Perhaps we could use your schema as a case study on how to review and comment on submissions to the registry. Marion had some questions about your draft schema.  Did you want to address your comments to him?

 

Ms. Hadden:  We used the Microsoft Parser. Marion, what do you use that prompted your question on the site?

 

Mr. Royal:  XMLspy.

 

Mr. Ambur mentioned that the Parks and Recreation Channel appeared to be an excellent place to implement voiceXML.

 

Ms. Hadden:  We have an interactive voice response system. We want to try to make that happen.

 

Mr. Ambur:  Brand Niemann has said that he's almost embarrassed to say how little time and effort it took to develop EPA's Voice XML application.  The cost of adding Voice XML is very low.  The cost is in gathering and maintaining the data.  Once you've done that, the cost of making the data available by different means is trivial, using XML.

Ms. Hadden:  What’s the forum for discussion and input at xml.gov? How do we create versions and disseminate to state and local entities?

 

Mr. Ambur:  Marion has scanned Lynn’s schemas. Is the group interested in looking at them?

 

Ms. Hadden handed out copies.

 

Mr. Smith:  One thing I noticed is the underscore in the element name. Some parsers would kick that out.
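Editor's note: The XML 1.0 Recommendation permits the underscore in element names, including as the first character, so a conforming parser should accept such names. The sketch below checks this with the parser in Python's standard library; the element names are hypothetical, and the result says nothing about the specific tools that prompted Mr. Smith's remark.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names containing underscores, for illustration only.
document = "<Park_Event><Event_Name>Spring Hike</Event_Name></Park_Event>"

# fromstring raises ParseError if the document is not well-formed.
root = ET.fromstring(document)
print(root.tag, root[0].tag)
```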

 

Mr. Jacobs:  One of the reasons for the Navy’s Developer’s Guide is that there’s some incompatibility somewhere between systems. I don’t know whether it’s parsers.

 

Ms. Hadden:  I have a question about the overall structure—which is the best approach? If you use XML objects (derived from XML elements), people can create different applications and schemas using those XML objects.

 

Ms. Carnahan:  Glenda probably has experience in people registering things as elements instead of schema. Do you find the reuse of elements vs. schemas any different, and useful?

 

Dr. Hayes:  What we saw was significant schema development. The developers are screaming for guidance. It’s the chicken and the egg. We harvested from other data formats, the authoritative sources, using their metadata….

 

Unidentified participant:  We’re getting our reuse in the tag names. Our approach was that some guidance was better than no guidance.

 

Dr. Hayes:  You didn’t impose any constraints…?

 

Ms. Hadden:  This was a “make it happen” scenario. If anyone would like to comment on the schema I’d appreciate it. It’s the first one I’ve created.

 

Mr. Royal:  A lot depends on the application—what you want to do. This is a specific schema, but say I want to develop a common schema for acquisition systems, then you want to take more time and think about whether it’s appropriate to make a flat schema for reused items, or do you want to nest it. That’s where we’re heading, and reusing  it between different branches of the hierarchy.

 

Ms. Carnahan:  Your addresses are used many times over—and contact information. Maybe that contact information is used in other contexts, and maybe the events themselves—although Michael if this works, go for it. In a perfect world they’d be broken out so you’d have that reuse.

 

Dr. Hayes:  There’s a high degree of replication on the next page (i.e. the Parks Facility Schema) so you could use it…

 

Ms. Hadden:  This is within the Parks community of interest, but address, etc. is much broader…Is xml.gov going to look at creating the Enterprise Unions described in the DoD presentation?

 

Ms. Carnahan:  In the proof-of-concept, the idea is to get those people together. I don’t think xml.gov would define that union—it’s more for them to get the people together to then define it. 

 

Dr. Hayes:  Union is a feature within XMLSchema?

Ms. Carnahan:  If there are different efforts where they all define common terms, then you choose one that all of them use.

 

Ms. Hadden:  Was it in the xml.gov vision to include state and local?

Mr. Ambur:  Yes. Lee Holcomb said that he very much wants to include State agency submissions in the registry, and presumably that includes local governments as well. It's more an issue of when we will have something worth using. The policy we agreed on at the first registry/repository team meeting is that anyone can submit anything. I'm a little concerned that we might be overwhelmed with submissions.

Mr. Bargmeyer:  EPA is working with states on XML submissions.

 

Ms. Hadden:  Have you thought about the process as far as facilitating review of this?

 

Ms. Carnahan:  It’s part of the proof-of-concept.

 

Unidentified participant:  It’s more an issue of getting the facilities people who register items together, and maybe down the road look at DoD’s model. If there are already established communities of interest, someone can say, “I’m the one who does it.”

Mr. Ambur:  I'm concerned about the process. If people expect us to conduct a thorough review of their submissions and identify others that may be related, it could be a problem. We don't have the resources to do that. Also, we don't want to make it difficult for folks to register their XML elements and schemas because the result may be to cause them not to use the registry.

Ms. Hadden:  How do you get schema inputs from people? It’s critical for collaboration when participants are geographically dispersed.

Mr. Ambur:  Besides the registry itself, as we upgrade the xml.gov site, I'm interested in providing ancillary features that can support the registry.

Ms. Hadden: Is there any collaborative application development and versioning work going on?

 

Ms. Carnahan:  There’s a tool…

 

Mr. Jacobs:  What are the permissions that are involved?

 

Mr. Ambur:  The person who sets up the QuickTopic discussion registers. Anyone else can comment, with no authentication of their identity.  I’ve been casting about for a native XML equivalent, where you can elect to display the comments in the text or not. I know that people are working on such applications but I don't know if any of them are ready for use in an operational system.

 

Dr. Hayes:  There’s a limit to the comments.

 

Ms. Theresa Yee:  You got the free version?

 

Ms. Hadden:  This is the free version—no ads

 

Mr. Royal:  Steven is doing this for free as a demonstration. The marketing goal is to sell it to corporations who want to use it on an enterprise-wide scale.

 

Mr. Williams:  Is there no collaborative effort for DoD XML registration?

 

Dr. Hayes:  There’s currently no capability.

 

Mr. Jacobs:  It’s really a manual effort with the namespace managers.

 

Dr. Hayes:  It’s strictly a visibility issue. There’s a presumption that collaboration has impaired development.

 

Unidentified participant:  Is there a final version of the schema?

 

Dr. Hayes:  It’s important to recognize that DoD is incrementally increasing its capability.

 

Mr. Williams:  We incorporated collaboration as a development tool across the Internet, so parties can comment on proposed changes. I wonder if that might be a better tool.

 

Dr. Hayes:  We initially looked at [missed] and extensibility. We were hoping they could provide a collaboration feature for communities of interest, and publish through our registry. It hasn’t happened yet.

 

Mr. Williams:  I had assumed that some of these registries would have this capability. Mine is proprietary. If there’s interest, I’ll try to look at working with people for collaboration.

 

Mr. Williams was asked to describe the arrangement of his setup.

 

Mr. Williams:  It’s primarily a people issue. The technology is there. All the participants in our group were geographically diverse, so we needed the Web capability. If there’s any interest, I’ll be happy to work with anyone.

 

Dr. Hayes:  The pay model of getting standards isn’t going to work. We’re going to put ours out there to let people see them…things like “country.” Lynn’s example was “USA”,  which doesn’t conform to FIPS code, so how do you get agreement?
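Editor's note: Dr. Hayes's point about “USA” can be illustrated with a simple code-list check. In FIPS 10-4 the code for the United States is “US”, so a value of “USA” would be rejected by a schema that restricts country to that list. The three-entry excerpt and the function name check_country are assumptions for illustration, not part of any schema discussed at the meeting.

```python
# Small excerpt of FIPS 10-4 country codes, for illustration only.
FIPS_10_4_EXCERPT = {"US": "United States", "CA": "Canada", "MX": "Mexico"}


def check_country(value: str) -> bool:
    """Return True if the value is one of the codes in the excerpt."""
    return value in FIPS_10_4_EXCERPT


for candidate in ("USA", "US"):
    print(candidate, check_country(candidate))
```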

 

Ms. Hadden:  We at state and local government want to collaborate with you.

Mr. Ambur:  I can't speak for ISO, but ANSI has initiated a joint effort with OASIS to specify the metadata elements required to enable folks at least to identify and find out how to obtain the standards that are relevant to their needs and interests.

Mr. Bargmeyer:  Many of us have tried to get the values out of the ISO standards and make them available.

 

Dr. Hayes:  Make them available in XML format, so there’s no interpretation.

 

Mr. Bargmeyer:  Manage it in the 11179 registry and then put it into an XML schema, but that’s a terrible place to maintain them. If we can get these registries to work together, we can do configuration management to make sure the registries are up to date. We’ll have to struggle our way through.

 

Ms. Hadden:  We’re piloting an effort at the County. Things are happening so rapidly that people have to get it out there for comment.

 

Mr. Royal:  Hopefully in the future we will gather common sets of data we can agree on and share on the federal side. We have little hope of mandating what agencies will use, so the alternative is to establish precedents. [Missed] have agreed to identify the common elements and to do those with industry and the private sector correctly and forget about their own agency-specific considerations. Most will do what you did: create a database on paper forms or…In the long term we need to establish precedents with HR, online forms, and the Quicksilver initiative. The eGov task force initiatives: if we can agree on the proper way of expressing those things across agencies, then we get to where we have some good examples to show you.

 

Ms. Hadden:  Put your best practices out there as a draft, and there may be a better way suggested. It’s not a revolution, but an evolution. People see commonalities and begin to say “Why not use this one?”

 

Ms. Dena Cocos:  Can you tell us what is going on from an interagency perspective?

 

Mr. Royal:  HR data network, eGov task force initiatives. We’re not examining data or identifying architecture, but we will be, so as we identify common data elements across agencies, we’ll publish it through this working group.

 

Ms. Cocos:  What should I recommend to my clients?

 

Mr. Royal:  I recommended ebXML until I found out there wasn’t one. We use work in progress from UBL and UN/CEFACT from OASIS. I think UBL has the greatest chance of success in the near term. They just sent out two things from the Naming and Design Rules committee. It tackles things that ebXML didn’t. UBL says, “right or wrong, we’re going to express it.”

 

Ms. Cocos:  If they don’t know what tags mean, what do we tell them? I can’t reuse it because I don’t know what the elements mean.

 

Mr. Royal:  Over the longer term, it will be from core components from UN/CEFACT.

Out of that you create the BIE (Business Information Entity). Those are your elements. UBL is creating the core component so you can create these elements.

 

Ms. Dena Cocos:  Do you have an idea of the time frame?

 

Mr. Royal:  The time frame for UN/CEFACT core components is 2 years for the architecture and 5 years for the rest. UBL’s is probably two years. We just published the order document. If that’s approved we can hit the acquisition documents quickly.

 

Ms. Hadden: I’m worried about the time lag.

 

Mr. Ambur:  The eGOV strategy expects 24 projects to be delivered in 18 months, so that’s a driving force.

Mr. Ambur:  If there are no other comments, the next presentation is by David Brown of IRS and Michel Biezunski of Coolheads Consulting.

 

Messrs. Brown and Biezunski:

 

Mr. David Brown:  It’s probably best for Michel to start out, then I’ll follow up.

 

Mr. Michel Biezunski:  As Owen said, I’m Michel Biezunski from Coolheads Consulting, and I’m here to give you an XTM update (XML Topic Maps). In July there was a presentation of XTM. I’m going to reintroduce it in case you didn’t get to see it. I’m going to show the model then talk about the relationship between XML and topic maps.

 

Slide 3:  Requirements. We don’t believe there will be a central repository of terms that are uniquely valid for everyone in the world. All previous attempts have failed. GenCode experienced the failure of central tag names when it was created, and that’s how SGML was originally derived. We think it will be a distributed approach.

 

Slide 4:  XML Topic Maps have been added to the ISO standard. The ISO standard has been made available for free by the Department of Energy.

 

Slide 5:  OASIS is working on “Published Subjects” to promote reliability for publishing what a subject means. The rest of this slide is the historical timeline, which I’ll skip.

 

Slide 6:  Paradigms. You take the sources of information and on top of that you externally apply topics showing what these things mean. The difference is that it’s outside. You can classify the types of topics you’re describing. It’s wider than what a relational database can do. Once you have the superimposed layer, you can map resources to it. Occurrences of a topic are resources (or fragments thereof) that are relevant to a given topic.

 

Slide 7:  Topic Map Constructs. We have Topics (several names), Associations, Occurrences, and Scopes. Scopes are made of several topics, each of which can have several names. Because many things in topic maps are topics, a topic map can be fully documented using the topic map features.
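Editor's note: The constructs listed on Slide 7 can be sketched as a small data model. The class and field names are assumptions for illustration and are not taken from the XTM specification; note that a scope, and the type of an association, are themselves topics, which is why a topic map can document itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Topic:
    topic_id: str
    names: List[str] = field(default_factory=list)  # a topic may have several names
    occurrences: List["Occurrence"] = field(default_factory=list)


@dataclass
class Occurrence:
    resource: str  # address of a resource relevant to the topic
    scope: List[Topic] = field(default_factory=list)  # a scope is made of topics


@dataclass
class Association:
    association_type: Topic  # the kind of relationship is itself a topic
    members: List[Topic]
    scope: List[Topic] = field(default_factory=list)


# Hypothetical topics and resource addresses.
english = Topic("lang-en", names=["English"])
deductions = Topic("deductions", names=["Deductions", "Itemized deductions"])
credits = Topic("credits", names=["Credits"])
deductions.occurrences.append(Occurrence("publication.html#deductions", scope=[english]))
related = Topic("related-to", names=["is related to"])
link = Association(related, members=[deductions, credits])
```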

 

Slide 8:  Subject Identities.

 

Slide 9:  Something New for the Semantic Web. There are two kinds of subject: a resource can indicate the subject or be the subject. On the Web, every URL is considered the subject. This is important. 

 

Slide 10:  Current developments.

 

Slide 11:  The overlap with W3C’s RDF is an issue we are working on right now. RDF is similar, but topic maps are higher-level languages. It’s similar to Visual Basic versus Assembly Language. RDF is closer to Assembly Language, so every RDF project looks different. With topic maps, you know the object you’re talking about, which is something good for indexing information. You don’t need a programmer to explain it to them. Both levels of language are necessary.

 

We use the comparison of the way the atom is viewed in physics and chemistry. We need both points of view for a better understanding. We finally found how to express topic maps as RDF, so we can add them to existing topic maps. We’re just starting that now.

 

Slide 12:  (Skipped).

 

Slide 13:  XML and Topic Maps. Topic maps are expressed in XML, but it’s a different view. In XML, you can’t get a different view; with topic maps, you can apply as many schemas or perspectives as you like, even on the same XML. Topic maps simplify the use of XML.

 

Mr. Williams:  What is the mechanism used for linking in Topic Maps?

 

Mr. Biezunski:  Simple XLink. We have three types of reference elements that use simple XLinks. A reference to a topic is one type, a reference to a subject indicator is another, and a reference to the subject itself is a third.
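Editor's note: The minutes do not name the three reference elements. In XTM 1.0 they are, to the editor's understanding, topicRef, subjectIndicatorRef, and resourceRef, each carrying an xlink:href attribute. The sketch below parses a hand-written fragment and lists the references it finds; the topic content and all addresses are hypothetical.

```python
import xml.etree.ElementTree as ET

XTM = "http://www.topicmaps.org/xtm/1.0/"
XLINK = "http://www.w3.org/1999/xlink"

# Hand-written fragment for illustration; not taken from any IRS topic map.
SAMPLE = """
<topic id="deductions"
       xmlns="http://www.topicmaps.org/xtm/1.0/"
       xmlns:xlink="http://www.w3.org/1999/xlink">
  <instanceOf><topicRef xlink:href="#tax-concept"/></instanceOf>
  <subjectIdentity>
    <subjectIndicatorRef xlink:href="http://example.org/psi/deductions"/>
  </subjectIdentity>
  <occurrence>
    <resourceRef xlink:href="http://example.org/publication.html#deductions"/>
  </occurrence>
</topic>
"""

root = ET.fromstring(SAMPLE)
href_attribute = "{" + XLINK + "}href"

# Collect (element name, target address) for every element that carries a link.
references = [
    (element.tag.replace("{" + XTM + "}", ""), element.get(href_attribute))
    for element in root.iter()
    if element.get(href_attribute) is not None
]
for name, target in references:
    print(name, target)
```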

 

Mr. Biezunski turned the discussion over to Mr. Brown.

 

Mr. Brown:  Good morning. I’m David Brown from the IRS Multimedia Publishing Division. We’ve been working with unstructured text for a long time. We developed the first Standard Generalized Markup Language (SGML) civilian application in 1986. The Department of Defense was working with SGML prior to 1986. An employee of IRS Publishing was on the SGML standards committee.

 

Currently we have 110 Taxpayer Information Publications, 175 Tax Forms Instructions, and the Internal Revenue Manual (an internal directives system) in SGML. We just completed an XML application for writing and managing frequently asked questions. We are currently developing an XML application for our Internal Revenue Bulletin.

 

Dr. Hayes:  Is it available in the Digital Daily?

 

Mr. Brown:  We have SGML files for Instructions and Publications available on our website as a download. We are filtering SGML to HTML for Web pages.

 

We have a core repository for published products. The standards for repositories are PDF for forms and printed representation, SGML and XML for text, and Encapsulated Postscript (EPS) for graphics.

Topic maps are a method for link management. Topic maps will allow us to add value to information within a repository. Topic maps are at a higher level, above the repository. With topic maps, we talk about the relationships between topics within and across documents.

 

With topic maps, we can create “finding” aids. We can use computers to process topic maps against the repository to create finding aids. As Michel mentioned, it allows for different views of the information based upon the way you access it.

 

Now I’ll show you the initial topic map prototype we did. We started it last summer with Michel. It uses our SGML taxpayer information publications. We took index items in the SGML files, and automatically built a topic map, so each index item became a topic. We processed the topic map against SGML files, and automatically created an HTML rendering. The initial prototype was eight publications out of 110. I have CD ROM samples with me if anyone would like one.
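Editor's note: The prototype workflow described above (index items become topics, and the topic map is processed into an HTML rendering) can be sketched roughly as follows. The index items are hypothetical stand-ins for entries extracted from the SGML files, and the function names are assumptions; the minutes do not describe the actual IRS tooling.

```python
from collections import defaultdict
from html import escape

# Hypothetical index items: (index term, source file, anchor).
index_items = [
    ("Deductions", "pub-a.html", "sec2"),
    ("Filing status", "pub-a.html", "sec1"),
    ("Deductions", "pub-b.html", "sec5"),
]


def build_topic_map(items):
    """Turn each index term into a topic and collect its occurrences."""
    topics = defaultdict(list)
    for term, source, anchor in items:
        topics[term].append(f"{source}#{anchor}")
    return dict(topics)


def render_html(topic_map):
    """Render the topics as a plain HTML list linking to every occurrence."""
    lines = ["<ul>"]
    for term in sorted(topic_map):
        links = ", ".join(
            f'<a href="{escape(target)}">{escape(target)}</a>'
            for target in topic_map[term]
        )
        lines.append(f"  <li>{escape(term)}: {links}</li>")
    lines.append("</ul>")
    return "\n".join(lines)


topic_map = build_topic_map(index_items)
print(render_html(topic_map))
```

Because the same term indexed in two publications ends up as one topic with two occurrences, a reader can move directly from one publication to the other through the topic.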

 

Mr. Brown  displayed a topic map prototype slide demonstrating the different links one would click during the process of navigating.

 

What’s next is that we plan to expand the prototype to all 110 publications, and add some additional products, because we can link between different product types, such as XML, Word files, and ASCII files. We want to target our “800” Call Centers. Right now they have the taxpayer information files, but they’re using full-text search. We think they’ll save a lot of time by researching by topic.

 

We also hope to add a FAQ product. Right now on the website, people can send in tax law questions. We have a pool of people who research them and respond. For the FAQ project, we used an email analysis package that clustered emails according to category, and came up with 800 FAQs out of the quarter-million questions. We also came up with the 250 words most commonly used by taxpayers.

 

Mr. Royal:  I have two questions:

 

1.      If I would like a copy of the 250 words, where can I get one?

2.      If I would like the definitions, where do I get them?

 

Mr. Brown:  I can get you a copy of the 250 words. The definitions are something we’re examining, because we have to get the associations between topics. As part of the expanded prototype, we want to look at the associations. A thesaurus is an example of a finding aid that would be based on associations. It could have definitions, broader categories, and more specific ones.

 

Mr. Biezunski:  You don’t have to use a hierarchy, but you can have any kind of relationship you want. That’s what we’re going to do. Make a model based on this ontology to improve navigation.

 

Mr. Brown:  We see our role as processing the maps. We’re assembling a group of tax law experts to do the associations and the knowledge work to create the maps.

 

Mr. Royal:  You can do some of it, but then someone has to say “that’s relevant” or “that’s not relevant.”

 

Ms. Hadden:  Search engines like Google and Altavista—they find your HTML online index—will they still find all the other documents you’ve incorporated into the index?

 

Mr. Biezunski:  I worked for one in France that had an 80,000-word index, so you can have an index search prepared by others, and you can also get that as something to feed to a Web search engine.

 

Ms. Hadden:  If you don’t do inline tagging, you’d want to funnel everyone through the topic map?

 

Mr. Biezunski:  It’s just plain HTML. There’s nothing special about it. It’s browsed like any other set of HTML documents. There’s no technology involved at the end other than regular Web browsers.

 

Mr. Brown:  The topic map is a separate layer of technology on top of the information.

 

Ms. Hadden:  There’s a different entry point…

 

Mr. Brown:  It can be. We’re using topic maps to process files and then show them in HTML.

 

Ms. Hadden:  How would something like this reduce all the hits that, say, Fairfax County would have?

 

Mr. Biezunski:  We’ll have an aggregation of subjects. If people are using the published subject, you know you’ll be connected with all the other people involved with that. At least you’ll have semantic convergence, because they’ve accepted the definition. It’ll be better for the end technologies to use that rather than a string search.

 

Mr. Ambur:  One of the requirements for the upgrade of FirstGov is to be able to do a search on metadata. I view Web pages comprised of hyperlinks as topic maps.  Many different sets of hyperlinks can be provided referencing the same records, based upon different contexts in which each set is meaningful.

 

Mr. Biezunski:  Topic maps are like XML—they don’t tell you how to model your information, but rather provide you with neutral placeholders.

 

Mr. Brown displayed a tool using keyword search, and mentioned that you can have new words added.

 

Mr. Brown:  For our expanded prototype we plan to do a topic map for FAQs and a topic map for Publications, then merge them, look at the relationships, and create a thesaurus and other finding aids. Call Centers will be an initial part of the effort. Our next plan is to pitch it to all Call Center people. Then we’ll do the maps for the 2001 collection of Publications and FAQs and prototype it at the Call Center over the summer. Hopefully next year we’ll roll it out as a system they can use alongside their other research tools.
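Editor's note: One way to picture merging an FAQ topic map with a Publications topic map is to combine topics that share the same subject identifier. This is the editor's simplified reading of topic map merging, not a description of the IRS process; the dictionary layout, identifiers, and addresses are hypothetical.

```python
def merge_topic_maps(first, second):
    """Merge two maps keyed by subject identifier, pooling names and occurrences."""
    merged = {}
    for topic_map in (first, second):
        for subject, topic in topic_map.items():
            target = merged.setdefault(subject, {"names": set(), "occurrences": []})
            target["names"] |= set(topic["names"])
            for occurrence in topic["occurrences"]:
                if occurrence not in target["occurrences"]:
                    target["occurrences"].append(occurrence)
    return merged


# Hypothetical maps; each key stands for a published subject identifier.
faq_map = {
    "http://example.org/psi/deductions": {
        "names": {"Deductions"},
        "occurrences": ["faq.html#q12"],
    },
}
publication_map = {
    "http://example.org/psi/deductions": {
        "names": {"Itemized deductions"},
        "occurrences": ["pub-a.html#sec2"],
    },
    "http://example.org/psi/filing-status": {
        "names": {"Filing status"},
        "occurrences": ["pub-a.html#sec1"],
    },
}

merged = merge_topic_maps(faq_map, publication_map)
for subject in sorted(merged):
    print(subject, sorted(merged[subject]["names"]), merged[subject]["occurrences"])
```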

Mr. Ambur:  The Social Security Administration made a presentation on XLinking at our January meeting.  They have similar requirements, and if you haven't had any contact with them, I would encourage it.

Mr. Brown:  Downstream, I’d like to show the prototype to the Social Security Administration and see if they are interested in developing a topic map for their FAQs and look at relationships across agencies.

 

Dr. Hayes:  Have you looked at examples like “I need to file a 1040,” and figuring out the other dependencies among schedules, rather than downloading one, processing it, and then seeing what all the potential dependencies are?

 

Mr. Brown:  I think you’re saying a kind of a “yes or no” path—yes, it looks as if topic maps can help with that.

 

Mr. Biezunski:  When you have a topic with several occurrences, then you have the capability to navigate straight from one occurrence to another. It wasn’t there originally, but since people have used the same term several times it’s there, and we can use it to navigate directly between occurrences.

 

Dr. Hayes:  How is the fight with RDF?

 

Mr. Biezunski:  It’s not a fight. We’ve worked with them for two years, and we have a very good relationship. There seems to be a feeling of overlap on the outside. We’re trying to develop a topic map of RDF schema. The underlying model is graphically based—they resolve to graphs as topic maps see them—so it’s very similar to RDF.

There is a need to express the rules that constrain the dependencies between constructs. We can use DAML + OIL, for example, to express that. The difference is that topic maps are like XML—a description—where we’ve purposely excluded processes.

Mr. Ambur:  By the way, Ken Thibodeau made a presentation to the XML Working Group concerning NARA's big Electronic Records Archive project, in which topic maps have been identified as a critical component. If there are no other comments, thank you all.  [Editor's note:  Ken Thibodeau's presentation is available at http://xml.gov/presentations/nara/era.htm]

End meeting.

 

Last Name    First Name    Organization
Ambur        Owen          Interior-FWS
Biezunski    Michel        Coolheads Consulting
Billups      Prince        DISA
Boyle        Carrie        LMI
Brown        David         IRS
Carnahan     Lisa          NIST
Clark        John          GSA
Cocos        Dena          LMI
Hadden       Lynn          Fairfax County
Hayes        Glenda        MITRE
Jacobs       Michael       DON CIO
Lewis        Diane         DOJ
Pittman      Ken           Pittman & Associates
Royal        Marion        GSA
Sioson       Roehl         US Senate
Smith        Jeffrey       DoD CIO/LMIT
Stanford     Brad          ONR
Vella        Chuck         State
Weber        Lisa          NARA
Williams     Kevin         Blue Oxide
Yee          Theresa       LMI