XML Working Group

Meeting Notes

October 15, 2003


Owen Ambur announced that at least three other meetings were occurring concurrently at which folks interested in XML might be in attendance. The Semantics Special Interest Group/Community of Practice was holding its inaugural, organizational meeting. The Content Management Working Group was hosting a forum on XML Topic Maps (XTM) at the Library of Congress, and the PKI Technical Steering Committee was also meeting. Owen also announced that Jim Whitehead, the “father” of WebDAV, and his graduate students are seeking comments on a revised draft XSD for the metadata elements specified in DoD Std. 5015.2 for records management applications. http://www.soe.ucsc.edu/~dgordon/ERM/erm.xsd


Marion Royal announced that GSA has awarded a contract to SAIC to develop a consolidated repository of reusable components. The project has four phases, but only the first is mandatory; the others depend upon the amount of funding provided by Congress. The four phases are:

 

1) A collaborative environment for the development of components

2) Integration of auxiliary tools, including eForms development tools

3) Federation with other repositories, including UDDI & ebXML registries

4) Operation & maintenance of the Consolidated Component Repository


Ken Sall asked when it would be appropriate to use the new repository, such as for XML components, including XML schemas. Marion responded that, if the component is XML, for now it would be appropriate to use the proof-of-concept registry hosted by NIST. http://xml.gov/registries.asp


Owen Ambur asked about a related contract for the Business Gateway, which will include an eForms portal backed by an XML registry. Lee Ellis responded that SBA has the lead for the project, which will include an eForms catalogue and will incorporate the FedForms site. http://www.fedforms.gov/  Marion indicated the eForms/XML registry contract has been awarded to Sytel. http://www.sytel.com/


Ken asked what software will be used for the Consolidated Component Repository, and Marion responded that SAIC proposed the use of CollabNet. http://www.collab.net/


Via teleconference, Bert Sheingate introduced HyperVision’s presentation of a draft XML schema (XSD) for strategic plans required by the Government Performance and Results Act. Assisted by Jon Barrett of Hummingbird, Pradeep Jain conducted the presentation remotely via WebEx and teleconference. Pradeep displayed USGS’s strategic plan, in PDF format, and he pointed out that crafting an XSD requires understanding the objectives of the document. He noted there are both formatting and semantic elements, and that only business experts know which semantic elements to include.


Owen noted that his objective for HyperVision’s presentation is to solicit input from others, particularly the Office of Management and Budget (OMB), to help perfect the draft XSD for use by all federal Executive Branch agencies. However, he also expressed the hope the XSD would be generic enough to be used by all organizations, worldwide. He suggested that unless and until technologies like XLink, XPath, and XPointer are used, all of the talk of “strategic alignment” is just that – talk.


Pradeep showed that HyperVision had converted the National Park Service’s strategic plan from PDF to XML, but noted that there are difficulties associated with converting PDF. He indicated HyperVision does its XML markup in Microsoft Word.


Bruce Cox asked if Pradeep could show how to fix the problems with the document he displayed, but Pradeep indicated that fixing poorly structured or unstructured documents is not a simple matter that can be automated or easily done by manual means. It is possible but takes some effort as well as knowledge of XML. Owen suggested that converting important documents like GPRA plans to XML seemed like a good role for the Government Printing Office (GPO), not only so that they can be posted in intelligent format on the agencies' own Web sites but also so that agencies like GPO, OMB, and GAO can: a) compile a comprehensive strategic plan for the entire Executive Branch, and b) readily link strategic goals and objectives to the budget process and the performance reporting process.


Ken asked what HyperVision’s WorX plug-in does that Microsoft Word 2003 does not. Bert displayed the left-side pane that WorX provides to complement the right-side pane that Word now provides to support XML markup. Jon also noted there are two different functions that HyperVision is addressing – automated conversion and markup versus manual tagging of documents. HyperVision’s business model is aimed particularly at automatically converting Word documents that are well structured, such as journal articles, but for which an XSD has not previously been supplied. Once the XSD has been specified, they can convert Word documents to XML and assist with fixing markup problems associated with those documents. Pradeep showed how that is done, using a journal article.


Jay Di Silvestri showed a customized version of Corel’s XMetaL application to assist users in creating XML instance documents that will validate against the XSD for Stage 1 of the emerging technologies life-cycle management process. Because the XSD is fairly simple, there was little he could show with respect to how XMetaL assists users in producing more complex documents.


Since the Stage 1 instance document that was drafted by Mamoon Yunus of ForumSystems does not conform to the latest version of the XSD and since that XSD itself is not yet in final form, Jay was unable to show how such an instance document might appear to the average user who does not care to look at the XML tags. However, he did show a simple form that could easily be used to produce a valid instance document, and he showed a validation/lookup/pull-down box containing the controlled vocabulary for the ComponentType element (software, hardware, or data).
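
As a rough illustration of the controlled-vocabulary check described above, the following Python sketch flags ComponentType values outside the list. Only the ComponentType element name and its three values come from the discussion; the Identification root element, the lowercase spelling of the values, and the function name are assumptions made for the example, not part of the draft XSD.

```python
import xml.etree.ElementTree as ET

# Controlled vocabulary reported in the notes; lowercase spelling is an assumption.
COMPONENT_TYPES = {"software", "hardware", "data"}


def invalid_component_types(xml_text):
    """Return the ComponentType values that fall outside the vocabulary."""
    root = ET.fromstring(xml_text)
    bad = []
    for elem in root.iter("ComponentType"):
        value = (elem.text or "").strip()
        if value not in COMPONENT_TYPES:
            bad.append(value)
    return bad


# Hypothetical instance document; the Identification root is illustrative only.
sample = "<Identification><ComponentType>software</ComponentType></Identification>"
print(invalid_component_types(sample))
```

An editing tool could run a check like this before saving, which is roughly what a pull-down box enforces at entry time.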


Finally, Jay noted that maintaining volatile data values in a database can help to reduce the change management issues associated with enumerated list values. However, others pointed out there are complexities associated with that strategy as well, such as gaining access to the database when creating/editing documents.
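
The database approach might look something like the Python sketch below, which keeps the allowed values in a lookup table instead of an enumerated list inside the schema. The table name, column names, and function name are hypothetical, and an in-memory SQLite database stands in for whatever shared store an agency would actually use.

```python
import sqlite3


def load_vocabulary(conn, element_name):
    """Fetch the allowed values for one element from the lookup table."""
    rows = conn.execute(
        "SELECT value FROM vocabulary WHERE element = ? ORDER BY value",
        (element_name,),
    )
    return [row[0] for row in rows]


# In-memory database standing in for a shared vocabulary store (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vocabulary (element TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO vocabulary VALUES (?, ?)",
    [
        ("ComponentType", "software"),
        ("ComponentType", "hardware"),
        ("ComponentType", "data"),
    ],
)

# Adding a value later means inserting a row, not revising the schema.
print(load_vocabulary(conn, "ComponentType"))
```

The trade-off raised in the discussion shows up directly here: the editing tool needs a working connection to the database whenever a document is created or edited.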


The remainder and primary focus of the meeting was on achieving substantial consensus on the elements of Stage 2, Subscription, of the emerging technology life-cycle management process. However, the group also revisited elements of Stage 1, Identification, and agreed to replace the CostAndEfficiency element with one more generically named “ComponentBenefit”. Iqbal Talib suggested the need for multiple “labels” and descriptions of the benefits of proposed components, to facilitate faceted searching. Jon Barrett noted that “feature” and “benefit” are common elements of all vendor marketing materials, and Ken suggested it might be beneficial to include an element named “ProblemSolved,” perhaps with child elements. However, Bruce Cox argued for keeping the XSD simple and generic, and there seemed to be general consensus to do so.


Although the Open Applications Group (OAGi) submitted schema fragments that may be candidates for reuse for the Contact elements of Stage 2, they do not yet conform to the design instructions contained in the XML Developers Guide and thus were not considered for inclusion at the meeting. There seemed to be general agreement on the concepts of Interest and Commitment as elements of the Subscription stage of the process, and no objection was voiced to establishing the allowable Levels of each as High, Medium, and Low, with respective numeric values of 3, 2 and 1. Since no one has yet volunteered to draft the XSD representing the elements of Stage 2, none has been produced for the Interest and Commitment elements. However, Betty Harvey previously submitted a draft XSD that may be appropriate for use with respect to the Contact elements, at least for vendors (as opposed to .gov folks). http://www.eccnet.com/E-GovRegister/register2.xml
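
The High/Medium/Low levels and their numeric values of 3, 2, and 1 could be represented along the lines of the Python sketch below. The class and function names are assumptions for illustration, and accepting either a label or a number is a design choice of the sketch, not something the group decided.

```python
from enum import IntEnum


class Level(IntEnum):
    """Allowable levels for Interest and Commitment, per the discussion."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def parse_level(text):
    """Accept either a label ("High") or a numeric string ("3")."""
    text = text.strip()
    if text.isdigit():
        return Level(int(text))  # raises ValueError for numbers outside 1-3
    return Level[text.upper()]   # raises KeyError for unknown labels


print(parse_level("High").value, parse_level("2").name)
```

Using numeric values makes the levels comparable and sortable, so that, for example, subscriptions could be ranked by level of commitment.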


Following is a summary of the results of the discussion as understood and documented by Ken:


DTD Symbols Used

? = zero or one

+ = one or more

* = zero or more

no mark = exactly one

# = comment


Subscription # root

                         ComponentName

                           Benefit?

                           WebAddress


                         Interest+

                           Level # value: 1|2|3

                           Description?

                           WebAddress


                         Commitment

                           Level # value: 1|2|3

                           Description?

                           WebAddress


                         Contact+

                           PersonName

                           Expertise

                           Level

                           Description

                           WebAddress # Owen, I think the second WebAddress isn't needed?

                           ElectronicMailAddress?

                           VoiceTelephoneNumber

                           OrganizationName
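

To make the outline above concrete, the Python sketch below checks how many times each child element occurs in a Subscription instance document. It is only one reading of the outline: the nesting is inferred from the indentation, the children of Contact are treated as a flat list, element order and the 1|2|3 range of Level are not checked, and the sample document with its example.com addresses is entirely hypothetical.

```python
import xml.etree.ElementTree as ET

# (child name, minimum count, maximum count or None for unbounded),
# transcribed from the outline; the nesting is an assumption.
MODEL = {
    "Subscription": [("ComponentName", 1, 1), ("Interest", 1, None),
                     ("Commitment", 1, 1), ("Contact", 1, None)],
    "ComponentName": [("Benefit", 0, 1), ("WebAddress", 1, 1)],
    "Interest": [("Level", 1, 1), ("Description", 0, 1), ("WebAddress", 1, 1)],
    "Commitment": [("Level", 1, 1), ("Description", 0, 1), ("WebAddress", 1, 1)],
    "Contact": [("PersonName", 1, 1), ("Expertise", 1, 1), ("Level", 1, 1),
                ("Description", 1, 1), ("WebAddress", 1, 1),
                ("ElectronicMailAddress", 0, 1), ("VoiceTelephoneNumber", 1, 1),
                ("OrganizationName", 1, 1)],
}


def check(elem, problems=None):
    """Collect occurrence-count violations for elem and its descendants."""
    problems = [] if problems is None else problems
    for name, low, high in MODEL.get(elem.tag, []):
        count = len(elem.findall(name))
        if count < low or (high is not None and count > high):
            problems.append(f"{elem.tag}/{name}: found {count}")
    for child in elem:
        check(child, problems)
    return problems


# Hypothetical instance; Commitment deliberately omits its WebAddress.
SAMPLE = """
<Subscription>
  <ComponentName><WebAddress>http://www.example.com/c</WebAddress></ComponentName>
  <Interest><Level>3</Level><WebAddress>http://www.example.com/i</WebAddress></Interest>
  <Commitment><Level>2</Level></Commitment>
  <Contact>
    <PersonName>Pat Doe</PersonName><Expertise>XML</Expertise><Level>2</Level>
    <Description>Schema design</Description>
    <WebAddress>http://www.example.com/p</WebAddress>
    <VoiceTelephoneNumber>555-0100</VoiceTelephoneNumber>
    <OrganizationName>Example Agency</OrganizationName>
  </Contact>
</Subscription>
"""

print(check(ET.fromstring(SAMPLE.strip())))
```

A real XSD would also enforce element order and data types; this sketch only shows how the ?, +, and * marks translate into minimum and maximum counts.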


If you have any corrections or additions to these notes, please contact Owen_Ambur@fws.gov


Those in physical attendance, for part or all of the meeting, included:


Owen Ambur, Co-Chair

Lee Ellis, Co-Chair, GSA

Marion Royal, former co-chair, GSA

Roy Morgan, NIST, head of our Registry Project Team

Liz Fong, NIST, assisting Roy on the Registry Team

Ken Sall, SiloSmashers

Jim Disbrow, DOE

Annie Barr, GSA

Joel Patterson, Software AG

Iqbal Talib, i411

Amin Hassam, i411

Bruce Cox, USPTO

Alice Marshall, Presto Vivace

Renee Lewis, Lockheed Martin for SSA

Jon Barrett, Hummingbird

Steve Kruba, Integic


Participating via teleconference were:


Bert Sheingate, HyperVision

Pradeep Jain, HyperVision

Carol Blackston, DOE

Jay Di Silvestri, Corel

(plus 3 or 4 others whose names and affiliations I missed)