XML Working Group

Wednesday, November 14, 2001

GSA headquarters Room 5141

 

Please send all comments or corrections to these minutes to Jessica Glace.

 

Working Group co-chair Owen Ambur convened the meeting at 9:00 a.m. at GSA Headquarters. Attendees introduced themselves.

 

Michael Jacobs, Department of the Navy CIO's Office, & Brian Hopkins, Logicon, presented the Navy XML Developer's Guide.

 

Mr. Jacobs opened the presentation by explaining the mission of the Department of the Navy (DON) XML Working Group. Their mission is to provide leadership and guidance to maximize the value and effectiveness of emerging XML component technologies implemented across the DON enterprise.

The group had several initial deliverables, including an Interim DON XML Policy document, an Initial XML Developer’s Guide, and an XML Primer (business and technical).

 

They have two goals. The first goal is to enable the use of XML in support of the DON IM/IT vision by identifying ‘best fit’ applications of this technology in DON applications and architectures. The second goal is to enable an effective and efficient management approach that ensures XML’s use adds to, rather than detracts from, the Department’s interoperability goals. The DON XML workgroup has 30-40 members and is divided into five sub-teams of 5-10 people each.

 

This presentation concentrates on the Initial XML Developer’s Guide (Mr. Ambur requested that Mr. Jacobs send him a copy of the presentation to post to xml.gov).

 

Mr. Hopkins took over the presentation from that point and explained that the Developer’s Guide is a living document. It is meant to enable a vast number of activities in a standard fashion. It provides initial generic guidance without being so specific as to restrain development efforts. It tries to address the question of how to get the Navy to work together as a whole. The DON Web Task Force, led by Mike Robinson, has chosen XML as the data format of choice, and the DON CIO signed the “initial” guidance.

 

Walt Houser asked if the Guide would be open to the public.

Yes, the guide is available to the public.

 

The Guide draws from various sources, including xFront’s Schema Best Practices, LMI Federal Tag Standards for XML, UML and RUP, ebXML, and the COE XML Registry. The Guide is 21 pages long and provides guidance, explanations, and examples. The Guide has adopted RFC 2119 terminology.

 

Mr. Houser asked whether the examples are normative.

Mr. Hopkins said the examples are not normative.

Mr. Houser stated that you tend to get into trouble if you make examples normative.

 

The question was posed as to whether the document addresses Schematron.

Mr. Hopkins said the Guide does not address Schematron.

 

The Guide states that all production applications MUST conform to W3C XML Recommendations with the exception of explicitly allowing SOAP 1.1 and SAX 1.0 and 2.0.  The Guide also states that use of proprietary extensions is only for “private” implementation and only with government program management approval.

 

Mr. Houser asked whether BEEP was evaluated. {Editor’s note: BEEP is the Blocks Extensible Exchange Protocol.}

Mr. Hopkins answered no. SOAP was adopted for its infrastructure because it’s mature and well enough supported in tools.

Mr. Ambur stated that Mr. Hopkins shouldn’t suggest that any government work is “private”.

Mr. Houser suggested that “local” might be a better term to use.

Ms. Glenda Hayes asked if this was related to style sheets.

Mr. Hopkins said no.

Ms. Hayes asked for an example.

Mr. Hopkins explained that, for example, an MSXML 3 extension to the DOM is OK locally; the downside is that you are then bound to your tools.

Ms. Hayes said this is independent of the schema; so, the proprietary extension is only for manipulation by the tools.

Mark Crawford said it could be in schema forms; Mr. Hopkins is talking about an extension of what’s allowed in W3C.

Mr. Jacobs clarified that the main point is that W3C should be used, but extensions MAY be used for special circumstances.

Mr. Jacobs agreed to take the suggestion of using “local” instead of “private”.

 

The Guide adopts the ebXML Case Convention of:

Elements in Upper Camel Case (UCC)

Attributes in Lower Camel Case (LCC) and

XML Schema Types in UCC.
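
The case convention above can be checked mechanically. The following is a minimal sketch, not taken from the Guide: the regular expressions, the helper name check_names, and the sample document are assumptions for illustration, and the patterns accept an all-caps acronym run inside a name.

```python
import re
import xml.etree.ElementTree as ET

# Assumed patterns (not from the Guide): letters and digits only,
# first character upper case for UCC, lower case for LCC.
UPPER_CAMEL = re.compile(r"^[A-Z][A-Za-z0-9]*$")
LOWER_CAMEL = re.compile(r"^[a-z][A-Za-z0-9]*$")


def local_name(tag):
    """Strip an ElementTree '{namespace}' prefix from a tag or attribute name."""
    return tag.rsplit("}", 1)[-1]


def check_names(xml_text):
    """Return a list of names that break the UCC/LCC convention."""
    problems = []
    for element in ET.fromstring(xml_text).iter():
        name = local_name(element.tag)
        if not UPPER_CAMEL.match(name):
            problems.append(f"element {name}")
        for attribute in element.attrib:
            attr_name = local_name(attribute)
            if not LOWER_CAMEL.match(attr_name):
                problems.append(f"attribute {attr_name}")
    return problems


# Hypothetical sample: one conforming element and one that breaks both rules.
SAMPLE = """
<PurchaseOrder orderNumber="17">
  <ship_to CountryCode="US">Norfolk</ship_to>
</PurchaseOrder>
"""

print(check_names(SAMPLE))
```

A check like this only covers the spelling of names; whether a name is a suitable business term still needs human review.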

 

The Guide encourages using W3C Schema (.XSD) over DTD; focusing on XML Schema Language has implications for interoperability.

 

The Guide adopts the ebXML convention of acronyms and abbreviations:

No abbreviations

Acronyms used sparingly at discretion of government program manager and

Acronyms in all CAPS and spelled out in element definition.

In the next edition of the Guide, DON plans to change this guidance to state that abbreviations and acronyms SHOULD NOT be used.

 

Mr. Crawford said this should enhance interoperability.

Ms. Hayes said BE number is an example where there is controversy over what exactly BE means.

Mr. Hopkins said the person writing the code should not have to make these definitions; technology is outpacing the development, and the business needs to keep up with the technology.

 

The Guide says developers must reuse COE registered components if suitable. If a COE component is not suitable, the element name should be based on ISO 11179 or a common business term.

 

Ms. Hayes asked if ISO 11179 adopted no spaces.

Mr. Hopkins clarified that DON has adopted ISO for semantics, but the format is the ebXML defined UCC and LCC.

Mr. Crawford added that they are removing the separators.

Mr. Royal asked if the Guide was using 11179 as modified by ebXML.

Mr. Hopkins answered that yes, the Guide adopted ebXML standards.
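
As an illustration of the exchange above, an ISO 11179 style name, written with separators between its terms, can be turned into an ebXML-style tag name by removing the separators and joining the words in camel case. This is a sketch under assumptions: the sample name "Person. Birth. Date" and both helper names are hypothetical, and the Guide's actual derivation rules may differ.

```python
import re


def to_upper_camel(iso_name):
    """Remove separators from a dotted or spaced name and join the words in UCC."""
    words = re.findall(r"[A-Za-z0-9]+", iso_name)
    return "".join(word[0].upper() + word[1:] for word in words)


def to_lower_camel(iso_name):
    """Same as to_upper_camel, but with a lower-case first letter (LCC)."""
    name = to_upper_camel(iso_name)
    return name[:1].lower() + name[1:]


print(to_upper_camel("Person. Birth. Date"))  # element or type name
print(to_lower_camel("Person. Birth. Date"))  # attribute name
```

Only the first letter of each word is changed, so an all-caps acronym inside a name is left as written.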

 

Mr. Hopkins reviewed an example UML model used to define a ComplexType.

 

Mr. Houser asked Mr. Hopkins his feelings on attributes.

Mr. Hopkins said they hope to follow along with the Dave Carlson book.

Ms. Hayes asked if naming conventions would have a simple or complex type with an element name.

Mr. Hopkins said that is not in this initial guidance; but that is the path we expect to follow.

Mr. Crawford said the department is supporting UML as a way to model the business process, but not for mechanical generation of XML from UML.

Mr. Hopkins said we have severe reservations about auto generation of XML.

 

The Guide took its concept from ebXML. The Guide discusses business terms, provides a list of synonyms for Core Component metadata, and acknowledges that different names exist and are commonly used for the same information entities. It also acknowledges the fact that XML Schema syntax provides a construct to handle Substitution Groups.
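
To illustrate the last point, the sketch below embeds a small XML Schema fragment in which two commonly used names are declared as substitutes for one head element, then lists the declared synonyms. The element names (Surname, LastName, FamilyName) and the helper synonyms_by_head are made up for illustration and are not from the Guide.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: LastName and FamilyName may appear wherever
# the head element Surname is allowed.
SCHEMA = f"""
<xsd:schema xmlns:xsd="{XSD_NS}">
  <xsd:element name="Surname" type="xsd:string"/>
  <xsd:element name="LastName" type="xsd:string" substitutionGroup="Surname"/>
  <xsd:element name="FamilyName" type="xsd:string" substitutionGroup="Surname"/>
</xsd:schema>
"""


def synonyms_by_head(schema_text):
    """Map each head element name to the names declared as its substitutes."""
    groups = defaultdict(list)
    root = ET.fromstring(schema_text)
    for element in root.findall(f"{{{XSD_NS}}}element"):
        head = element.get("substitutionGroup")
        if head is not None:
            groups[head].append(element.get("name"))
    return dict(groups)


print(synonyms_by_head(SCHEMA))
```

The sketch only reads top-level declarations in a single schema document; it does not resolve prefixes, so it does not address the cross-namespace case raised in the discussion.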

 

Ms. Hayes asked why they chose not to follow the convention of including “type” in simple type and complex type names.

Mr. Hopkins said their initial thought was that “type” was reserved for ebXML, but they would be interested in what the group thinks.

Ms. Hayes asked what happens if the semantics are slightly different.

Mr. Hopkins said it wouldn’t affect the name.

Ms. Hayes asked how close the meanings have to be for elements to form a substitution group.

Mr. Hopkins said they would have to be semantically synonymous.

Mr. Houser asked whether they could be declared across namespaces.

Mr. Hopkins said that still must be dealt with.

Mr. Houser asked whether the namespace has been defined.

Mr. Hopkins said they are using the COE namespace.

Jim Disbrow asked why there are no specific attributes.

Mr. Hopkins said this is just an example, but you would want to define that in the schema.

 

The Guide says you should capture as much information as reasonably possible in either the schema itself or a referenced guide.

 

Mr. Ambur commented that the greatest value of providing additional documentation at xml.gov is to accelerate a standard reg/rep.

Mr. Houser asked if you could use the external guide as a notation.

Mr. Hopkins asked Mr. Houser to clarify his question.

Mr. Houser said to make it interoperable.

Mr. Hopkins said it is a free form guide, but we can get into specifics of how to make it automated later.

Ms. Hayes asked do you have a convention for stating the meaning of the code from a code list.

Mr. Hopkins said I’ll cover that later in the presentation.

David Eng asked what are some implications in the real world.

Mr. Hopkins said it is the application guide. It is so analysts and developers can use it to get information. One of the things DON is working on is a taxonomy. The long-term plan is to have a reference to the taxonomy in the schema.

Mr. Eng said standardization of the issue rather than the element.

Mr. Hopkins said that defining what to standardize will probably be based on core components, which can then be extended, but he added the caveat that this was his opinion.

 

A Vitria representative commented that this is great work and asked how much DON XML work had been done before the Guide was released.

Mr. Hopkins said XML work is happening at a grass roots level, but no major initiatives have occurred.

A Vitria representative asked about the status of already developed XML and what happens if Boeing or another vendor uses another standard.

Mr. Hopkins said nothing in the Guide precludes you from referencing another namespace.

 

Mr. Houser asked what is the Webex regular expression.

Mr. Hopkins said it still needs to be standardized.

Jim Whitehead said LDAP may have defined a regular expression standard.

XML conventions must provide a definition.

 

Mr. Hopkins wanted to get people thinking and will continue to improve the document with best practices. He is not really happy with the current versioning guidance.

 

Mr. Houser said on versioning, Jim Whitehead has been working on DeltaV on namespace collections.

Mr. Whitehead said yes, you get this, but you want to keep it in your document.

Mr. Hopkins said it needs to be XSL and XSD.

Mr. Whitehead said it has consistent configuration of objects, so DeltaV could help.

Mr. Houser said there are multiple XSL associated with one schema.

Mr. Crawford referenced part 10 in EDI outer envelopes and said what they’re trying to do is get their people in this same frame of mind so there will be a smoother transition.

Ms. Hayes asked are you requiring a system identifier and not a public identifier.

Mr. Hopkins said yes that is a good thought; a resolvable URN may be better.

 

Mr. Royal noted that the Guide says MUST and asked whether the document is that fuzzy.

Mr. Hopkins said they tried to give some guidance without making it restrictive.

Mr. Jacobs said remember this was a rush to press job to meet a Web Task Force deadline; so this was better than not having any guidance.

 

Mr. Houser asked whether a document would be subject to 5015.

Mr. Hopkins said he was not sure and referred to Mr. Jacobs.

Mr. Jacobs said he was unsure.

Mr. Ambur said the answer is that it should be.

Mr. Houser clarified that it is the law that regulates documents.

 

The current guidance on attributes versus elements is that attributes are discouraged. Attributes should be used with consideration. COE is the registry guidance.

 

Mr. Eng said metadata is needed for data quality, which is a characteristic of the incoming data.

Mr. Crawford added that attributes are tolerable, but I would limit my use of attributes due to parsing.

Mr. Eng asked what if you apply this to data validation such as Schematron.

Mr. Hopkins said then you’re tying a standard to a document.

Mr. Eng asked could this be done by a data generator.

Mr. Hopkins said the jury is still out on that issue.

Mr. Eng asked what highlights the fact that this could be useful.

Ms. Hayes cited a MITRE performance comparison of parsing elements versus attributes and said she believes that elements parse better.

Mr. Hopkins said that’s because attributes are white space normalized.
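
The normalization Mr. Hopkins refers to can be observed with a conforming parser: literal line breaks and tabs inside an attribute value are replaced with spaces, while the same characters in element content are passed through unchanged. The sketch below uses Python's standard library parser with a made-up Note element; it shows the behavior only and makes no claim about parsing speed.

```python
import xml.etree.ElementTree as ET

# The attribute value and the element content both contain a line break.
DOCUMENT = '<Note text="line one\nline two">line one\nline two</Note>'

note = ET.fromstring(DOCUMENT)
attribute_value = note.get("text")  # line break replaced by a space
element_content = note.text         # line break preserved

print(repr(attribute_value))
print(repr(element_content))
```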

 

Mr. Hopkins shared a “Possible Vision of the Future” for an overall XML architecture. This picture is on page 25 of the presentation.

 

Future plans for the guide are to incorporate comments from this meeting.

 

Ms. Hayes asked if the Guide addresses the Lang tag.

There was also a question regarding engineering guidelines for SOAP.

Mr. Hopkins asked if they meant was there any guidance.

Ms. Hayes said she would like the services to move that forward.

Mr. Royal asked if anyone knows about UDEF, the Universal Data Element Framework, a hierarchical framework. Mr. Royal further explained that it’s a UID BIE that includes the content of how they are used in an XML document. Mr. Royal’s contact is from Lockheed.

Mr. Hopkins said he never heard of it before; the DON guidance is to include a UID with every element in the document that is part of Best Practices.

Mr. Royal said UN/CEFACT has not given much thought to UID.

Mr. Crawford said that is outside the scope of the Core Components group; but he hasn’t talked to Lisa as of late as far as Registry stuff is concerned.

Mr. Royal said that might fit better in the BIE.

Mr. Crawford said there is probably a need in both places. It’s in the latest spec that is ready to come out. We want a relationship between BIE, the Core Component would have the core UID that all BIE link back to; that’s how you get the link back in to the process model.

Mr. Royal said let’s follow-up later on this topic.

Mr. Royal gave congratulations on a great guidance document.

 

Mr. Ambur said we want to do anything at XML.gov to benefit others.

Ms. Hayes asked has this been vetted through the DON.

Mr. Jacobs said Mr. Porter, the DON CIO, has signed it out.

Mr. Hopkins said it is compatible with the Web Task Force, but he is certainly open to suggestions.

 

 

WebDAV & DASL Remote Collaborative Authoring and Electronic Records Management

Ted Weir, BLM, & Jim Whitehead, University of California, Santa Cruz

Mr. Ambur introduced Mr. Weir who explained some of his background and then introduced Mr. Whitehead.

 

Mr. Whitehead gave a presentation on WebDAV.

 

What is WebDAV? It is an application layer network protocol developed by the Internet Engineering Task Force (IETF). WebDAV is used for remote collaborative authoring. Many vendors have adopted this protocol. It has strong open source support and strong vendor support. A user puts a document on a web site URL making it possible for another collaborator to access it.

 

Mr. Jacobs asked whether WebDAV has a versioning capability like the one in Word, and whether WebDAV works with that feature.

Mr. Whitehead said within the document, versioning is transparent to WebDAV.

An attendee asked how other people deal with a document once it is locked.

Mr. Whitehead responded WebDAV uses a locking protocol. Every 30 seconds Word polls the document to see if it’s locked. In order to reduce scope initially we did not take on event notification.

John Weiland asked what happens if you shut off your computer while it is locked and how will you get the document.

Mr. Whitehead responded that Word only locks a document for two minutes, so if a lock refresh message is not received within this time, the lock evaporates.
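
The timeout behavior described in this exchange can be modeled as a lease that lapses unless it is refreshed. The sketch below is a simplified in-memory model, not a WebDAV implementation: the class name LockLease is hypothetical, the 120-second lifetime is taken from the two-minute figure quoted above, and time is passed in explicitly so the example does not depend on a clock.

```python
class LockLease:
    """Toy model of a write lock that expires unless refreshed in time."""

    def __init__(self, owner, acquired_at, timeout_seconds=120):
        self.owner = owner
        self.timeout_seconds = timeout_seconds
        self.expires_at = acquired_at + timeout_seconds

    def refresh(self, now):
        """Extend the lock; fails if the lease has already lapsed."""
        if self.is_expired(now):
            return False
        self.expires_at = now + self.timeout_seconds
        return True

    def is_expired(self, now):
        return now >= self.expires_at


lease = LockLease(owner="author-1", acquired_at=0)
refreshed_in_time = lease.refresh(now=90)     # client still running
still_locked = not lease.is_expired(now=200)  # 200 < 90 + 120
# The client is switched off and sends no further refresh messages.
lapsed = lease.is_expired(now=211)            # 211 >= 210
```

In the protocol itself the requested lifetime travels in the Timeout header of the LOCK request, and the server decides the value it actually grants.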

Greg Portnoy asked what happens when you’re working with multi-megabyte files, which sometimes take 10 minutes to upload.

Mr. Whitehead responded that’s a server configuration problem.

Mr. Ambur said no one wants to read that large of a document; it should be broken into smaller components.

Mr. Royal asked if you could give ownership within a document.

Mr. Whitehead responded WebDAV doesn’t allow you to do that. You would have to divide the document into different chunks. WebDAV doesn’t do that because it would need to know the internal document structure, and a WebDAV server would have to embed this knowledge into its implementation. Word could have a shared lock since that application has more sophisticated knowledge of the internal document.

Mr. Jacobs asked have any applications implemented this.

Mr. Whitehead responded no.

Ms. Hayes asked is there a concept of ownership of parts.

Mr. Whitehead responded no the client application would have to do that.

Mr. Eng asked how are conflicts resolved.

Mr. Whitehead said versioning does allow people to work in parallel and then you may have a merge problem; we delegate that to the particular tool’s merge capabilities.

 

Mr. Whitehead shared various examples of collaborative development efforts and the practical uses for WebDAV. There are many different visions of what WebDAV is. Participants in WebDAV see it as a collaborative authoring tool, a Web-based network file system, a data integration technology, a remote software engineering infrastructure and a replacement protocol. All these views are correct. Mr. Whitehead then provided examples of each of these visions in current software. There are many facets to the WebDAV work. It’s a collaboration infrastructure, metadata repository infrastructure, namespace management infrastructure, access control infrastructure and a searching infrastructure. The search function is called DASL.

 

Mr. Royal asked is that how you would do the segments.

Mr. Whitehead said that’s one way you could do it.

Mr. Portnoy said, so you’re not actually working on Word on the Web.

Mr. Houser asked what the security story was.

Mr. Whitehead said the security story is that WebDAV leverages http and can be sent over SSL. Digest authentication is required, but we have received pushback on this authentication scheme, because some implementations of Digest require storage of the password in essentially clear text form, and that isn't acceptable to many customers. Running WebDAV over SSL is the preferred approach. Those are just the standard mechanisms; others can also be used with WebDAV. Authentication can be applied to any of the methods.
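
The pushback on Digest can be seen from the arithmetic of the scheme. In its basic form (RFC 2617 without the optional qop extensions), the client proves knowledge of the password by hashing it together with the realm, a server nonce, and the request line. To check that proof, the server must hold either the password itself or the intermediate hash HA1, and HA1 alone is enough to produce valid responses. The sketch below uses made-up credentials and function names.

```python
import hashlib


def md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def ha1(username, realm, password):
    """Value the server must be able to produce: MD5(username:realm:password)."""
    return md5_hex(f"{username}:{realm}:{password}")


def digest_response(ha1_value, nonce, method, uri):
    """Basic Digest response: MD5(HA1:nonce:MD5(method:uri))."""
    ha2 = md5_hex(f"{method}:{uri}")
    return md5_hex(f"{ha1_value}:{nonce}:{ha2}")


# Hypothetical values for illustration only.
stored = ha1("author", "docs@example.org", "s3cret")
client_proof = digest_response(stored, "abc123", "PUT", "/specs/guide.doc")
print(client_proof)
```

Because anyone who obtains the stored value can compute client_proof for a fresh nonce, the stored value has to be protected as carefully as the password itself.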

Mr. Royal asked are there any comparisons of http v ftp.

Mr. Whitehead said http usually beats ftp because http only sets up one TCP connection. The advantage is greatest for smaller files and diminishes for large files. For procurement documents, ftp is much faster.

Ms. Hayes said ftp with compression was better.

Mr. Whitehead said common ftp implementation typically requires a log-in, but often http provides better security.

 

Mr. Eng asked if a .pdf file could be searched with DASL.

Mr. Houser said .pdf wouldn’t be XML compliant.

Mr. Eng said I’m looking for a search algorithm.

Mr. Houser said that’s probably out of scope for WebDAV.

Mr. Whitehead said it is primarily for text, not .pdf.

Mr. Eng said you are compliant with Oracle and asked if it can go in and do a search for XML tags.

Mr. Whitehead said DASL would not handle your situation, but you could extend it to make it accommodate your needs.

Mr. Eng asked if the author does some tagging, could the server go out and do the query to do that.

Lex Poot asked if it keeps track of the lock.

Mr. Whitehead said it depends on what the web server is and how it reacts to an unexpected shutdown.

Mr. Whitehead said core web must support digest and should also support authentication. We are working on a follow-on protocol. All WebDAV services have some sort of access control. We are walking a fine line from replicating LDAP.

Mr. Royal said he is looking for single sign-on.

Mr. Whitehead said WebDAV doesn’t have a good single sign-on.

Jim Disbrow said he never figured out how to configure a server to allow saving back to it; the MS work seemed to disappear.

 

Ms. Hayes said DAV makes it seem that you have a strategy for structuring documents. She asked is this a misnomer.

Mr. Whitehead said that when they got involved in document management meetings, the view there was that this is a low-end document management tool with a highly functional core. WebDAV server implementations are not integrated with high-volume scanners, which is more of a system integration issue. This is a protocol. The other factor is that the competing standard was looking to create an API for middleware and a bunch of document management systems. That effort took too long and was too complex. Then some political aspects came up, and that initiative hasn’t been finished. In 1996-99, the Web was the new technology and it excited people, which helped WebDAV.

Ms. Hayes said there is no document structure or document metadata.

Mr. Whitehead said we just want to treat documents as text. More could have been done with metadata, but it seemed like that was being covered in other places. We’re starting to take a greater interest so they need to be more clear on namespace technology.

 

For more information go to http://www.webdav.org/.

 

 

Co-chair Owen Ambur asked for other announcements and reminded the attendees to sign the roster.

 

Mr. Whitehead’s workshop on interoperability of records management is at Archives II. The meeting will explore what would be involved, whether there are people willing and able to work on it, how to frame the problem, and where the work would be done.

Mr. Ambur asked if people planning to attend should read any documents ahead of time.

Mr. Whitehead suggested people be familiar with 5015.

 

Mr. Ambur asked whether this room is OK for the next Federal CIO Council meeting and mentioned that someone from Microsoft is attending.

Mr. Royal said this is the largest conference room that GSA has, but the meeting could be hosted someplace else.

Mr. Ambur asked whether others shared his concern that too many people would attend, then deferred the issue of the room.

 

Jon Bosak has expressed an interest in coming to speak; anyone with other ideas for the agenda should let Mr. Ambur know.

 

LMI is hosting a strategy meeting on Friday, November 16; go directly to the second floor.

 

 

 

Last Name     First Name    Organization
Ambur         Owen          Interior-FWS
Billups       Prince        DISA
Crawford      Mark          LMI
Disbrow       Jim           DOE
Eng           David         EPA
Gaven         Tom           Vitria
Hart          John          Vitria
Hayduk        Matthew       Dyncorp
Hayes         Glenda        MITRE
Hopkins       Brian         Logicon
Houser        Walt          VA
Jacobs        Michael       DON CIO
Johnson       Hilda         i4i
Luo           Haiping       VA
Masri         Sa'ad         Dyncorp
McKeever      David         i4i
Poot          Lex           DTS
Royal         Marion        GSA
Sinisgalli    Mike          Vitria
Stanco        Tony          GW CPI
Weiland       John          NMIMC
Weir          Ted           BLM
Yee           Theresa       LMI