Federal CIO Council

XML Working Group

 

Wednesday, July 17, 2002 Meeting Minutes

 

GSA Headquarters

18th & F Streets, N.W., Room 5141

Washington DC 20405

 

Please send all comments or corrections to these minutes to Glenn Little at glittle@lmi.org.

 

Mr. Owen Ambur opened the meeting by introducing himself and briefly explaining the day’s focus:

 

Mr. Ambur:  When I sent out the notice of the final agenda I got a lot of messages about people on vacation. We do have some key guests here, three important people from NARA. Our lead speaker is here, and I know we have at least one more coming in. I’ll start off by introducing myself and my interest in XML. I’m Owen Ambur, and I’m the co-chair of this group. I’m from the Fish and Wildlife Service, and two original proposals I made led to the formation of this Work Group. The first was to render all Government forms in XML and gather the data from them in XML, and the other was to use XML meta-tags to classify and manage electronic records Government-wide.  So I have a strong interest in the primary topic of our meeting this morning.

 

[Each participant then introduced him/herself and advised the group members as to his/her interest in the Work Group’s activities.]

 

Mr. Ambur:  I have just a few announcements: for anyone who didn’t catch it, the GSA/IDEAlliance seminar scheduled for tomorrow has been cancelled. I may want to reschedule a scaled-down version of it at one of the regular monthly meetings of the Working Group.  I posted a notice of the cancellation on the XML.gov home page and list serve, but others who have heard of it by other means may not be aware of the cancellation.

 

Next month we’ll be discussing the brainstorming ideas raised in follow up to GAO’s previous briefing concerning the Executive Branch’s implementation of XML.  After GAO’s presentation, we brainstormed more than 60 ideas but we didn’t have time to rank and rate them. We’ll continue that discussion next month.  In the meantime, Shanti Rao of Raosoft will put up a Web survey for anyone who may wish to participate in rating the importance of each idea.  At the meeting, he’ll talk about XML-based E-forms, interactive XML Editors, and the results of his survey.  During the rest of the time after break, we’ll discuss the ideas further. We may also have an update on the Global Justice Information Network schema, which relates closely to Homeland Security.

 

Mr. Marion Royal:  I want to point out it’s not a Government survey.

 

Mr. Ambur:  Yes, under the Paperwork Reduction Act, if we survey 10 or more people we need OMB clearance. This is Shanti’s survey, and it is part of his presentation demonstrating the potential and benefits of Web survey technology.

 

With reference to the What’s New section of the XML.gov site, most of you know of the Namespace Management Workshop at NIST on July 26.  It relates closely to the XML registry.  Also, we are working with ITAA to plan an event tentatively scheduled for October 22 at the Reagan Building.  They want to help us focus on practical implementation issues, i.e., what can be done at this time.  We’re also planning an event with CENDI, whose focus is scientific and technical information.  October 29 or 30 are the tentative dates, but they’re not firmed up yet.  I also want to mention the Information Sharing & Homeland Security Convention scheduled for August 19 – 21 in Philadelphia, at which there will be an XML session.  Lastly, we were contacted by Frank Olkin of Lawrence Berkeley National Laboratory, who is pursuing the specification of an XML vocabulary standard for units of measurement.  I sent a message to Karl Best to see if OASIS has anyone involved in such an initiative.  Frank is already working with NIST.  Are there any other announcements?  If not, then we’ll go right to the GAO report on e-records management challenges. Our first speaker is Mirko Dolak.

 

Mr. Mirko Dolak:  Thank you, Owen. This has been a lengthy and fascinating assignment, with the by-product of an appendix on XML metadata and technology. I know that most people aren’t XML experts, but I believe that XML is a transformative technology. The question of “where do we go?” is still unfolding. Jamey Collins did most of the work and will present our observations. We have copies in the back of the room for anyone who’d like one. Please don’t hesitate to interrupt if you have any questions. Jamey?

 

Presentations:

 

Mr. Jamey Collins

U.S. General Accounting Office

“XML and the Long-Term Preservation of Electronic Records”

 

Slide 2  [Preservation Challenges]:  Archives have been preserved electronically for some time. Most have been in limited formats—for example, NARA, since the 80’s, has been preserving databases by turning them into flat files. That reduces functionality, and it does not take into account the formats available today. Because of that, these files are at risk of being lost. Not all of that is due to the archives. Some is an issue of preserving data for the long term and identifying what’s appropriate at the Agency level. If they’re not identified, they’ll be lost. We’ve seen this within Federal Agencies. Unless Agencies do it themselves, many of these files won’t get archived. At the end of the Clinton Administration, they took a snapshot of the websites, which left behind little functionality.

 

Slide 3  [Preservation Approaches]:  Here’s a quote from Jeff Rothenberg about a viable approach:

A viable strategy for long-term preservation of electronic records calls for a solution that does not require continual heroic effort or repeated intervention of new approaches every time formats, software or hardware paradigms, document types, or record-keeping practices change.

 

The big issue is the ability to save records and keep them accessible through paradigm shifts. As formats become obsolete, that is difficult, both because of the obsolescence itself and because of the volume of electronic files created.

 

Slide 4  [Preservation Approaches]:  Despite our advances, there’s no solution yet. That leads to a variety of approaches that fall short of solving the problem, leaving archives to use a combination of approaches to tackle it. Without a viable solution, they have little choice; without doing it, records will be lost, and that defeats NARA’s mission. There are basically five strategies:

  1. Technology preservation
  2. Emulation
  3. Migration
  4. Encapsulation
  5. Conversion to standard formats.

 

Slide 5  [Technology Preservation]:  This means maintaining the original technologies needed to access old formats. It is simple and viable in the short term, but as time passes it becomes increasingly costly and difficult: the old hardware inevitably fails and must be replaced, and it is expensive to maintain compared to its more efficient modern counterparts.

 

Mr. Royal:  That means you have to maintain old 10-stack disk drives.

 

Mr. Collins:  There are other approaches, but for the most part, yes, you preserve it as it was.

 

Mr. John Dodd:  There hasn’t been a lot of research on that. I worked on a NASA task where we used means to prevent that. The job is to have metadata built in. Creating archival metadata standards is one way to do it.

 

Mr. Collins:  If you combine this with encapsulation down the road, it might be better.

 

Mr. Dolak:  This approach is for organizations that require access to records. They’re not managing records any more, but they need to keep them alive until a better way can be found to make them available.

 

Mr. Collins:  If too much time passes, there’s a lack of familiarity with systems. Obviously, this is only a short-term solution.

 

Slide 6  [Emulation]:  Emulation mimics the functionality of hardware and software. The file is saved, and the environment around it is recreated. One saves data files with the application, and emulates the operating environment. This is suitable for certain types of records, but one still must overcome obstacles if it is to be used systematically in an archive.

 

There are property-rights issues with the applications and systems used. Those specifications most likely won’t be turned over for archival use. There are also issues of user familiarity, and problems similar to the Y2K experience: though not quite the same, as time passes bugs will be found, with an increased chance of failure. Today emulation has not been applied in a large-scale archive; it’s fairly immature. As with anything, the more components are involved, the greater the risk. You need to have the file, the application, and the environment in order to access the document, but it’s one of the strategies under consideration at the UK PRO.

 

Slide 7  [Migration]:  The transfer of files from one format or configuration to another is called migration. It refers to migration of formats and media to maintain data on the most effective, cheapest source medium. The process is one of format conversion—the structure of the data elements is rearranged. It’s risky, because you risk altering the data file, so there’s a chance of data loss. It requires extensive knowledge of the data and source to find problems when they occur. It’s also risky because the chances are you’ll have to migrate again when a new format comes along, and each successive version implies more risk. Many archives use this as their approach. Because so many electronic archives are so young, most are dealing only with today’s formats; migration exists as a future solution, since today’s documents are still available today.

 

Mr. Royal:  Will this extend past a few years?

 

Mr. Collins:  Yes.

 

Mr. Royal:  But the electronic part is the one we’re faced with now.

 

Mr. Collins:  Yes.

 

Slide 8  [Encapsulation]:  Encapsulation is a programming concept defined as combining several elements to create one unique entity. Here it is done using XML containers, and we’re seeing that approach taken now. Encapsulation may include multiple files—any series of files that constitutes an entire record. The record is saved in its original format, or converted using the next approach. Saving in the original form allows us to keep the metadata, so that later you can access the record and apply any strategies that are found along the way.
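The container idea the slide describes can be sketched in a few lines. This is an illustrative example only, with hypothetical element names rather than any published encapsulation standard: it combines a record’s files and metadata into a single XML entity, with binary payloads Base64-encoded so the container stays text-based.

```python
import base64
import xml.etree.ElementTree as ET

def encapsulate(files, metadata):
    """Combine a record's files and metadata into one XML container.

    `files` maps filename -> bytes; `metadata` maps element name -> text.
    All element names here are hypothetical, not from any standard.
    """
    record = ET.Element("EncapsulatedRecord")
    meta = ET.SubElement(record, "Metadata")
    for name, value in metadata.items():
        ET.SubElement(meta, name).text = value
    content = ET.SubElement(record, "Content")
    for filename, data in files.items():
        entry = ET.SubElement(content, "File", attrib={"name": filename})
        # Base64-encode binary payloads so the container stays pure text.
        entry.text = base64.b64encode(data).decode("ascii")
    return ET.tostring(record, encoding="unicode")

xml_text = encapsulate(
    {"report.doc": b"\x00binary bytes\xff"},
    {"Title": "Quarterly report", "Created": "2002-07-17"},
)
```

Because the result is ordinary XML text, any future tool that can parse XML and decode Base64 can recover both the metadata and the original files.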

 

Mr. Dolak:  Let me point out that encapsulation allows us to preserve in the native format, in the hope that later, technology would let you reformat in a more appropriate way.

 

Mr. Royal:  That only allows us the “library” function of identifying records, as opposed to using them.

 

Mr. Dolak:  Yes.

 

Slide 9  [Conversion to Standard Formats]:  This is ideally a format that will be used and accepted and will persist, and that is also open—for example XML or ASCII. This decreases the dependence on particular hardware or software, and increases the likelihood of future accessibility. But in selecting a standard format, it’s hard to determine its longevity. For example, PDF now might pose problems down the road. This is compounded by the fact that archives have to preserve information for centuries, which is longer than most of the creating Agencies need to retain it. The most important consideration is how well the format replicates the original document. This is key in the archival world.

 

Mr. Ambur:  Regarding PDF, I’ve heard that Adobe is proposing a stripped down version as an ISO standard. Does anyone have any more information on that?

 

Ms. Lisa Weber:  Yes, it will be proposed as an ANSI and ISO standard, called PDF 8. Carol Brock is the contact person.

 

Mr. Brand Niemann:  You have a plug-in for the data. It works with Acrobat 5. You save it as XML, with or without a style sheet.

 

Mr. David Hofert:  What structure is retained?

 

Mr. Niemann:  We have our Electronic Record and Documentation task force working in cooperation with NARA, basically to understand, now that Acrobat has started to support it, how you can get to XML. There are three types of PDF:

  1. Unstructured
  2. Structured
  3. Tagged.

 

You can’t get structured information from non-structured. They’re talking about the strategy to go from the first two types to the third type. “What do you do with the first two types?” There’s lots of legacy content. They have custom mapping, so if you don’t like their way you can tweak it.

 

Mr. Ambur:  I’d also like to note Susan Turnbull’s interest from the standpoint of Section 508 and the accessibility of E-records to persons with sensory disabilities.

 

Mr. Mark Miller:  Last night I spoke with Microsoft. They conveyed that in the next version of Office, every document will be saved as XML.

 

Mr. Niemann:  It’s already present in XP. Tomorrow’s presentation deals with that. You can go from Word, Access and Excel, and it’s there. There are three strategies to go from Word to XML:

  1. Download the Visual Basic plug-in
  2. Use a third-party tool like XML Spy
  3. Commission Microsoft to define one.

 

One example is the Wall Street Journal. It has one where their Word is creating an XML repository and the users don’t even know.

 

Slide 10  [Documenting Archived Records]:  Any strategy has to take records and collect the information needed to keep them accessible in the future; that information is known as metadata. Archives have always used catalogues, but more extensive information is required for electronic archives. Some examples are how, when, or why a record is created, updated, or changed. “How do you open up access for records?” That complicates the metadata structure. It’s a formidable task for anyone attempting to create a viable electronic archive.

 

Slide 11  [XML’s Role]:  This is where XML comes in. Researchers from NARA are considering it for how to retain information. It’s why so many industries are considering it. It’s flexible, able to accommodate needs as technology changes; it’s non-proprietary; it provides a platform for standard tags; and it’s text-based and easy to transmit over networks. NARA is shortly to go to a decentralized network. It’s also human-readable, which adds to its longevity, because in addition to being parsed by the computer, you can view and print it on paper.

 

Mr. Dolak:  That doesn’t imply that it can be read by humans, though.

 

Slide 12  [XML in Public Records Office Victoria]:  Let me go over the Victorian Electronic Records Strategy. It uses encapsulation and conversion. It was started in 1995, when they realized that longevity was a problem and that they were facing obsolescence. They enlisted Ernst and Young to investigate COTS strategies, and didn’t find any. In 1998, they started a proof of concept based on a generic long-term format. It doesn’t provide the answer for all formats. It specifically addressed those records that would be lost, rendered obsolete, or mismanaged—those created on Government desktops.

 

Mr. Niemann:  Can you give us an example of formats that don’t work?

 

Mr. Collins:  They try to preserve all office documents, but it doesn’t work with complex documents such as accounting documents. There’s no way to access them.

 

Mr. Niemann:  That’s where new XML forms standardization comes in. Separate content from presentation and provide standardization.

 

Slide 13  [XML in Public Records Office Victoria, continued]:  The outermost layer is a container based on XML. The metadata is defined by the National Archives of Australia, though not all of it is suited to the needs of the Victorian Government. The strategy retained all 140 elements of the NAA standard; of those, 30 are mandatory, and the majority can be automatically generated or populated using defaults. Within the container, one can retain single or multiple records. The original is maintained in PDF for access down the road. The final encapsulated record is text-based: the content is encoded with a Base64 algorithm, producing a text file wrapped within the record’s metadata. This is a young archive, so as of now I can’t say it’s entirely viable. The formats are not yet obsolete, and no one has yet had to access the records down the road, so we won’t know for a while how successful it is.
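The Base64 step mentioned above can be illustrated directly. The sketch below assumes nothing about VERS itself; it only shows why the encoding keeps the encapsulated record text-based without losing information.

```python
import base64

# Stand-in for the bytes of a binary original (e.g. a PDF): every
# possible byte value, including ones that are not printable text.
original = bytes(range(256))

# Base64 encoding yields plain ASCII text that can sit inside an
# XML element, keeping the whole encapsulated record text-based.
encoded = base64.b64encode(original).decode("ascii")

# Decoding recovers the original byte-for-byte, so nothing is lost
# by storing the record in a text-based container.
recovered = base64.b64decode(encoded)
```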

 

Slide 14 [Encapsulated Record with XML Metadata]:  So you see that this is really an encapsulated object wrapped in XML. The gentleman brought up digital signatures. They are included in the final record. It’s a system time stamp and a session document that you can view to ensure no changes were made on the record.

 

Mr. Ambur:  With reference to Adobe’s intent to establish a stripped-down version of PDF as an ISO standard, we need to figure out if that makes sense.

 

Mr. Niemann:  It doesn’t make sense for the Government to keep using PDF.  There’s a transition that makes it much more difficult.  That’s the paradigm shift that needs to occur.

 

Mr. Collins:  Sometimes you capture the direct representation; sometimes a PDF is good.

 

Mr. Niemann:  You can reproduce it in style sheets. The only time PDF makes sense would be when you have a large amount of material and you don’t have time to convert it. You want to create an index. We have terabytes of PDF in the Superfund, and we don’t have the resources to convert it. For large volumes of information we’re looking at using an XML index.

 

Ms. Susan Turnbull:  It’s useful to look at NIST records. Seven years ago, they had a committee look at establishing PDF as their standard. They decided against it.

 

Mr. Collins:  Yes, and the slide shows its limitations. You have an Excel document saved with no guarantee of access down the road.

 

Slide 15  [Where Do We Go From Here?]:  There’s no solution yet, but XML is promising for the longevity of electronic records. I can be corrected on this, but we think that the new DoD repository is using XML. Part of the problem is coordinating with industry on document management systems and electronic records systems. The standard in existence doesn’t tackle the problem of metadata or the use of XML.

 

Mr. Ambur:  Regarding the 5015.2 standard, Jim Whitehead, the Father of WebDAV, has graduate students who have rendered the records management metadata elements in a draft XML schema.  Meanwhile, the 5015.2 standard is undergoing enhancement to incorporate additional requirements, so the schema will be extended.  Are there any questions for Mirko or Jamey?

 

Mr. Miller:  Gartner has a July 11 report on this.  Has anyone seen it or does anyone have any comment on it?

 

Mr. Dolak:  Is the report available?  We’d like to investigate.

 

[No one indicated familiarity with the report.]

 

Mr. Miller:  Well, the Gartner access isn’t free, so I didn’t expect that people would be familiar with it.

 

Mr. Dodd:  There’s been lots of emphasis on enterprise architectures. Now the discussion is focusing on business-line architectures. Every discussion I’ve been involved in has had a directory-management part. Some were fairly low priority. They’ve been waiting for guidance for a record management reference model, to see how they could then deploy or integrate that. I think that’s part of an E-record program? Is there a way to use enterprise architecture as a policy to have in every business process model, as a preferred model driver?

 

Mr. Ambur:  Document record management cuts across all business lines. It’s a basic requirement for all business operations. We’ve been working to get record management recognized in IT architectural guidance.  IT folks are focused on moving electrons but don’t commonly recognize the need to preserve history.  Hopefully, the IT architectural guidance scheduled for release soon in support of the eGov Strategy will incorporate record management.

 

Mr. Dodd:  I didn’t see it in the earlier draft.

 

Mr. Niemann:  We’re undergoing an interesting evolution at EPA. We’re about to give States $25 million in grants to jump-start XML data exchange. That’s triggered the next thing—enterprise architecture. In looking at the target architecture, we decided that to be compatible with an XML external network, we must add an XML layer for our own architecture. That’s triggered the realization that they need to include that in their strategy. Until recently, they were going to finalize, then they realized they need to ensure that the strategy interfaces with the XML layer in the Agency.

 

They asked me how to do it, as a practical matter, to challenge me to see how we put all of the Agency’s documents into XML. Two things have come from that. It’s useful to look at individual formats, but you need to look at them more broadly with NARA’s guidance. How do we go from the current authoring to get to XML, and how do we put in place an XML document framework to start authoring in XML, or as close as possible? I recommend that NARA and XML.gov and others foster this by developing best practices and guidance for current formats, and, in future target architectures, a change to a more fundamental XML framework.

 

Mr. Royal:  I agree with respect to the framework. To begin with, we need schemas for record management or storage. Without them, you can’t establish a framework for how to manage it. The challenge is to develop a starting place for the schema.

 

Ms. Susan Cummings:  We think there are four issue areas. EPA is leading on the enterprise-wide one, and NARA is leading on XML schemas, record management, and archiving data.

 

Mr. Royal:  I had a voice mail conversation with Mark to figure out how OMB’s Solutions Architecture Work Group can assist with that. It’s the next step we need to take.

 

Mr. Niemann:  We need to clarify whether it’s schemas for documents, for metadata, or both. We need experience in how to do it. It has an important bearing on what we’re trying to develop.

 

Mr. Royal:  The one danger of that is that agencies will accept whatever is the default in what we have. We’ll end up with multiple schemas, from which we’ll have to provide style sheets to translate them into storage format. If we can get an established schema first, we can tell the vendors “this is the format.”

 

Mr. Dolak:  My interpretation is that permanent records must seamlessly move into electronic archives. This implies manageable XML tags, and no translation sheet. I anticipate a DoD standard established in conjunction with NARA, in which there are mandatory tags that allow seamless transfer, with the assumption that the agency has other tags for their requirements—but to envision 50 or 60 systems that can’t talk to the tags is frightening.

 

Mr. Royal:  Provide a schema with every record, one that can be used by the archives. It wouldn’t have to be a centralized schema, but you would have to retain the one that was used for that record.

 

Mr. Dolak:  The government should follow that guidance.

 

Ms. Dena Cocos:  Do you mean that you store the schema with each record?

 

Mr. Royal:  That’s an option.
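One way to retain the schema used for a record, as discussed above, is the standard xsi:schemaLocation attribute, which lets each record point at its own schema. A minimal sketch with Python’s standard library (the namespace URI and schema file name are hypothetical):

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# A record that carries a pointer to the schema it was authored against.
# The namespace URI and schema file name here are hypothetical.
record_xml = """\
<record xmlns="urn:example:agency-records"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="urn:example:agency-records agency-records.xsd">
  <title>Annual summary</title>
</record>
"""

root = ET.fromstring(record_xml)
# xsi:schemaLocation pairs a namespace URI with the location of its schema,
# so an archive can always find the schema that governed this record.
namespace, schema_file = root.get("{%s}schemaLocation" % XSI).split()
```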

 

Mr. Mike Todd:  DoD is working on the next version of its architecture framework. It will render the architectural namespace such that all products can be rendered in XML. Also, the core architecture data model has been rendered into XML. We’re trying to make headway so that having the framework and data model in XML is the first step. As part of 5015.2, under a federal mandate, graduate students are producing a standard that lays out in plain English how it’s done.

 

We’re also having discussions in DoD about having all of the standard requirements documents, mission statements, and operational requirements documents converted to a template of style sheets: a standard format for requirements and the solutions to requirements. That will allow easy scanning and searching, with the same taxonomy. We’re looking at transforming the way we do business—communicating across the Department using XML. Right now XML seems to be the solution.

 

Mr. Royal:  That addresses XML in the architecture. We’re doing it in our OMB committee also. The point is infusion of record management into architectures.

 

Mr. Todd:  We’re also looking at—we built a model—as I interview people, I ask that the record management officer be present, raising visibility and awareness that almost everything officers issue is a reference that constitutes an official record. We’re trying to integrate the way we look at information. We speak of net-centricity, but we’re trying to plant the seed of information-centricity. The network is just a conduit; the most valued piece of information is the official record. We need to get people out of the mindset that records are archival. They need to understand that this information is a decision point used to determine the official record. Correspondence sometimes becomes an official record. “How do you find information in the context of what’s an official record?” It scares people, because everything might be an official record.

 

Mr. Niemann:  Our agency separates it into three categories:

1.      Relational databases. Leading companies like Oracle can generate XML schemas from database schemas on the fly. That’s straightforward.

2.      The second type is what you might call forms. What you can do with a tool like XML Spy is, with a mixed document, create a style sheet. A person can author it in a WYSIWYG environment, and display it on the desktop or browser XML Spy provides.

3.      The last is where you have content that’s not structured, but you want it in XML. You can get by without schemas for a while, or forever, depending upon how you want to use the documents. As an example of what’s going on in those areas, it brings us the separation of content and presentation that you need for accounting, etc.
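The first category, generating XML from a relational database on the fly, can be sketched with Python’s built-in sqlite3 module standing in for a production database; the table and column names below are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# In-memory stand-in for an agency database; table and columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER, title TEXT)")
conn.execute("INSERT INTO records VALUES (1, 'Site inspection')")

def table_to_xml(conn, table):
    """Render every row of `table` as XML, deriving element names
    from the column names in the cursor description."""
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [col[0] for col in cursor.description]
    root = ET.Element(table)
    for row in cursor:
        row_el = ET.SubElement(root, "row")
        for name, value in zip(columns, row):
            ET.SubElement(row_el, name).text = str(value)
    return ET.tostring(root, encoding="unicode")

xml_text = table_to_xml(conn, "records")
```

Because the element names come straight from the database schema, the XML structure tracks the relational structure automatically as tables change.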

 

Mr. Ken Thibodeau:  When we developed 5015.2, one constraint said that the standard had to apply across DoD everywhere. When you do that, you’re involved at a point where people manage records, but you can’t optimize for any line of business. If you go back to the legislation, it’s not for the archives. The objective was to make agencies more efficient, so if you’re doing a good job of management, you’re helping your agency in its business.

 

The people in the business are defining what the schemas are. That approach is in places like the Patent and Trademark office. They have basic requirements, but they’re fine-tuned for business lines, so the management supports the business. From our perspective at NARA (my perspective), the most reliable records are records that the creators rely on to do business. That’s going to really tell you what they did. If we can improve record management support, we’ll do a better job for posterity. We need to get into the approach of not using generic record management. The requirements have to work within agencies and with CIOs across agencies.

 

Mr. Dodd:  In looking at a balanced approach, DoD talks about balance, guidance, and communities of interest…

 

Mr. Ambur:  It is not an either/or choice -- not just top-down or bottom-up. We need to use both approaches at once.

 

Ms. Cummings:  Both ISO and others talk about the standards.

 

Mr. Hofert:  There’s much value to using the same consistent format. It’s easier to search and extract information. We’ve also heard the case for documents prepared in a particular context, from the XML perspective. Since every XML file has a schema associated with it, whether implicit or explicitly provided (“I have a different one, but here’s the schema so you can process it”), XML inherently supports the ability to have different content and approaches processed. You can say, “If you support the general approach and extend it, please bring your schema, and let’s agree to create the transformation for that.” You can easily facilitate that using XML’s inherent properties. It’s not like having to have the program: anything that processes XML requires the schema, whether inherent or provided.
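The agreed transformation Mr. Hofert describes—mapping a contributed schema onto the common one—can be sketched as a simple element renaming. Both vocabularies below are hypothetical, and a real deployment would more likely use XSLT; this only illustrates the idea.

```python
import xml.etree.ElementTree as ET

# Mapping from one agency's element names to the agreed common ones.
# Both vocabularies here are hypothetical.
NAME_MAP = {"docTitle": "title", "docDate": "date"}

def transform(xml_text, name_map):
    """Return a copy of the document with mapped element names renamed;
    elements not in the map pass through unchanged."""
    root = ET.fromstring(xml_text)
    for el in root.iter():
        el.tag = name_map.get(el.tag, el.tag)
    return ET.tostring(root, encoding="unicode")

agency_doc = ("<record><docTitle>Budget memo</docTitle>"
              "<docDate>2002-07-17</docDate></record>")
common_doc = transform(agency_doc, NAME_MAP)
```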

 

Mr. Royal:  We had a related conversation yesterday. Each agency should establish an XML namespace, have an agency-wide schema—whatever your agency requires for its mission—that could be a reference used by NARA. NARA may still need centralized mandatory elements to transform the information to permanent storage. I don’t know.

 

Mr. Hofert:  There are a lot of things you can agree on—title, date, etc. A namespace is a great way to extend that base of information. It lets you define a robust standard format that everyone can at least use some of.
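Mr. Hofert’s point about namespaces can be illustrated with a record that mixes an agreed core vocabulary (title, date) with one agency’s extension; both namespace URIs below are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs: a shared core vocabulary plus one
# agency's extension namespace.
CORE = "urn:example:core-metadata"
FWS = "urn:example:fws-extensions"

record = ET.Element("{%s}record" % CORE)
ET.SubElement(record, "{%s}title" % CORE).text = "Refuge survey"
ET.SubElement(record, "{%s}date" % CORE).text = "2002-07-17"
# The agency-specific element lives in its own namespace, so tools
# that only understand the core can still use the common elements.
ET.SubElement(record, "{%s}species" % FWS).text = "Grus americana"

xml_text = ET.tostring(record, encoding="unicode")
```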

 

Mr. Niemann:  With the Electronic Record and Document Management Task Force, the first task is to convert proprietary formats to XML. The challenging part is to show how XML can be used for the workflow part of records. It’s a more challenging, advanced use. It’s a standard that’s not as far along in the W3C process. Another example is that there are at least three roles of schemas:

 

1.      Validation—In SOAP messages, when you want to communicate information about content without replicating content everywhere, SOAP communicates information about your records.

2.      Workflow

3.      The third is the ontology. You want to integrate across repositories, and create new documents or databases. The other aspects of XML—workflow, SOAP, and ontology—have to be incorporated as well if you want a distributed architecture. You need SOAP messaging if you want to create new documents pulling together fragments of others. That’s the value proposition. Down the road, you can then mine it for the creation of documents.

 

Mr. Dolak:  An excellent opportunity is Homeland Security. Agencies are or will be orphaned from their own systems, and they may choose to use stovepiped systems. Moving to an enterprise architecture, it would be a massive effort to describe that architecture.

 

Mr. Dodd:  Every piece of information isn’t exposed to the outside world. There are key elements. “What is the pipeline of information exchange between agencies?”

 

Mr. Ambur:   Brand mentioned best practices, which reminds me that the CIO Council’s Best Practices Committee is forming a work group on E-records management, with a regulatory focus.  Regarding databases, XML makes it easy to parse elements for analysis and manipulation in databases, while at the same time storing the original records in inviolate form in E-records management systems.  The Department of the Interior is involved in litigation on Indian Trust Management that relates to this issue.  The Department has databases with the trust information, but it doesn’t have the original records.  The result is that the data in the databases cannot be audited, and since it cannot be verified, lengthy and costly litigation is ongoing.

 

End presentation

 

Mr. Ambur:  Our next presenter is David Hofert, who is here to talk to us about OpenOffice.org.

 

Mr. David Hofert

Manager, XML Emerging Technologies Group (Sun Microsystems)

“OpenOffice.org XML File Format”

 

Mr. Hofert:  I’m helping the OpenOffice group in Germany, so it’s certainly easier for me to come here from Boston. I’d like to talk about the XML format used with StarOffice, now at version 6.0 from Sun. It’s a full office suite, similar to MS Office—it does documents, presentations, databases, spreadsheets, etc. One thing we did with the new version was pursue an XML format to contain all the information. Today I’ll talk about where we are, the format, and some benefits. From your perspective, you could look at it in two ways:

1.      Here’s a product which you can buy from Sun or get free through open source.

2.      The second is that some of the choices we made and the efforts involved could be food for thought. Some of our solutions may be useful to you, such as handling multiple formats, separating content from style, etc.

 

Slide 2  [Agenda]:  As noted, I’ll talk about the format, its details and benefits, and look at where we’re going from here.

 

Slide 3  [Open Format Benefits]:  First and foremost, it’s open. You can go to XML.OpenOffice.org and see everything—the code, DTDs, documentation, mailing list. Does the Government get into open source much?

 

Mr. Niemann:  Susan [Turnbull] is out of the room, but yesterday she had a workshop. You should connect with Susan on that.

 

Mr. Hofert:  It’s a very powerful vehicle that’s evolved over the last several years. People from all walks of life work on it. In this case, developing office documents, they all worked on it, fixed bugs, etc.

 

Mr. Dodd:  The scientific community is heavily involved.

 

Mr. Hofert:  The nice thing is that you (the individual) can steer it as much as anybody.

 

Mr. Ambur:  I understand that NASA is involved in the open-source movement.  NSA is also involved, particularly with respect to security-enhanced Linux.  The NSA folks argue that every other operating system is flawed from a security standpoint.  If you get to the “root,” you have access to everything.  That’s unacceptable from a records management standpoint.

 

Mr. Hofert:  This is fully documented. There’s no mystery. We use open standards where possible; I’ll talk later about the other standards we used. If you go with this, you don’t have to worry about depending on a single vendor.

 

Slide 4  [Suitable for Office Work and Editing]:  It’s an office application, suitable for text, drawings, formulas, presentations, all expressed in XML. XML was the choice because it’s suitable for editing: you retain the document structure. PDF, by contrast, is first and foremost a final format; if you make a change, there’s no way to reflect it. It’s not convenient for editing.

 

Mr. Ambur:  As a specific example, IDEAlliance produced their announcement for tomorrow’s conference in PDF.  I can’t edit it and it’s a record on the XML.gov website, so anyone who accesses it directly can’t tell from the announcement itself that it has been cancelled.

 

Slide 5  [Full XML Implementation]:  This is fully 100% XML, with no tricks. It’s the native format. There’s nothing other than XML here. If you do have things that aren’t XML-like, such as an OLE object that can’t be represented in XML, they’re included as part of the packaging. This means you can use any XML editor to read it, and you can even look at the content in a plain text editor. Everything needed to view the page is there for you to use, in an ASCII format.

 

Mr. Ambur:  This is important regarding Microsoft products. Will we be able to edit records created with Microsoft products with full fidelity using other vendors’ XML-enabled products?

 

Mr. Hofert:  You can use StarOffice to open MS Office files back to 95 or 97; we open those files at Sun ourselves. Fidelity is good. You create the document, and once you save it, it’s in XML.

 

Slide 6  [One File Format]:  It is XML. We’re not storing it in an intermediate format. It’s native. It’s used by a variety of other sister and cousin organizations, like StarOffice, RedOffice, OpenOffice. It’s fully international, even in China and Southeast Asia. We’re growing the consumer side of it.

 

Slide 7  [Other Standards Incorporated]:  There are other standards involved. If you’ve worked with XHTML, you’ll recognize it; we borrowed from that because it’s simple. This is not .GIFs or .JPEGs. It’s drawings. We used SVG, the scalable vector graphics format. It lets you create, for example, a map of a city. Since it’s vector graphics, you can zoom in. When you get to a building, you can click on it. It can contain text, and it can be structured. For those of you in graphics, it’s something to look into. We used XSL-FO, MathML for equations, and XLink and Dublin Core for metadata.

 

Mr. Jack Callahan:  How do you integrate the information? 

 

Mr. Hofert:  It describes the information, fonts, color, etc., in the actual file, so you can look at it any time, and the application reads it back in.

 

Slide 8  [Document Details]:  The document root contains the meta information and the document body. The meta information is standard: description, title, date, etc. The settings for OpenOffice itself allow you to have preferences, such as style declarations like headers, standard bodies, and indented bodies. You have auto styles, and that is handled automatically. This is all under the root.

 

Slide 9  [Document Body]:  In the body, here’s where you see all the content. Paragraphs, headers and lists first; then tables, then sections, then graphics and shapes. This is how it’s structured. There are a few specialized elements. Again, it’s all in XML, available for you.

 

Slide 10  [Sample Body]:  Here’s an XML example. It’s a “hello world” document with “hello” text, and it has a table. This is obviously very simple. The important thing is that it’s regular XML, using namespaces. The namespace designation is included in the root. If you want to work outside the OpenOffice domain, it’s easy enough to pull the text out, whether from the paragraph or the table. You can use a variety of techniques to extract it.
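The extraction Mr. Hofert describes can be sketched with standard-library Python. The namespace URIs and element names below are simplified placeholders, not the actual OpenOffice.org names:

```python
# A minimal sketch of pulling text out of an OpenOffice-style XML body.
# The namespace URIs and element names are illustrative placeholders,
# not the exact OpenOffice.org DTD names.
import xml.etree.ElementTree as ET

SAMPLE = """<office:body xmlns:office="urn:example:office"
                         xmlns:text="urn:example:text">
  <text:p>Hello world</text:p>
  <text:p>A second paragraph.</text:p>
</office:body>"""

ns = {"text": "urn:example:text"}
root = ET.fromstring(SAMPLE)
# Because the document declares its namespaces, any XML tool can find
# the paragraphs without knowing anything else about the format.
paragraphs = [p.text for p in root.findall(".//text:p", ns)]
print(paragraphs)  # ['Hello world', 'A second paragraph.']
```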

 

Slide 11  [Text]:  The text is very HTML-like. We’re trying to keep it simple. The paragraph is the basic text entity, along with headers. Whitespace is handled specially: you don’t have to replicate a run of spaces, and they can be tabbed, so you get some compression there. You can have a variety of types of lists. All of this makes it possible to process the text outside the OpenOffice program.

 

Slide 12  [Styles]:  There are two types of styles: user-defined and automatic. It’s document-wide style. For example, “warning” is defined. When you’re editing in OpenOffice, you click. It’s captured up front. There’s also the notion that when you’re within a paragraph, you may want to modify some text. It’s all captured in XML. It uses the exact same syntax throughout. Everything that can be leveraged is. It looks like Word document styles.

 

Slide 13  [File Packaging]:  When you look at an OpenOffice file (I didn’t know this until I prepared this talk), it turns out it’s a zip of XML files. It’s all laid out; as far as you know, you’re just parsing a file. It doesn’t look zipped. All these XML files are in there. The application unzips it and creates a structure, with the structure and content, meta information, and graphics separated.
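The packaging point can be demonstrated with nothing but the standard library. The part names below mirror the layout described (content.xml, styles.xml, meta.xml), but the XML content itself is an illustrative stand-in:

```python
# Sketch: an OpenOffice document is an ordinary zip archive of XML
# parts. We build a tiny stand-in package in memory and read one part
# back, exactly as an indexer or script could do with a real file.
import io
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
    z.writestr("content.xml", "<body><p>Hello</p></body>")  # document body
    z.writestr("styles.xml", "<styles/>")                   # style declarations
    z.writestr("meta.xml", "<meta><title>Demo</title></meta>")  # metadata

with zipfile.ZipFile(buf) as z:
    names = sorted(z.namelist())
    content = z.read("content.xml").decode("utf-8")

print(names)    # ['content.xml', 'meta.xml', 'styles.xml']
print(content)  # <body><p>Hello</p></body>
```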

 

Mr. Ambur:  Do any of the browsers automatically open an SXI file?

 

Unidentified member:  It’s not going to auto unzip it, but you can do it.

 

Mr. Hofert:  They don’t have a plug-in, but you could create one since the XML is there. You get the compression of the zip, so that’s a benefit, and there’s the option for encryption.

 

Unidentified member:  Also, with scripts and services from other applications, you can do indexing that does that unwrapping. You don’t need to worry about styles, so you can automate some of your tasks more quickly.

 

Mr. Ambur:  This presentation is available on the XML.gov site in PDF and SXI formats.  To use the SXI you’ll have to save and unzip it. Regarding the search function, we’re affiliated with FirstGov search. Will it index the SXI file?

 

Unidentified member:  If you want it to.

 

Mr. Hofert:  You just need to ask them whether they support zip files. It’s such a common format that it’ll recognize it and unzip it. If you want, the first thing you can try is simply to unzip it.

 

Mr. Ambur: I would just suggest that if you expect it to be used, those are things you need to think through.

 

Slide 14  [Benefits]:  It’s easy to parse. Here’s an example from a spreadsheet: a formula for a cell. The value and all the information are captured in the stored format. Again, it’s convenient to use via OpenOffice, but if you’re using another mechanism, all the information is there and available for you. You don’t have to jump through hoops. For example, if you’re looking for a number, you can pull it right out. It’s consistent, so if you can parse one place, you can parse everywhere.
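As an editorial illustration of that point, here is a sketch of pulling a stored value out of a spreadsheet-style cell. The attribute names only approximate the real OpenOffice spreadsheet markup:

```python
# Sketch: the stored format keeps both the formula and the computed
# value on the cell, so any XML tool can pull the number straight out
# without running the spreadsheet. Attribute names are approximate.
import xml.etree.ElementTree as ET

CELL = '<cell formula="=SUM(A1:A3)" value-type="float" value="42.5"><p>42.5</p></cell>'
cell = ET.fromstring(CELL)
print(cell.get("formula"))       # =SUM(A1:A3)
print(float(cell.get("value")))  # 42.5
```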

 

Mr. Niemann:  One issue with Excel 2002 is that it’s not valid XML. It’s what Microsoft calls Spreadsheet XML. It doesn’t validate.

 

Mr. Hofert:  This is valid. Microsoft tends to do it their own way, for whatever reason. Here, the intent was to use native XML so you could use XML tools. The fact is, most people in the business community want to process the information with two or three other tools, for example, searching or extracting data. That’s why these guys went to XML itself.

 

Mr. Todd:  It sounds like two different philosophies: open systems and proprietary. One company said the open-systems approach to buying software is un-American.

 

Mr. Hofert:  Sun got its start through standards, so from day one we’ve gone that way. For example, if you want free use of OpenOffice, that’s fine. If you’d like our support, then you can buy and use our products.

 

Mr. Ambur:  I want the record to show that if companies think they have the right to hold “We the People’s” records hostage to proprietary dependencies, they have another think coming.  Moreover, government decision-makers who buy into the notion of imposing proprietary dependencies upon the public ought to be fired.  They have no business being in public service.

 

Slide 15  [Easy to Generate]:  You can work with OpenOffice and save information using XML, but you can also generate it. Here’s an example of the simplest document you can create. It’s not very complicated. The header is essentially fixed, so you can drop it in.  After that, you just worry about a few pieces of text. We’ve already separated content and style, so you can have an official U.S. Government style: go back to the legacy data, convert it to simple XML, and when you bring it up it’ll be in the official U.S. Government format.

 

You can generate content fairly simply, and that lets you do lots of interesting things. For example, run database queries, collect news feeds and government information, create XML documents, and open them in OpenOffice. You can print, share with Word, etc. You can also update styles: if you now want 2003 for a date, you change the style, and all your documents auto-update. It’s a powerful feature.
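The generation step described above can be sketched in a few lines. The fixed wrapper and element names here are illustrative, not the exact OpenOffice.org names:

```python
# Sketch: generating a simple document from data. A fixed wrapper is
# dropped in, then one paragraph element is appended per piece of text,
# e.g. from a database query. Element names are illustrative.
import xml.etree.ElementTree as ET

def make_document(paragraphs):
    root = ET.Element("document")      # fixed header/wrapper
    body = ET.SubElement(root, "body")
    for text in paragraphs:
        p = ET.SubElement(body, "p")   # one element per paragraph
        p.text = text
    return ET.tostring(root, encoding="unicode")

rows = ["Query result: 12 records", "Generated 2002-07-17"]
print(make_document(rows))
```

Opening the generated file in an XML-aware suite would then apply whatever document-wide style is in force, which is the content/style separation the slide describes.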

 

Slide 16  [XML Transformation Capability]:  Transformation is about display. Since all of it is XML, you can use XSLT, a powerful feature of XML that allows complex transformations from one data format to another.

 

Mr. Ambur:  Which browsers support XSLT?

 

Mr. Hofert:  XSLT is something you use to generate XML.

 

Mr. Ambur:  So you’re talking about server-side transformations?

 

Mr. Hofert:  Yes. You can produce any kind of output: flat text, binary, XML, anything. XSLT is complicated, but powerful, and it’s easy to run XSLT transformations on components. Since OpenOffice is an XML format, the HTML output is pretty good. It’s a nice feature for someone at a remote site, and someone who doesn’t have OpenOffice can also do the transformation.
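The kind of structure-to-structure mapping XSLT performs can be illustrated in plain Python (a real pipeline would use an XSLT stylesheet and processor; the element names here are illustrative):

```python
# Sketch of an XSLT-style transformation, written in plain Python for
# illustration: each source element is mapped to an HTML element.
import xml.etree.ElementTree as ET

SOURCE = "<body><h>Title</h><p>First point.</p><p>Second point.</p></body>"

def to_html(xml_text):
    mapping = {"h": "h1", "p": "p"}   # source element -> HTML element
    out = []
    for el in ET.fromstring(xml_text):
        tag = mapping.get(el.tag, "div")
        out.append(f"<{tag}>{el.text}</{tag}>")
    return "".join(out)

print(to_html(SOURCE))  # <h1>Title</h1><p>First point.</p><p>Second point.</p>
```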

 

There is a tool kit as part of the “Cocoon” project at Apache.org, and AxKit, which uses Perl, has transformations available. There’s another open source effort, I forget where it is, working on “DocBook,” a widely used document format in XML. You can go back and forth: take an OpenOffice document, turn it into a DocBook document, or go back. At the XML level, you have the ability to exchange between XML variants.

 

Slide 17  [More Transformation Benefits]:  Using XSLT is powerful. Traditional filtering is very costly: someone has to create a special program. With XSLT, you can readily find people to write a transformation. Someone writes it, and you’re there. It’s very cost-effective, it’s easy to maintain and to track performance, and it simplifies your development.

 

Slide 18  [Partial Viewing/Editing]:  You can look at pieces of documents easily. If you are doing a query, you can pull fragments, look at them, change them, and put them back. That’s hard to do with a binary format. This is nice for devices. We talked about using XML for phones; say you want to dump text out of a document for viewing on a little device. It’s very easy to do, and many times it’s all you want.

 

Slide 19  [Archiving and Indexing]:  We think XML has legs, because it’s ASCII, and it’s already self-describing. I think it’s a good bet that you’ll be able to read it for quite a while. It has all the elements you need.

 

Slide 20  [Office as Layout Engine]:  When you do generate a file, you can use OpenOffice as a layout engine. You can put the content into XML format, and when you bring it up in OpenOffice, the layout comes up and you can tweak it. It’s easier than writing fancy scripts. You can get into XML, bring it up in OpenOffice, and let it worry about page breaks, etc. I’ve already talked about the ability to edit and change styles.

 

Slide 21  [An Extensible Format]:  It is XML, so it’s extensible. You can add to it; for example, we added a text grid after release. You can do it several ways. We use namespaces throughout to avoid name clashes, and OpenOffice itself is tolerant of alien attributes. If you create an XML file and put in your own information, it will tolerate it, just as HTML does.

 

Mr. Kevin Williams:  If you open it in OpenOffice or StarOffice, will it be preserved?

 

Mr. Hofert:  Yes it will. It will ignore those things it doesn’t know. OpenOffice uses this for backward compatibility. They can use an older version to look at a newer one. It’ll ignore what it doesn’t understand, but won’t get rid of it.

 

Slide 22  [Future Developments]:  Where do we go from here? The plan is for this to go to standardization. There is the XForms work, but this is for an office format. Now we take it to a standards body.

 

Mr. Dodd:  JSR?

 

Mr. Hofert:  No, something like OASIS or W3C. It’s the format itself. This is a great opportunity for you guys. You can help shape it, change what needs to be changed—get involved to the extent you want to and help shape it. It’s somewhat developed. Now let’s go to a standards body and bring it out as a standard. We want to evolve the format, do more layout work. We’re talking about digital signature and encryption. It would be nice if you could sign or encrypt a particular paragraph. Difficult, but nice if you could do that.

 

Slide 23  [Summary]:  The bottom line is that it makes a lot of sense. Developers can take advantage of it, and it makes sense for users because of transformation, both within XML and to different formats. It has longevity. You save money because you aren’t tied to a single application: the content can be transformed to other formats for browsers, PDAs, phones, etc. And since it’ll be a standard soon, you don’t have to be locked into any particular format.

 

Mr. Ambur:  Are there any questions for David?

 

Mr. Williams:  This is great! I see lots of effort toward solid standardization. I’m scratching my head about the SXI format, though. I’m thinking of a Web library of OpenOffice documents and what to index. You can’t do it natively with SXI; you must extract the XML elements. That limits use.

 

Mr. Hofert:  You can save your settings as binary. The SXI file is just a zip file; it contains multiple files. It’s just a way of managing things, a usability feature. The example on the website is just a presentation document. It looks like a single file, which is very convenient, but three files and a subdirectory aren’t as convenient to move around. I think it’s flexible: you can get around zip and unzip and use the unzipped format. I’ll bet someone on the OpenOffice.org site has done that. Zipping isn’t a huge barrier.

 

Unidentified member:  It’s similar to the model described earlier. It takes the binary files associated with the document, which don’t have to include PDF, and binds them into one file.

 

Mr. Bohdan Kantor:  Are there obstacles to using it in a Windows 2000 environment? For example, class path, Unicode, code other than ASCII?

 

Mr. Hofert:  I think it does. As far as 2000, I think it’s fine. I’m running Windows here, and nothing special is happening.

 

Mr. Kantor:  When I use Java I have problems finding some of the tools. You have to put in a lot of the class paths.

 

Mr. Hofert:  Just use the wizard.

 

Unidentified member:  It uses Java, but the core of the application isn’t in Java.

 

Mr. Niemann:  What about support of other XML vocabularies? Specifically, we need GML for georeferences, so when we use a map we can pull up the document. How easy is it to plug in other vocabularies and tags?

 

Mr. Hofert:  I think it’s pretty easy, if you want the program to recognize it. I’m not sure what that takes. If it doesn’t know it, it’ll just ignore it. I’d go to that URL.

 

Ms. Turnbull:  Python and XML make a great combination. Do Python developers like this environment?

 

Mr. Hofert:  I don’t know. The price is right, but I don’t know.

 

Unidentified member:  It’s popular in Linux, Lindows, and Red Hat. Walmart had a deal to sell StarOffice: you pay $99, and you get everything we have access to, both shareware and commercial applications.

 

Mr. Callahan:  What are Microsoft’s plans?

 

Mr. Hofert:  They do export to XML already. One different thing is that it’s not native format. I’m not sure if everything is in there. My guess is that this will push them more toward native XML.

 

Mr. Callahan:  Get convergence.

 

Mr. Hofert:  If this is the format, I’d hope there would be convergence.

 

 

End presentation

 

 

John Turnbull

Senior Program Manager, XML Content Solutions (Corel)

XML Authoring for Government Documents  

 

Mr. Turnbull:  I’m John Turnbull, Senior Program Manager for XML Content Solutions. That covers a wide range—XML authoring, and XML plant management tools as well. I can talk about this in terms of packages if you’d like. I’ve spent a lot of time in the past explaining what markup is all about, so it’s a pleasure to see that you’re explaining things to each other.

 

Let me also say that Corel is Canadian. I come from where we don’t worry about being un-American.

 

Slide 2  [Corel and SoftQuad, a Brief History]:  We started off in ’86, with IBM and SGML, and we made an industrial-strength SGML editor called “Author/Editor.” Then we made money with our HTML editor; we took Author/Editor, changed it to edit HTML DTDs, made money, and lost it.

 

We entered a period when we weren’t making much money. Peter Sharp co-created XMetaL; it was literally four people in Peter Sharp’s basement. That happened in ’96. I was a bystander; I was hired by SoftQuad in ’96. We then came back to launch the first SoftQuad XML editor, and we should be getting version 3 out soon. In March of this year, we were purchased by Corel. They assumed that the future of any document-creation activity would have to be in native XML. They didn’t know how to do it, and they thought we did. Now they’re finding out.

 

Slide 3  [Three Assumptions About XML Authoring]:  We make three assumptions here, and I’d really like you to come back at me with your opinions.

  1. Metadata is more useful in XML files than in styles.
  2. Authors must have a model: the semantics of the process must be expressed to the authors, and the authors must use the semantics of the process.
  3. If they accept that, then, in the sense of a computer algorithm, you can offer back useful behaviors and automate what they do.

 

Slide 4  [Metadata is More Than Style]:  You can “style” metadata, but you can’t “metadata” style. Metadata is complex enough that you can’t capture it in styles. Take italics, for example: what does it mean, citation or emphasis? How does a machine distinguish that? It doesn’t, yet. Useful metadata is an asset that’s far bigger than just what someone wants to look at while doing their work.
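The italics example can be made concrete. This is an editorial sketch with illustrative element names, contrasting style-only markup with semantic markup:

```python
# Sketch of the point above: with style-only markup a machine cannot
# tell *why* text is italic; with semantic markup it can. Element
# names are illustrative.
import xml.etree.ElementTree as ET

styled   = "<p>See <i>Smith v. Jones</i> for an <i>important</i> caveat.</p>"
semantic = "<p>See <citation>Smith v. Jones</citation> for an <emphasis>important</emphasis> caveat.</p>"

# Every <i> looks the same to software...
print([el.tag for el in ET.fromstring(styled)])  # ['i', 'i']
# ...but semantic tags let it pick out just the citations.
cites = [el.text for el in ET.fromstring(semantic).findall("citation")]
print(cites)  # ['Smith v. Jones']
```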

 

There is a large variety of metadata types, and there needs to be some way to get them to work with what the author is doing. Finally, we have attributes, which are difficult to handle. The question “Is this thing a paragraph?” is fine, but: is it approved? Has it been reviewed by technical or legal? All that information has to be there.

 

I want to talk about conversion. We used the term earlier: you take two equal things and interchange them, back and forth, with nothing lost. That doesn’t happen with XML, and not because you can’t write filters. The difference is that when you come into XML, you discover that the information is not in the original document. You have legal text from 1954. You can transform it into XML, but it’s not conversion, because the information (who wrote it, and why?) isn’t there. So there’s no such thing as conversion.

Secondly, there are schemas for instances and schemas for metadata. I’m going to challenge that distinction, but I think you’re right: one person’s data is another person’s metadata. We don’t know what will be important in 2020. It’s important that all the information is expressed. That means we can’t just put a label on it and pass it on; years from now, someone will want to know what’s in the document. XML doesn’t make a distinction between metadata and the document.

 

Slide 5  [Example from the GAO Report]:  This is a fairly flat document. There are only two levels, but lots of stuff. Imagine trying to populate the metadata through styles; it would be very hard. Someone could write it in Word, but it would be very difficult. How do you get Office to do all that?

 

If you have that, then what happens when you have legislation? It’s a tough problem.

 

Slide 6  [Authors Need Models]:  We’re trying to automate with DTDs or schemas that express structure, through good software. As an author, the model can be expressed to me so I can see what belongs in the document. The document has to be parsable; the model might be loose or tight, but the document has to parse. It must be well-formed XML, or maybe valid XML. The structure for a particular design is maintained by that model, so the XML can be templated and can also act as a contract.

 

Slide 7  [Example]:  This is a bill from the U.S. House of Representatives. It has lots of metadata. XMetaL was used to create it. You can speak with Joe Carvel, who did a wonderful job. The idea that you can do this without the help of the author is naive.

 

Slide 8  [Models Enable Software Behaviors]:  Here’s the quid pro quo. The software can look up names and numbers, preconstruct the skeleton document, and pick up bits of the document from elsewhere. All the processes that make production efficient can be built into the software, and none of the styling comes along with it, because it’s valid XML. We can provide those behaviors with script; it’s no longer necessary to write in C, C++, or C# to get it done. You can use the Document Object Model (the W3C recommendation) to present the document to the author: this is how you develop, edit, etc. For example, take a script that walks around in the document and sees whether there’s an abstract or not. The cost of development is plunging because of these efficiencies.
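The "walk the document and check for an abstract" behavior mentioned above can be sketched with a standard-library tree walk. The element names are illustrative:

```python
# Sketch of a script behavior: walk every node of the document and
# report whether an <abstract> element is present. Element names are
# illustrative; a real editor script would use its DOM API.
import xml.etree.ElementTree as ET

DOC = "<article><title>Report</title><section>Body text.</section></article>"

def has_abstract(xml_text):
    # iter() visits every element in document order.
    return any(el.tag == "abstract" for el in ET.fromstring(xml_text).iter())

print(has_abstract(DOC))  # False
```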

 

Slide 9  [Additional Example]:  Here’s an example of a build on XMetaL, what one would look like if you’re a securities expert in New York. The point is, it looks like nothing new. It looks and acts like a Word document. I can change the number of offices in my company, move around in a table with the tab key, write bullets, do spell check, etc. The document is ordinary, and that’s the point. Because it’s XML, I could create behavior on the stock symbol; that’s in script.

 

Dates—everybody’s concerned about date formats. Behind the string there should be a standard version of the date. In this case, we present it with a dialog. That dialog is associated with the content, not embedded in the content, the application, or a binary. It’s publicly available and associated with a particular element in the structure. This text box is an ActiveX control embedded in the document. The information comes in real time from a database, so it’s the up-to-date version.

GIFs are useful, but not all that useful. Had someone approached me a year ago and said, “I understand XML. We need a way to understand the XML in graphics,” I’d have said, “Sorry, it can’t be done.” A year later, it can. Here’s an interactive table, a value in a table responded to by an image. Both the document and the image are text. XMetaL has a scripting layer that looks at them, pulls in SVG files, and presents the result, so now we have a picture responding to text.
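The "standard version behind the string" idea for dates can be sketched simply; the display format here is an illustrative choice:

```python
# Sketch: keep a canonical ISO 8601 date behind the displayed string,
# as the date dialog described above does. The display format is an
# illustrative assumption.
import datetime

iso_value = "2002-07-17"                  # stored, canonical form
d = datetime.date.fromisoformat(iso_value)
display = d.strftime("%B %d, %Y")         # what the author sees
print(display)  # July 17, 2002
```

Because the canonical value is what gets stored, any downstream tool can sort, compare, or reformat dates without parsing locale-specific strings.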

 

The thing everyone understands is that documents can be transformed into other things. Here’s a transformation to a simple-looking document; that stuff is generated as HTML from what I just wrote. Generating print is more complex from a technical standpoint, but from the user’s standpoint it’s about the same. Those transformations can be done. It’s single source: one piece of information, canonical and approved.

 

Ms. Turnbull:  So that would be your tagged PDF?

 

Mr. Turnbull:  We don’t really believe in tagged PDF. We believe in a tagged single source that can be translated into any PDF that’s out there now. This really is XML; I should show you the tags so that you can be sure it is XML. It would be possible, though not necessary, to show the author what elements are available. For example, in a description I can’t add tags. It’s an interactive conversation with the MOE. It’s expressed in the GUI as a list, but it can be expressed in other ways as well.

I can’t add a tag with a new name in this environment. I can’t make a new schema, but I can add in response to something the model comes up with as I go along.  [Mr. Turnbull added several tags to the slide example.]

 

Mr. Dodd:  Is this version of XMetaL out now?

 

Mr. Turnbull:  Yes.

 

Mr. Niemann:  Is this being validated real-time on the underlying schema?

 

Mr. Turnbull:  Yes.

 

Mr. Niemann:  Can you give us an example of something that isn’t allowed?

 

Mr. Turnbull:  Yes…

 

Mr. Niemann:  What does it provide for the user? For example, with XML Spy, if they enter a 3-digit phone number, and the schema only allows two, it tells the person it’s not allowed, highlights it in the schema, and gives the option to modify it on the fly.

 

Mr. Turnbull:  OK, that’s something like this. This is a purchase order. We use it to show that we work with schemas as well as DTDs. Schemas allow regular expressions within a model, so if I have a Canadian postal code, I can’t expect it to be valid according to a U.S. ZIP code pattern.
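The kind of pattern facet a schema attaches to a field can be sketched with ordinary regular expressions (the patterns below are simplified illustrations of the two formats):

```python
# Sketch: the pattern (regular-expression) constraint a schema can put
# on a field. A Canadian postal code fails a U.S. ZIP pattern, so an
# editor can flag it as the author types. Patterns are simplified.
import re

US_ZIP = re.compile(r"^\d{5}(-\d{4})?$")
CA_POSTAL = re.compile(r"^[A-Z]\d[A-Z] \d[A-Z]\d$")

print(bool(US_ZIP.match("20405")))       # True
print(bool(US_ZIP.match("K1A 0B1")))     # False (Canadian code fails ZIP)
print(bool(CA_POSTAL.match("K1A 0B1")))  # True
```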

 

Mr. Niemann:  Can the user fix it?

 

Mr. Turnbull:  I’ve just shown the user. Obviously that’s not adequate. Let’s drag in my box, tell the user he can’t use it, explain it, and give him a radio button or something to fix it. So it really is XML, and it does operate like Word.

 

Unidentified member:  What about import/export capabilities?

 

Mr. Turnbull:  Yes, we have ways of importing Word: cut and paste, file opening, etc. It can support XSL files. We’ve even played with PowerPoint; it works if you’re not too critical. I’ll refer you to a document available on our website, a reviewers’ and evaluators’ guide in PDF. It’s longish and meant for developers. It talks a bit about schemas, multiple outputs from a single source, and integration with document and content management.

 

Mr. Ambur:  Record management.

 

Mr. Turnbull:  This is a list of all the ways we know that people write documents, and what you can do with them in XML. We go from technical editing, like Author/Editor or Epic, to format-based editing.

 

Mr. Ambur:  Can you provide us with the URL?

 

Mr. Turnbull:  It’s on a slide at the end.

 

Slide 10  [Future Directions for XMetaL and WordPerfect]:  So where are we going with XMetaL? (This is our core product for XML authoring; WordPerfect has done XML and SGML for some time.) The difficulty with markup editing is two-fold:

 

Slide 11  [Two Distinct XML Authoring Types]:  There are two kinds of authors: users of the technology, and component authors (experts in a narrow field, for example). Component authors think of authoring as part of a general work flow, so we responded with XMetaL as an ActiveX control that any application can embed.

 

Slide 12  [Next Steps]:  XMetaL is free to evaluate for a month, and various integrations are free.  There’s a Reviewer’s Guide for developers. Scott Edwards is the contact, and he’s in Washington all the time.

 

I’d like to leave you with the thought that Microsoft owns a lot of Corel (a third of Corel’s treasure chest came from Microsoft). Before Corel bought SoftQuad, Microsoft was SoftQuad’s biggest customer. They bought a license a while ago, and it’s used in almost every part of the company. The most interesting part is that the technical documentation of Microsoft products is written in XMetaL. So the answer to the question of what Microsoft is doing is that it’s doing it in XMetaL.

 

Mr. Ambur:  Are there any questions for John?

 

Mr. Williams:  XMetaL provides a schema development tool?

 

Mr. Turnbull:  No, we don’t have one right now. We have a schema parser. The next product, in January, will have the schema editor; until now it has always been the text editor.

 

Mr. Niemann:  What about style sheets?

 

Mr. Turnbull:  Same thing; we’re working on one. We don’t think anyone has a visual paradigm to express styles yet.

 

Mr. Niemann:  Including XML Spy?

 

Mr. Turnbull:  XML Spy is great. When I write a document, I use XMetaL, Spy, etc. People in the future are going to need a different visual tool.

 

Mr. Callahan:  In the editor itself, you have the structure on the left, and editing is done through styling. How is the editing style done?

 

Mr. Turnbull:  Do you mean that…

 

Mr. Callahan:  It’s right there. That’s not XML; it’s a style projection.

 

Mr. Turnbull:  There’s an XML instance, and another file (a CSS file and an XForms file) used to overlay the field. When I save the file as XML, the CSS and the XML stay with it. If you permit it, you can allow the user to write some formats, and you can write your own dialogues. It’s a particular expression of the file. Raw XMetaL will take your instance, ask “Where’s the DTD or schema?”, find it, express a default style, and give you your tags. Then you build behaviors on top of that.

 

Mr. Callahan:  Do I have to have a DTD?

 

Mr. Turnbull:  No. You can start from scratch. Lots of our clients use tools to generate beginner DTDs.

 

Ms. Turnbull:  Is SVG supported?

 

Mr. Turnbull:  Yes. It was approved several months ago. SVG is wonderful.

 

Mr. Hofert:  There’s a project at Apache…

 

Ms. Turnbull:  Throughout Europe there are fascinating projects. It seemed to aid us in our work on section 508.

 

Mr. Turnbull:  I had this discussion with a speech pathologist. Part of speech pathology identifies the development of the speaker, and the interface.

 

Ms. Turnbull:  I’m a speech pathologist too.

 

Mr. Ambur:  Glenn’s typing in Word, and we post comments in HTML.  We’re going to try to use WebDAV to post the minutes and let people edit them.  It will be interesting to see whether the speakers can use their own software to do so.

 

I’m a WordPerfect user.  I like it and I don’t want to change. There’s a migration issue: how do I get to XML?  There’s a discussion at the Fish and Wildlife Service about OpenOffice.org.  Our current text exchange standard is WordPerfect 6.1, but MS Word comes with every PC we buy.  What matters is that the record is in XML, and everyone can interact with it using their own applications. For example, if we want others to be able to edit the minutes, how best do we make that possible?

 

Mr. Turnbull:  Within our evaluator, you’ll find something called “Meeting Minutes” and an interactive resource editor. Secondly, WordPerfect does SGML and XML in a way, and I believe it’s likely that WordPerfect will have, in subsequent versions, a capacity for converting to XML in a server process. It could be a limited capacity.

 

Mr. Hofert:  At Sun, we went to FrameMaker. People were kicking and screaming to go to a WYSIWYG editor. People want to follow their expertise. XML provides the answer because you can come from several formats. For example, the minutes: if you want to provide them in an XML format, that’s fine, because I can take them in and provide them in whatever form you want. The key is to provide it in XML, and then we can do the rest.

 

End of presentation.

 

 

Attendees:

 

Last Name, First Name, Organization

Ambur, Owen, FWS

Billups, Prince, DISA

Bjornsen, Terry, Booz Allen

Boyle, Carrie, LMI

Callahan, Jack, Sphere

Cocos, Dena, LMI

Collins, Jamey, GAO

Cummings, Susan, NARA

Dodd, John, CSC

Edwards, Scott, Corel

Hofert, David, Sun Microsystems

Kantor, Bohdan, Library of Congress

Miller, Mark, Booz Allen

Niemann, Brand, EPA

Thibodeau, Ken, NARA

Todd, Mike, OSD

Turnbull, John, Corel

Turnbull, Susan, GSA

Weber, Lisa, NARA

Williams, Kevin, Blue Oxide