Trying to locate useful information on a file server crammed with binary documents can be a challenge. Microsoft hopes to change that with the XML features built into its pending Office 2003 Professional suite.
With data stored in XML format, says Martin Sawicki, a lead program manager for Word, server-side tools can automatically assemble and generate reports and presentations. He calls it “liberation” of data, saying XML is an ideal format for data mining and repurposing. “It should be easy, it should be automatable,” he says.
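To make the idea concrete, here is a minimal sketch of the kind of server-side report assembly Sawicki describes, written in Python using only its standard library. The element names (`report`, `region`, `sales`) and the sample documents are invented for illustration and are not taken from any Office format.

```python
import xml.etree.ElementTree as ET

# Hypothetical data-only documents, as they might sit on a file server.
# The element names are assumptions made for this example.
DOCUMENTS = [
    "<report><region>North</region><sales>1200</sales></report>",
    "<report><region>South</region><sales>800</sales></report>",
    "<report><region>North</region><sales>300</sales></report>",
]


def total_sales_by_region(xml_docs):
    """Read each XML document and add up the sales figure per region."""
    totals = {}
    for text in xml_docs:
        root = ET.fromstring(text)
        region = root.findtext("region")
        sales = int(root.findtext("sales"))
        totals[region] = totals.get(region, 0) + sales
    return totals


if __name__ == "__main__":
    # Print a one-line summary per region, sorted by name.
    for region, total in sorted(total_sales_by_region(DOCUMENTS).items()):
        print(f"{region}: {total}")
```

Because the figures are stored as tagged data rather than buried in a binary file, the script needs no knowledge of how the documents were laid out on the page.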
Microsoft knows many Office users will never use the XML features, but hopes IT managers will make ready use of the new tools at their disposal.
In Word 2003, documents can be saved as XML data-only files, without any formatting information. When such a file is opened, a default XSL stylesheet (which describes how data sent via XML should be presented) can be nominated to format it, or an XSL template can be chosen. However, while a rudimentary XSL-generation tool will be released, Sawicki says developers will need to use Visual Studio .NET to create advanced features such as document actions.
A new document type called a “smart document” allows developers to create a customised interface for end users. A document actions pane lets a user select an action that might automate form completion, or update local data from a network data source.
A schema library allows new schemas to be applied with a single click. Documents can be validated against a schema in real time, and violations are marked. This allows schema designers to stipulate which data can or must be included within a document, and to define the type of data which is expected. XML expansion packs can specify that schemas or actions should be periodically updated from a central repository. If an updated schema no longer matches XML data within a document, the namespace is renamed so data isn’t lost.
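As a rough illustration of what checking a document against a schema involves, the Python sketch below tests a small XML file against a handful of hand-written rules. It is not a real XML Schema validator and says nothing about how Word implements validation; the rule table, the element names and the `find_violations` function are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a schema: element name -> (required?, expected type).
RULES = {
    "invoiceNumber": (True, int),
    "customer": (True, str),
    "discount": (False, float),
}


def find_violations(xml_text, rules=RULES):
    """Return a list of human-readable rule violations for one document."""
    root = ET.fromstring(xml_text)
    violations = []
    for name, (required, kind) in rules.items():
        node = root.find(name)
        if node is None:
            # Data that must be included but is absent.
            if required:
                violations.append(f"missing required element <{name}>")
            continue
        try:
            # Data that is present must convert to the expected type.
            kind((node.text or "").strip())
        except ValueError:
            violations.append(f"<{name}> is not a valid {kind.__name__}")
    # Data the rules do not allow at all.
    for child in root:
        if child.tag not in rules:
            violations.append(f"unexpected element <{child.tag}>")
    return violations


if __name__ == "__main__":
    sample = "<invoice><invoiceNumber>abc</invoiceNumber><notes>x</notes></invoice>"
    for problem in find_violations(sample):
        print(problem)
```

The three checks correspond to the three things the schema designer controls: which data must be present, which data may be present, and what type each value should have.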
A research library will perform local or remote searches using web services. A number of web services are bundled with Office, Sawicki says, including a thesaurus, dictionary and Microsoft's Encarta encyclopaedia.
Microsoft hopes third-party web services will be created for Office users, and is building a repository at office.microsoft.com/marketplace. Sawicki says Office will check for new services when launched.
Excel documents, for their part, can be saved in one of two XML formats: as an XML spreadsheet or as XML data. XML spreadsheets store links to data, which can be dynamically updated.
New to Office 2003 is InfoPath, a tool for building rich forms such as invoices and purchase orders, Sawicki says. It ships with over 100 predefined forms and will attempt to select a form based on a file’s XML data. Other applications in the suite are FrontPage, Access and Visio.
Office 2003 also allows developers to limit changes made in certain areas of a document to specified users, Sawicki says.
Microsoft's XML moves are not without controversy. Some observers have been sceptical that it will stick to standard, unaltered XML and disclose the underlying format of its XML files. While XML is a W3C standard, the language allows developers to define their own tags, so companies can still generate tags that are proprietary.
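The concern can be shown with a short Python sketch. Both documents below are well-formed XML, but the second uses an invented vendor vocabulary (the `urn:example:vendor` namespace and the `t9` tag are hypothetical, not real Microsoft tags), and a reader that has not been told what those tags mean cannot find the title.

```python
import xml.etree.ElementTree as ET

# Two well-formed XML documents carrying the same information.
OPEN_DOC = "<document><title>Q3 results</title></document>"
# Hypothetical undocumented vendor vocabulary, invented for this example.
VENDOR_DOC = '<v:doc xmlns:v="urn:example:vendor"><v:t9>Q3 results</v:t9></v:doc>'


def read_title(xml_text):
    """A reader that only knows the plain <title> tag."""
    return ET.fromstring(xml_text).findtext("title")


def read_vendor_title(xml_text):
    """A reader that has been told the vendor's namespace and tag name."""
    return ET.fromstring(xml_text).findtext("{urn:example:vendor}t9")


if __name__ == "__main__":
    print(read_title(OPEN_DOC))            # finds the title
    print(read_title(VENDOR_DOC))          # parses, but finds nothing
    print(read_vendor_title(VENDOR_DOC))   # works once the tags are disclosed
```

Both files parse without error, which is why observers focus on whether the tag definitions are published rather than on whether the files are technically XML.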
Office 2003 is expected to ship in late October.