After working on the e-government website and participating in web standards reviews, open-source developer Matthew Cruickshank built an application that converts Microsoft Word files to HTML.
The software was written in response to the publishing needs of people who care about standards, he says.
The software, called Docvert, is a web application that takes word-processor files, such as Microsoft’s .doc, and converts them to OpenDocument Format and HTML. The OpenDocument file can then be converted to HTML or any other XML format. The result is returned in a .zip file, says Cruickshank.
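To make the flow concrete, here is a minimal Python sketch of the OpenDocument-to-HTML step with the result packed into a .zip. It is not Docvert's code, which Cruickshank says is built with PHP, XSLT and XML Pipelines. It relies only on the fact that an OpenDocument file is a zip archive containing a content.xml part. The function names and the index.html output name are assumptions made for illustration, and only headings and paragraphs are handled.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from html import escape

# OpenDocument's text namespace, as defined by the ODF specification.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"
HEADING_TAG = "{" + TEXT_NS + "}h"
PARAGRAPH_TAG = "{" + TEXT_NS + "}p"
LEVEL_ATTR = "{" + TEXT_NS + "}outline-level"


def odf_content_to_html(content_xml: bytes) -> str:
    """Hypothetical helper: map ODF headings and paragraphs to bare HTML."""
    root = ET.fromstring(content_xml)
    body = []
    for element in root.iter():
        if element.tag == HEADING_TAG:
            level = element.get(LEVEL_ATTR, "1")
            # HTML only has h1 to h6, so anything else falls back to h6.
            if level not in {"1", "2", "3", "4", "5", "6"}:
                level = "6"
            text = escape("".join(element.itertext()))
            body.append(f"<h{level}>{text}</h{level}>")
        elif element.tag == PARAGRAPH_TAG:
            text = escape("".join(element.itertext()))
            body.append(f"<p>{text}</p>")
    return "<!DOCTYPE html>\n<html><body>\n" + "\n".join(body) + "\n</body></html>\n"


def convert_odf_to_zip(odf_bytes: bytes) -> bytes:
    """Hypothetical helper: read content.xml from an ODF file, return a zip with index.html."""
    with zipfile.ZipFile(io.BytesIO(odf_bytes)) as source:
        content_xml = source.read("content.xml")
    html = odf_content_to_html(content_xml)
    output = io.BytesIO()
    with zipfile.ZipFile(output, "w", zipfile.ZIP_DEFLATED) as result:
        result.writestr("index.html", html)  # assumed output file name
    return output.getvalue()
```

Calling convert_odf_to_zip with the bytes of an .odt file returns the bytes of a zip archive holding a single index.html. A real converter would also need to handle lists, tables, images and styles, which this sketch ignores.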
The software is built using PHP, XSLT and XML Pipelines, he says.
Docvert is used by the State Services Commission and other government agencies. It lets users inspect and review every part of the conversion, says Cruickshank. “You can control every part of the output and make the conversion fit into any site,” he says.
Because Docvert is a web service, it allows people to build software on top of it, says Cruickshank. “When converting OpenDocument files it does it all internally with code written for Docvert. However, if you wish to convert Microsoft Word documents then you can attach OpenOffice.org or Abiword,” he says.
The software is available under the GNU Lesser General Public Licence. Cruickshank is also working to get Docvert included in the Debian operating system, which would make it easily available to a wider audience, he says.
A recently added feature is the Document Generator, which lets users turn multiple web pages into word-processing documents.