XML and Databases

Ronald Bourret

Consulting, writing, and research in XML and databases

XML Data Binding Resources

Copyright (c) 2001-2011 by Ronald Bourret
Last updated: October 7, 2011

1.0 Introduction

This is a list of resources about XML data binding.

Special thanks go to Sean Sullivan, who provided the initial list of links and who continues to provide valuable input, and to Brendan Macmillan, to whose list of links I shamelessly helped myself and who also helped me understand and categorize tools for working with XML and objects.

1.1 XML Data Binding

XML data binding is the binding of XML documents to objects designed especially for the data in those documents. This allows applications (usually data-centric) to manipulate data that has been serialized as XML in a way that is more natural than using the DOM. For example, consider the following sales order document:

   <SalesOrder SONumber="12345">
      <Customer CustNumber="543">
         <CustName>ABC Industries</CustName>
         <Street>123 Main St.</Street>
         <City>Chicago</City>
         <State>IL</State>
         <PostCode>60609</PostCode>
      </Customer>
      <OrderDate>981215</OrderDate>
      <Item ItemNumber="1">
         <Part PartNumber="123">
            <Description>
               <p><b>Turkey wrench:</b><br />
               Stainless steel, one-piece construction,
               lifetime guarantee.</p>
            </Description>
            <Price>9.95</Price>
         </Part>
         <Quantity>10</Quantity>
      </Item>
      <Item ItemNumber="2">
         <Part PartNumber="456">
            <Description>
               <p><b>Stuffing separator:</b><br />
               Aluminum, one-year guarantee.</p>
            </Description>
            <Price>13.27</Price>
         </Part>
         <Quantity>5</Quantity>
      </Item>
   </SalesOrder>

This could be bound to the SalesOrder, Customer, Item, and Part classes, so that when data is transferred from the XML document, the result is a tree of objects:

                     SalesOrder
                    /    |    \
             Customer   Item   Item
                         |      |
                        Part   Part

It is important to note that these objects are not the DOM. The DOM models the document itself, rather than the data in the document, and would result in the following tree:

                      Element --- Attr
                    (SalesOrder) (SONumber)
               ____/   /   \   \_____
              /       /     \        \
       Element    Element   Element  Element
     (Customer) (OrderDate)  (Item)    (Item)
          |                    |         |
         etc.                 etc.      etc.

Obviously, a sales order application would find it easier to use SalesOrder, Customer, Item, and Part objects rather than Element, Attr, Text, etc. objects. This is what XML data binding is used for. In particular, XML data binding products support some way to map an XML schema (in the form of a DTD or XML Schema document) to an object schema (or vice versa). Based on this mapping, the product can then create objects from XML documents ("unmarshalling") or serialize objects as XML ("marshalling"). For a more complete explanation of XML data binding, see the papers below.
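As a rough illustration of unmarshalling and marshalling, the following Python sketch binds a trimmed-down version of the sales order to hand-written classes. The class names, field names, and the unmarshal and marshal functions are assumptions made for this example; a data binding product would normally generate or configure the equivalent code from a schema and a mapping.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import List

# Hypothetical binding classes; a real product would typically generate these.
@dataclass
class Part:
    part_number: str
    price: float

@dataclass
class Item:
    item_number: int
    part: Part
    quantity: int

@dataclass
class SalesOrder:
    so_number: str
    cust_name: str
    items: List[Item] = field(default_factory=list)

def unmarshal(xml_text: str) -> SalesOrder:
    """Create a tree of schema-specific objects from a sales order document."""
    root = ET.fromstring(xml_text)
    order = SalesOrder(root.get("SONumber"), root.findtext("Customer/CustName"))
    for item_el in root.findall("Item"):
        part_el = item_el.find("Part")
        part = Part(part_el.get("PartNumber"), float(part_el.findtext("Price")))
        order.items.append(
            Item(int(item_el.get("ItemNumber")), part, int(item_el.findtext("Quantity")))
        )
    return order

def marshal(order: SalesOrder) -> str:
    """Serialize the object tree back to XML."""
    root = ET.Element("SalesOrder", SONumber=order.so_number)
    customer = ET.SubElement(root, "Customer")
    ET.SubElement(customer, "CustName").text = order.cust_name
    for item in order.items:
        item_el = ET.SubElement(root, "Item", ItemNumber=str(item.item_number))
        part_el = ET.SubElement(item_el, "Part", PartNumber=item.part.part_number)
        ET.SubElement(part_el, "Price").text = f"{item.part.price:.2f}"
        ET.SubElement(item_el, "Quantity").text = str(item.quantity)
    return ET.tostring(root, encoding="unicode")

DOC = """<SalesOrder SONumber="12345">
  <Customer CustNumber="543"><CustName>ABC Industries</CustName></Customer>
  <Item ItemNumber="1">
    <Part PartNumber="123"><Price>9.95</Price></Part>
    <Quantity>10</Quantity>
  </Item>
</SalesOrder>"""

order = unmarshal(DOC)
# The application now works with order.items[0].part.price, not with Element nodes.
total = sum(item.part.price * item.quantity for item in order.items)
```

Note that the sketch already shows a round-tripping loss: the CustNumber attribute is not mapped to any field, so it does not survive a call to marshal.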

1.2 Limitations of XML Data Binding

XML data binding products have a number of limitations, although most of these are not serious in practice. The first set of limitations are round-tripping limitations. These are limitations on what is preserved if an XML document is round-tripped through an XML data binding product -- that is, its contents are transferred from an XML document to a set of objects and back again.

All XML data binding products can round-trip elements, attributes, and text, as well as the hierarchical relationships among them. However, most XML data binding products cannot preserve anything else, such as comments or entity references. (A notable exception to this is XMLBeans, which is designed to round-trip XML documents.) As a general rule, this is not a serious problem, since applications that use XML data binding tend to be interested only in the data in an XML document, rather than the way in which it is represented.

Round-trip limitations include:

  • Sibling order. While XML data binding products preserve the hierarchical relationships between elements and attributes, most are (probably?) limited in the extent to which they preserve sibling order -- that is, the order in which sibling elements and text occur. For example, products can probably serialize most data-centric XML correctly, but probably cannot handle things like repeating sub-sequences. This is generally a problem only to the extent that an XML data binding product generates documents that are invalid because of sibling order.

  • Physical structure. Most (all?) XML data binding products do not preserve physical constructs such as CDATA sections and entity references. The reason for this is that such constructs do not have corresponding constructs in object languages.

  • Comments and processing instructions. Most XML data binding products do not preserve comments and processing instructions. The reason for this is that these do not map easily to object schemas, since they can occur anywhere in the document.

  • DTD. I do not know if XML data binding products preserve DTDs.

  • XML declaration. I do not know if XML data binding products preserve the XML declaration, including the encoding and standalone declarations.
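The same kind of loss can be seen with a general-purpose XML library. The sketch below uses Python's xml.etree.ElementTree, which is not a data binding product but, with its default parser, behaves similarly for comments and CDATA sections. The sample document is invented for the illustration.

```python
import xml.etree.ElementTree as ET

SOURCE = (
    '<Part PartNumber="123">'
    "<!-- reviewed by purchasing -->"
    "<Description><![CDATA[Size < 10 mm]]></Description>"
    "<Price>9.95</Price>"
    "</Part>"
)

# The default parser discards the comment and unwraps the CDATA section.
root = ET.fromstring(SOURCE)

# Serializing again yields the same data, but not the same physical document.
round_tripped = ET.tostring(root, encoding="unicode")
```

The element, attribute, and text data are intact after the round trip; only the comment and the CDATA markup are gone, which is the trade-off described above.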

The second set of limitations are feature limitations. No XML data binding product can handle all possible situations in which XML data binding is used. For example, most XML data binding products cannot handle wildcards in XML Schemas. Again, this is not a serious limitation for most applications -- either they do not need the feature or there is a workaround.

Feature limitations include:

  • Incomplete schema support. Most XML data binding products support a subset of XML Schemas. In particular, mixed content, wildcards, substitution groups, key/keyref, complex type restriction, and some content models(?) are not commonly supported. A notable exception to this is XMLBeans, which supports the entire XML Schemas recommendation.

  • Transformation limitations. Incoming XML documents commonly have a structure that differs from the structure of the classes used by the application. This is particularly true when XML Schemas are defined externally, such as with industry-standard schemas. As a result, applications commonly use XSLT to transform incoming documents to a format which can be mapped to their classes. Some of the mapping languages used by XML data binding products can perform a limited set of transformations, which may reduce the use of XSLT. JiBX has a particularly good mapping language in this respect.

  • Document fragment limitations. It appears that no XML data binding products can operate on document fragments. That is, they cannot extract data from one or more fragments of an XML document, expose that data using schema-specific objects, and re-write those fragments to the document, leaving the rest of the document unchanged. This is a real problem in workflow scenarios, where a series of applications works on a document, each modifying part of the document and passing it to the next application. As a workaround, applications must extract and re-insert fragments themselves.
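The fragment workaround mentioned in the last item might look something like the following Python sketch, in which the application itself extracts one fragment, exposes it as a schema-specific object, and writes it back. The Customer class and the extract_customer and reinsert_customer helpers are hypothetical names used only for this example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

@dataclass
class Customer:
    """Hypothetical schema-specific object for the Customer fragment only."""
    name: str
    city: str

def extract_customer(root: ET.Element) -> Customer:
    """Bind just the Customer fragment; the rest of the tree is not touched."""
    el = root.find("Customer")
    return Customer(el.findtext("CustName"), el.findtext("City"))

def reinsert_customer(root: ET.Element, customer: Customer) -> None:
    """Write the fragment's data back into the same document tree."""
    el = root.find("Customer")
    el.find("CustName").text = customer.name
    el.find("City").text = customer.city

doc = ET.fromstring(
    '<SalesOrder SONumber="12345">'
    "<Customer><CustName>ABC Industries</CustName><City>Chicago</City></Customer>"
    "<OrderDate>981215</OrderDate>"
    "</SalesOrder>"
)

customer = extract_customer(doc)
customer.city = "Evanston"          # this step of the workflow edits only its own part
reinsert_customer(doc, customer)
result = ET.tostring(doc, encoding="unicode")
```

Everything outside the Customer element, such as OrderDate and the SONumber attribute, passes through unchanged to the next application in the workflow.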

1.3 Categorizing XML Data Binding Products

This paper divides XML data binding products into the following two categories:

  • Design-Time Products. These products require configuration before they can be used. This usually means mapping XML documents to classes with a language or GUI tool provided for that purpose. Most of these products also include a utility for generating classes from an XML schema language, usually DTDs or XML Schemas. A few allow classes to be annotated with mapping information or to derive from an XML mapping class. The advantage of design-time products over run-time products is that they are usually more flexible in the mappings they can support.

  • Run-Time Products. These products do not require any configuration. Instead, they can be used directly in code to serialize and de-serialize objects as XML. In exchange for ease of use, the user generally has no control over how classes are mapped to XML. For example, some products use Java Reflection to discover property names and use these as element type names. Others exploit the ability of languages such as Python to use the fields in a class without first declaring them. Still other products use proprietary XML languages for storing data.
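To make the run-time approach concrete, here is a small Python sketch that discovers an object's fields by introspection and uses the field names as element type names. It is not taken from any of the listed products; the to_element and to_child functions and the sample classes are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

def to_element(obj, tag=None):
    """Serialize an object, using class and attribute names as element type names."""
    el = ET.Element(tag or type(obj).__name__)
    for name, value in vars(obj).items():      # introspection: no mapping is configured
        if isinstance(value, (list, tuple)):
            for member in value:               # repeated members become repeated elements
                el.append(to_child(name, member))
        else:
            el.append(to_child(name, value))
    return el

def to_child(name, value):
    """Nested objects recurse; anything else is written as text."""
    if hasattr(value, "__dict__"):
        return to_element(value, name)
    child = ET.Element(name)
    child.text = str(value)
    return child

class Part:
    def __init__(self, number, price):
        self.Number = number
        self.Price = price

class Item:
    def __init__(self, part, quantity):
        self.Part = part
        self.Quantity = quantity

xml_text = ET.tostring(to_element(Item(Part("123", 9.95), 10)), encoding="unicode")
# xml_text is:
# <Item><Part><Number>123</Number><Price>9.95</Price></Part><Quantity>10</Quantity></Item>
```

The caller writes no mapping at all, but also cannot choose, for example, to serialize Number as an attribute. That is the loss of control described above.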

1.4 Disclaimer

DISCLAIMER: I have undoubtedly missed some products and am not current on others. Because of this, be sure to check your favorite vendor's Web site for the latest information. Also, please note that I have not used any of these products. I have gathered product information from documentation, Web sites, and product reviews and therefore encourage you to use this information as an introduction only.

2.0 Products

2.1 Design-Time Products


Artix Data Services (formerly C24 Integration Objects?)

Developer: Iona
URL: http://www.iona.com/products/artix/data_services/welcome.htm
License: Commercial
Entry last updated: June, 2004

[Ed. -- Iona purchased Century 24 Solutions in 2007. It appears that they repackaged C24 Integration Objects as Artix Data Services. The following is a description of C24 Integration Objects and has not been rewritten for Artix Data Services, other than changing names.]

Artix Data Services is a general data binding product that supports bindings to a variety of data formats, including XML, relational databases, CSV files, and RTF.

Artix Data Services uses four principal concepts for generating code: structure, presentation, transport, and bindings. A structure is a model for a particular set of data. This uses a proprietary format that has very similar modeling capabilities to those found in XML Schemas. A presentation defines an external format for a set of data, such as XML, CSV, or a relational database. A transport is the way in which data is sent to and from a particular presentation, such as a file, JMS, FTP, or JDBC.

A binding binds a structure and input and output presentations. It can be used to generate code and JavaDocs for the bound items. During code generation, the structure defines what classes will be created. The input and output presentations define the source and target of the methods used to load and save data. For example, to duplicate traditional XML data binding, both the input and output presentations would be XML, so the generated classes could be populated from XML documents and serialized as XML documents. Other combinations are possible as well. For example, data could be loaded from a relational database and saved as XML, or loaded from CSV files and saved as fixed-length records. Transports generally have no effect on code generation, but are instead wrapped in a generic Source class. An exception to this is when relational databases are used, where the relational database presentation is bound to a particular database and transport, although even this can be decoupled to a certain extent.
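The separation of a structure from its input and output presentations can be sketched in a few lines of Python. This is only an illustration of the idea and does not use or imitate the product's actual API; the PartRecord class and the load_csv, save_xml, and save_fixed functions are invented for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET
from dataclasses import dataclass

@dataclass
class PartRecord:
    """The 'structure': a model of the data, independent of any external format."""
    part_number: str
    price: float

def load_csv(text):
    """An input 'presentation': populate the structure from CSV."""
    reader = csv.DictReader(io.StringIO(text))
    return [PartRecord(row["part_number"], float(row["price"])) for row in reader]

def save_xml(parts):
    """One output 'presentation': XML."""
    root = ET.Element("Parts")
    for part in parts:
        el = ET.SubElement(root, "Part", PartNumber=part.part_number)
        ET.SubElement(el, "Price").text = f"{part.price:.2f}"
    return ET.tostring(root, encoding="unicode")

def save_fixed(parts):
    """Another output 'presentation': fixed-length records (6 + 8 characters)."""
    return "\n".join(f"{part.part_number:<6}{part.price:>8.2f}" for part in parts)

parts = load_csv("part_number,price\n123,9.95\n456,13.27\n")
as_xml = save_xml(parts)
as_fixed = save_fixed(parts)
```

Because the load and save functions share only the PartRecord structure, any input presentation can be paired with any output presentation, which is the point of binding them separately.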

The generated classes can be used simply as an intermediate point in a data conversion, or directly by an application. In addition, applications can use XPath to query the objects, regardless of the input presentation. This is particularly useful when the input presentation is something other than XML.

Artix Data Services can create a structure from a DTD, an XML Schema, a RELAX-NG schema, or an XML document instance; the structure can then be used to generate code. Users can define restrictions on individual fields, such as to a set of enumerated values or a range of numeric values, as well as on multiple fields. The latter is done with an XPath expression that specifies a condition applying to the restricted fields (for example, if one exists, the other must also exist), a snippet of Java code, or a custom class that implements a restriction interface. Further customization is possible through user-written presentation and transport classes. Other features include a tool for determining the differences between two data models, such as two different versions of the same model.

autoXML

Developer: Jim Kent
URL: http://hgwdev.cse.ucsc.edu/~kent/src/ddjKentXml.zip
License: Free for non-commercial use
Entry last updated: December, 2006

autoXML generates C data structures that can hold the data in an XML document, as well as the functions to load the structures from an XML document and save the data in the structures as an XML document. As input, autoXML takes the name of a DTD file and a prefix for the structure and function names. The DTD can contain FLOAT and INT entities as attribute types; these are mapped to the C double and int types. Similarly, element type definitions can use the REAL and INTEGER entities in place of #PCDATA; these are also mapped to the C double and int types.

autoXML includes a utility, autoDTD, for generating DTDs from XML instance documents. This checks the types of attribute and element values and uses the FLOAT, INT, REAL, and INTEGER entities as appropriate. The output of autoDTD is a DTD and a statistics file, which is used by a companion utility for loading XML data into a database.

autoXML is shipped with three other utilities: autoSQL, which generates code for transferring data between objects and a relational database, and sqlToXml and xmlToSql, which transfer data between an XML document and a relational database.

Breeze XML Studio

Developer: The Breeze Factor
URL: http://www.breezefactor.com/overview.html
License: Commercial
Entry last updated: July, 2001

From the Web page:

"[Creates] Java classes that encapsulate XML parsing and validation and which have methods that map directly to your XML data elements and attributes. ... Breeze gives the programmer access to XML elements and attributes as properties of JavaBeans. Breeze also generates methods to read and write XML objects to and from XML document streams."

Generates Java classes from XML Schemas, DTDs, and relational schemas. Includes a GUI tool for specifying data types and changing names.

C++ XML Objects

Developer: Paul Hamilton
URL: http://sourceforge.net/projects/cppxmlobj/
http://cppxmlobj.sourceforge.net/
License: Open Source
Entry last updated: January, 2009

From the Web page:

"C++ XML Objects is a framework for persisting hierarchies of C++ objects to and from XML. This project allows your classes to derive from a single object (called "xmlobj"), provide a few extra methods which allow the visitor pattern to work on them and register them so that they can be read or written to an XML stream."

See also XmlObjects, XMLObject, and xmlobjects.

Castor

Developer: exolab.org
URL: http://www.castor.org/
License: Open Source
Entry last updated: July, 2001

From the Web page:

"[A]n open source data binding framework for Java[tm]. It's basically the shortest path between Java objects, XML documents, SQL tables and LDAP directories. Castor provides Java to XML binding, Java to SQL/LDAP persistence, and then some more."

Generates Java classes from XML Schemas. Includes a language for mapping Java classes to XML, RDBMSs, and LDAP.

Castor is also a run-time product, as it, "... supports introspection of Beans, and will attempt to match elements and attributes to classes and fields of a class."

Related to Castor is O2XMapper, a GUI-based tool for creating and editing Castor mapping files.

Codalogic LMX

Developer: Codalogic
URL: http://codalogic.com/lmx/
License: Commercial
Entry last updated: August, 2007

Codalogic LMX is a data binding utility that generates C++ code from XML Schemas. The generated classes include code to unmarshall XML to objects (using a lightweight pull parser shipped with the product) and marshall objects to XML. Before marshalling, each class can be checked to see if enough values have been set to produce a valid XML document. In addition, a compiler definition specifies whether the generated code checks input values against the facets in the schema when mutator (set) methods are called.

Codalogic LMX supports most of XML Schemas, and notably includes support for wildcards and mixed content. anyAttribute is supported by methods that get or set the name of each wildcard attribute and its value, which is a string. any (element) is supported by methods that get or set the value of each wildcard element, which is a string that includes the start and end tags for the element.

Codalogic LMX comes with a predefined library of C++ data types to which the XML Schema data types are mapped. The user may override these, as well as defining classes which convert between strings (in XML documents) and C++ data types. The user may also define a library that performs pattern matching for pattern matching facets. By default, no pattern matching is done.

Of note, Codalogic LMX comes with very complete documentation.

CodeSynthesis XSD

Developer: Code Synthesis Tools CC
URL: http://codesynthesis.com/products/xsd/
License: Open Source
Entry last updated: April, 2007

From the company:

"CodeSynthesis XSD is an open-source, cross-platform W3C XML Schema to C++ data binding compiler. Provided with a schema, it generates C++ classes that represent the given vocabulary as well as parsing and serialization code. You can then access the data stored in XML using types and functions that semantically correspond to your application domain rather than dealing with elements, attributes, and text in a direct representation of XML such as DOM or SAX."

"XSD supports both in-memory and stream-oriented processing models by implementing two C++ mappings: C++/Tree and C++/Parser. The C++/Tree mapping represents the information stored in XML instance documents as a tree-like, in-memory data structure. The C++/Parser mapping generates parser templates for data types defined in XML Schema. Using these parser templates you can build your own in-memory representations or perform immediate processing of XML instance documents."

"XSD features C++ standard library-based language mappings, configurable base character type (char/wchar_t), support for all XML Schema built-in types, custom XML Schema to C++ namespace mapping, and platform-independent generated code. XSD also supports some of the more advanced features and extensions of XML Schema, including polymorphism (substitution groups and xsi:type), anonymous types, element and attribute groups, statically-typed ID/IDREF cross-referencing, schema importing/inclusion, mapping of xsd:enumerations to C++ enums, customization of the generated code, and serialization to binary formats."

CodeSynthesis XSD/e

Developer: Code Synthesis Tools CC
URL: http://codesynthesis.com/products/xsde/
License: Open Source
Entry last updated: May, 2007

XSD/e uses an event-driven programming model, unlike the application-driven programming model used by other XML data binding products. In particular, XSD/e generates a set of skeleton C++ classes that contain schema-specific call-back methods. Applications implement their logic by overriding the call-back methods. XSD/e also generates schema-specific validation, dispatch, and data extraction code, which handles conversions between character data (in the XML document) and typed data (used in the call-back methods). The generated code is driven by a SAX parser.

To understand the difference between XSD/e and other XML data binding products, it is easiest to think about the difference between SAX and DOM. With DOM, an application instantiates the DOM objects, then uses them to access the XML data. With SAX, the parser calls methods implemented by the application and passes the data to the application. In other words, with DOM, the application controls the data flow; with SAX, the parser controls the data flow.

Similarly, with other XML data binding products, the application instantiates the schema-specific objects, then uses them to access the XML data. With XSD/e, the parser calls methods implemented by the application and passes the data to the application. In other words, with other XML data binding products, the application controls the data flow; with XSD/e, the parser controls the data flow.

There are two important differences between the methods generated by XSD/e and the methods in SAX. First, the methods generated by XSD/e are specific to a given schema. That is, while SAX has generic methods such as startElement and endElement, XSD/e generates schema-specific methods such as Price (for a Price element) and ShoeSize (for a ShoeSize element). In addition, instead of using a characters method to pass content to the application (as SAX does), the schema-specific methods generated by XSD/e use typed values to pass data. For example, the Price method's argument might be a float and the ShoeSize method's argument might be a user-defined type that is an integer value between 1 and 23 (corresponding to US shoe sizes).

The second important difference between the methods generated by XSD/e and methods in SAX is that the methods generated by XSD/e are organized into classes that mirror the hierarchy of the schema. For example, the Price and ShoeSize methods might be part of a Shoe class, which corresponds to the (complex) Shoe element. This means that, unlike SAX applications, XSD/e applications do not have to track where they are in the document hierarchy; instead, the hierarchy is implicit in the structure of the classes containing call-back methods.
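The two differences can be illustrated with a Python sketch built on the standard xml.sax module. This is not XSD/e code (XSD/e generates C++), and the ShoeSkeleton, MyShoe, and Dispatcher classes are hypothetical stand-ins for the generated skeleton, the application's override, and the generated dispatch code.

```python
import xml.sax

def shoe_size(text):
    """Stand-in for a user-defined type: an integer between 1 and 23."""
    size = int(text)
    if not 1 <= size <= 23:
        raise ValueError(f"invalid shoe size: {size}")
    return size

class ShoeSkeleton:
    """Stand-in for a generated skeleton: one typed call-back per child of Shoe."""
    def Price(self, value: float):
        pass
    def ShoeSize(self, value: int):
        pass

class MyShoe(ShoeSkeleton):
    """The application overrides only the call-backs it cares about."""
    def __init__(self):
        self.seen = []
    def Price(self, value):
        self.seen.append(("Price", value))
    def ShoeSize(self, value):
        self.seen.append(("ShoeSize", value))

class Dispatcher(xml.sax.ContentHandler):
    """Stand-in for generated dispatch code: generic SAX events in, typed calls out."""
    CONVERTERS = {"Price": float, "ShoeSize": shoe_size}

    def __init__(self, shoe):
        super().__init__()
        self.shoe = shoe
        self.buffer = []

    def startElement(self, name, attrs):
        self.buffer = []

    def characters(self, content):
        self.buffer.append(content)

    def endElement(self, name):
        if name in self.CONVERTERS:
            value = self.CONVERTERS[name]("".join(self.buffer))
            getattr(self.shoe, name)(value)   # schema-specific call-back, typed argument

shoe = MyShoe()
xml.sax.parseString(
    b"<Shoe><Price>59.90</Price><ShoeSize>9</ShoeSize></Shoe>", Dispatcher(shoe)
)
```

The application class never sees startElement, endElement, or characters; it receives a float and a validated integer through methods named after the elements.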

In addition to overriding the call-back methods, the application must assemble the classes containing the overridden methods. If the call-back methods for any elements are omitted from this assembly process, the parser ignores that part of the XML document, including validation. This allows the application to work with only those parts of the XML document that it needs.

Because XSD/e uses an event-driven programming model and does not require the data for an entire XML document to be in memory at the same time, XSD/e applications have a smaller footprint than applications built using other XML data binding products. This allows the application to operate in limited-memory environments, such as embedded or mobile systems. Conversely, the classes generated by XSD/e are read-only. That is, if the application wants to manipulate the XML data in memory or write it back to an XML document, it must do this itself, rather than by using code generated by XSD/e.

DbToXml

Developer: SoftRUs
URL: http://www.soft-r-us.com/dbtoxml.asp
License: Commercial
Entry last updated: February, 2003

DbToXml is an XML data binding product that uses a table-based mapping. Unlike most XML data binding products, it generates code from a database schema instead of an XML schema. (Connections to the database use ODBC.) From a single table it can generate a VB COM object or a Java Bean. The table must have a single-column primary key, although the generated code is easily modifiable to support multi-column keys. DbToXml can also generate XML Schemas, BizTalk Schemas, DTDs, sample XML documents, and SQL scripts for inserting, updating, and deleting data.

The generated COM or Java object can populate itself from the database or an XML document and can store its data in the database or an XML document. It can also update or delete data in the database.

DbToXml comes with a GUI-based tool for generating code and configuring the product.

Delphi

Developer: Embarcadero Technologies
URL: http://www.codegear.com/products/delphi/win32
License: Commercial
Entry last updated: July, 2001

The Data Binding Wizard in Delphi allows you to generate classes from DTDs, XML Schemas, sample XML documents, and XML-Data Reduced documents. It also generates a wrapper class that can create objects in these classes from an XML document and writes a graph of these classes to an XML document. (Unlike any other XML data binding products I have seen, the classes generated by Delphi inherit from Delphi's implementation of the Node interface in DOM. While this does give users more flexibility in how to manipulate the data, it seems to confuse two separate means of working with the data in an XML document.)

Users should be careful not to confuse the Data Binding Wizard with another form of "data binding" offered by Delphi -- that of binding the data in an XML document to a client data set, which is effectively a materialized SQL result set.

Dingo

Developer: Peter Lin
URL: http://dingo.sourceforge.net/features.shtml
License: Open Source
Entry last updated: September, 2004

Dingo is an XML data binding utility that generates C# code from XML Schemas. It borrows ideas from both the XSD schema compiler in .NET and JAXB. Like the XSD compiler, it can support System.Xml.Serialization and generate concrete classes from XML Schemas. Like JAXB, it can generate interfaces and implementation classes from XML Schemas. Other features include the ability to have all generated classes implement a particular interface, have all generated classes extend a particular base class, and delegate generation of fields, methods, and/or properties to user-specified code.

generateDS.py

Developer: Dave Kuhlman
URL: http://www.rexx.com/~dkuhlman/generateDS.html
License: Open Source
Entry last updated: May, 2004

From the Web page:

"generateDS.py generates Python data structures (for example, class definitions) from an XML Schema document. These data structures represent the elements in an XML document described by the XML Schema. It also generates parsers that load an XML document into those data structures. In addition, a separate file containing subclasses (stubs) is optionally generated. The user can add methods to the subclasses in order to process the contents of an XML document."

"The generated Python code contains:
o A class definition for each element defined in the XML Schema document.
o A main and driver function that can be used to test the generated code.
o A parser that will read an XML document which satisfies the XML Schema from which the parser was generated. The parser creates and populates a tree structure of instances of the generated Python classes."

"The generated classes contain the following:
o A constructor method (__init__), with member variable initializers.
o Methods with names 'getX' and 'setX' for each member variable 'X' or, if the member variable is defined with maxOccurs="unbounded", methods with names 'getX', 'setX', and 'addX'.
o A "build" method that can be used to populate an instance of the class from a node in a minidom tree.
o An "export" method that will write the instance (and any nested sub-instances) to a file object as an XML document.
o An 'exportLiteral' method that will write out a text (literal) Python data structure that represents the content of the XML document."

gSOAP

Developer: Robert van Engelen
URL: http://gsoap2.sourceforge.net/
License: Open Source
Entry last updated: November, 2008

gSOAP is a tool for generating the code necessary to call or implement SOAP Web Services. Starting from one or more WSDL documents, gSOAP can generate stubs for calling Web Services, skeletons for implementing Web Services, and C structures or C++ classes that represent the data in a SOAP document. The generated code reads and writes SOAP headers, as well as marshalling and unmarshalling data between C/C++ and SOAP documents. It also includes schema-specific, validating XML pull parsers for unmarshalling XML data into C/C++ data. In addition, the generated code can perform validation to ensure that only valid data is marshalled to XML.

gSOAP can also start from a C/C++ header file that describes a set of functions and generate stubs, skeletons, XML Schema documents, and WSDL documents that can be used to expose those functions as a SOAP Web Service. gSOAP can handle most C/C++ language features, including structures and classes, single inheritance, pointer-based structures (such as lists, trees, and acyclic and cyclic graphs), dynamic arrays, enumerations, unions, containers, and special data types such as struct tm, which are handled with user-defined serializers.

The code that binds C structures or C++ classes to XML can be used without the part of the code that works with SOAP. This allows gSOAP to perform traditional XML data binding. C/C++ data binding code can be generated from WSDL documents or XML Schema documents.

Of note, gSOAP has many options for customizing the way code is generated, customizing the mappings between C/C++ and XML, and handling various SOAP options. In addition, the documentation is very complete and well written.

HappyMapper

Developer: John Nunemaker
URL: http://happymapper.rubyforge.org/
http://railstips.org/2008/11/17/happymapper-making-xml-fun-again
License: Open Source
Entry last updated: January, 2009

An XML data binding library for Ruby. Not reviewed.

Hyperjaxb

Developer: Aleksei Valikov
URL: https://hyperjaxb2.dev.java.net/
License: Open Source
Entry last updated: May, 2006

Hyperjaxb uses Hibernate to provide a persistence layer for the objects generated by JAXB. In particular, it adds the ability to generate Hibernate mappings from XML Schemas.

Jakarta Digester

Developer: Apache
URL: http://commons.apache.org/digester/
License: Open Source
Entry last updated: May, 2004

Jakarta Digester is a Java-based XML document processor. It is also an XML data binding tool -- sort of. It differs from other data binding products in two important ways. First, it can create objects from XML, but cannot serialize objects as XML. Second, it does not generate code (like other design-time products) or use functionality like Reflection to automatically map XML to/from classes (like run-time products). It is listed as a design-time product because of the amount of code that must be written to use it.

Applications use the Digester by specifying a set of patterns and rules. Patterns identify different parts of an XML document, such as particular elements and attributes. They are essentially XPath expressions without predicates but including wildcards. Rules specify what actions to take when a particular pattern is encountered, such as creating an object or setting a property. For example, the pattern Employees/Employee would specify a second-level Employee element; a rule for this element might specify that an Emp object be created. The pattern */Name would specify a Name element occurring anywhere in a document; a rule for this element might specify that the Name property of the object associated with the parent element be set. Jakarta Digester comes with a number of predefined rules, such as creating objects, setting properties, and calling methods. Users can also write their own classes implementing the Rule interface.
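The pattern-and-rule idea can be sketched in Python as follows. This does not use the Digester API, which is a Java library; the matches and digest functions, the two rule functions, and the simplified pattern syntax are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

class Emp:
    def __init__(self):
        self.name = None

def matches(pattern, path):
    """Match 'A/B' against the full path, or '*/B' against any path ending in B."""
    parts = pattern.split("/")
    if parts[0] == "*":
        tail = parts[1:]
        return path[-len(tail):] == tail
    return path == parts

def create_emp(element, stack, created):
    """Rule: create an object and push it on the stack. Returns True if it pushed."""
    emp = Emp()
    stack.append(emp)
    created.append(emp)
    return True

def set_name(element, stack, created):
    """Rule: set a property on the object associated with the parent element."""
    if stack:
        stack[-1].name = element.text
    return False

def digest(xml_text, rules):
    """Walk the document and fire every rule whose pattern matches the current path."""
    stack, created = [], []

    def walk(element, path):
        pushed = 0
        for pattern, rule in rules:
            if matches(pattern, path) and rule(element, stack, created):
                pushed += 1
        for child in element:
            walk(child, path + [child.tag])
        for _ in range(pushed):        # objects go out of scope with their element
            stack.pop()

    root = ET.fromstring(xml_text)
    walk(root, [root.tag])
    return created

rules = [("Employees/Employee", create_emp), ("*/Name", set_name)]
emps = digest(
    "<Employees>"
    "<Employee><Name>Ann</Name></Employee>"
    "<Employee><Name>Bob</Name></Employee>"
    "</Employees>",
    rules,
)
names = [emp.name for emp in emps]
```

Because rules are ordinary functions, the same mechanism could just as easily call arbitrary application code instead of building objects.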

As can be seen, the Digester can be configured to perform object creation in the same manner as other XML data binding tools. However, it can be used for other things as well. For example, it would be possible to write a SAX-like dispatcher that performed certain operations in response to events derived from an XML document. As a general rule, the Digester appears to be a hard way to perform normal XML data binding (where the XML document matches the objects to be created), but a relatively easy way to perform more complex operations, such as dispatching or data binding where the XML document does not match the objects to be created.

Javolution

Developer: Jean-Marie Dautelle
URL: http://javolution.org
License: Open Source
Entry last updated: June, 2005

Javolution is a "real-time framework" that is designed to be used when writing real-time Java applications. Java is generally a poor choice for writing real-time applications because applications have little or no control over when objects are created and recycled, both of which are expensive operations. Javolution solves this problem by allowing objects to be created in different "object spaces", each of which creates and recycles objects at different times. Applications decide which space is used for a given object, thus giving them more control over the overall speed at which different parts of them run.

Javolution includes classes for serializing objects as XML and reconstructing objects from XML. The mapping between XML and the object is specified by an instance of the XmlFormat class internal to the mapped class. It appears that this instance must be written by hand, rather than generated from an object or XML schema. Two side effects of doing mapping this way are that the mapping is inherited by all classes that extend the mapped class, and that the mapping can be changed at run time.
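
The two side effects are easier to see in a small sketch. The following Python code is not Javolution (which is a Java library); it only imitates the idea of a hand-written format object stored inside the mapped class. The names PointFormat, Point, Pixel, xml_format, to_xml, and from_xml are all hypothetical.

```python
import xml.etree.ElementTree as ET


class PointFormat:
    """Hand-written mapping between a Point-like object and an XML element."""

    def write(self, obj):
        # The element is named after the object's class; fields become attributes.
        return ET.Element(type(obj).__name__, x=str(obj.x), y=str(obj.y))

    def read(self, cls, elem):
        return cls(int(elem.get("x")), int(elem.get("y")))


class Point:
    # The mapping lives inside the mapped class as a class-level attribute.
    xml_format = PointFormat()

    def __init__(self, x, y):
        self.x, self.y = x, y

    def to_xml(self):
        return ET.tostring(self.xml_format.write(self), encoding="unicode")

    @classmethod
    def from_xml(cls, text):
        return cls.xml_format.read(cls, ET.fromstring(text))


class Pixel(Point):
    # No mapping of its own: Pixel inherits xml_format from Point.
    pass
```

Because xml_format is an ordinary class attribute, Pixel picks it up through inheritance, and assigning a different format object to Point.xml_format at run time changes the mapping for both classes.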

JAXB (Java Architecture for XML Binding, aka Adelard, aka JSR-31)

Developer: Sun Microsystems
URL: https://jaxb.dev.java.net/, http://java.sun.com/developer/technicalArticles/WebServices/jaxb/
License: Open Source
Entry last updated: April, 2003

JAXB is both a specification and an implementation of that specification from Sun Microsystems. This entry primarily discusses the implementation.

The class generator in JAXB generates both interfaces and classes from XML Schemas. The classes implement the interfaces. (Both classes and interfaces are generated for portability across different implementations of the JAXB specification. While the interfaces must be the same across all implementations of JAXB for a given XML Schema, the classes may be different.)

Users can control how classes are generated by annotations in the XML Schema file or (in the future) in a separate binding declaration document. In particular, users can control the names of generated classes and properties and whether/how validation is to be performed. Without any annotations, a default generation scheme is used.

Objects of the generated classes can be instantiated from XML documents in the form of an input source, URL, DOM tree, or SAX events. During this process, the incoming data can optionally be validated. Applications can also create empty objects without populating them from any XML document.

Objects of the generated classes can be validated at any time; this allows applications to validate their contents before serializing them as XML. Objects can be serialized to an output stream or DOM tree, or as a set of SAX events.

[Note: This entry has not been updated for the JAXB 2.0 specification, which became final in May, 2006.]

JaxMe

Developer: Jochen Wiedmann / Apache Software Foundation
URL: http://ws.apache.org/jaxme/
License: Open Source
Entry last updated: March, 2004

From the Web page:

"JaxMe 2 is an open source implementation of JAXB, the specification for Java/XML binding."

"A Java/XML binding compiler takes as input a schema description (in most cases an XML schema, but it may be a DTD, a RelaxNG schema, a Java class inspected via reflection, or a database schema). The output is a set of Java classes:
o A Java bean class matching the schema description. (If the schema was obtained via Java reflection, the original Java bean class.)
o Read a conforming XML document and convert it into the equivalent Java bean.
o Vice versa, marshal the Java bean back into the original XML document."

"In the case of JaxMe, the generated classes may also:
o Store the Java bean into a database. Preferably an XML database like eXist, Xindice, or Tamino, but it may also be a relational database like MySQL. (If the schema is sufficiently simple. :-)
o Query the database for bean instances.
o Implement an EJB entity or session bean with the same abilities."

JBind

Developer: Stefan Wachter
URL: http://jbind.sourceforge.net/
License: Open Source
Entry last updated: May, 2003

JBind is an XML data binding product that consists of a compiler for generating Java code from XML Schemas and a run-time environment. The schema compiler generates four different sets of code:

o "Data interfaces" are generated for each complex type and have set/get methods for each attribute and child element in the type.

o "Behavior interfaces" inherit from data interfaces and contain declarations of user-defined methods (if any).

o "Behavior classes" implement the user-defined methods in behavior interfaces. They are always abstract because they do not implement the methods in the data interfaces (from which the behavior interfaces inherit).

o "Data classes" inherit from behavior classes and implement the methods in the data interface.

When an XML Schema is recompiled -- presumably after changes -- any existing behavior interfaces and classes are not overwritten. That is, only the data interfaces and classes are regenerated. In this manner, existing "behavior" methods are preserved, but new code is generated to reflect changes to the "data", as represented by the XML Schema document.
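
The four layers and their inheritance chain can be sketched as follows. This is an illustration in Python rather than generated JBind code; the class names and the total method are hypothetical, and the comments mark which layers a schema compiler of this kind would regenerate.

```python
from abc import ABC, abstractmethod


# Regenerated on every schema compile.
class ItemData(ABC):
    """'Data interface': accessors for the attributes and child elements of a type."""

    @abstractmethod
    def get_price(self): ...

    @abstractmethod
    def get_quantity(self): ...


# Written once and preserved across recompiles.
class ItemBehavior(ItemData):
    """'Behavior interface': inherits the data interface, declares user-defined methods."""

    @abstractmethod
    def total(self): ...


# Written once and preserved across recompiles.
class ItemBehaviorImpl(ItemBehavior):
    """'Behavior class': implements user-defined methods, still abstract."""

    def total(self):
        return self.get_price() * self.get_quantity()


# Regenerated on every schema compile.
class ItemDataImpl(ItemBehaviorImpl):
    """'Data class': inherits the behavior class and implements the data accessors."""

    def __init__(self, price, quantity):
        self._price, self._quantity = price, quantity

    def get_price(self):
        return self._price

    def get_quantity(self):
        return self._quantity
```

Only ItemDataImpl can be instantiated. If the schema gains a new child element, the first and last classes change, while the hand-written total method in the middle layers is left alone.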

The generated code can check both local constraints (such as data types) and global constraints (such as key/keyref pairs). It "uses a DOM tree to back the generated classes", which presumably means that the generated classes can either directly or indirectly access the XML document from which they were generated.

How code is generated can be controlled either by the use of special attributes in the XML Schema document, or by the placement of these attributes in a separate document, known as a "schema adjunct document". Things that can be controlled include the packages in which the generated code is placed, additional JavaDoc documentation, how method names are constructed, and whether behavior methods are generated.

Of interest, JBind supports the entire XML Schemas recommendation. This means that, unlike virtually all other XML data binding products, it supports things like mixed content and wild cards. In addition, it supports XPath-based constraints and XPath-based accessors -- that is, methods that constrain or access data according to a particular XPath.

JBind also supports something it calls "XML code". There are two types of XML code: configuration code and application code. Configuration code consists of methods that are called as objects are created and destroyed. Application code consists of a single method (execute) which can be called to perform a certain action. Presumably, this is equivalent to the main method in a class.

JiBX

Developer: Dennis Sosnoski
URL: http://jibx.sourceforge.net/
License: Open Source
Entry last updated: April, 2003

JiBX is an XML data binding product that differs from most other products in four ways:

o Instead of generating code from an XML schema, it binds existing Java classes to XML documents. This is done through an XML-based binding language.

o Instead of using generic, Reflection-based marshalling and unmarshalling engines, it modifies the byte code of existing Java .class files, adding class-specific methods for marshalling and unmarshalling data.

o It does not require an exact match between Java class structure and XML document structure. Although it does not appear that arbitrary transformations are possible, some flexibility is allowed in the mapping.

o It uses a parser that implements the XmlPull API instead of SAX.

LDX+ XML Generator

Developer: Lolke Dijkstra
URL: http://www.xml2java.net/
License: Commercial
Entry last updated: December, 2012

LDX+ XML Generator is an XML data binding framework designed to handle very large XML documents. These are documents that are so large their contents cannot all be held in memory at the same time.

Most XML data binding products parse an XML document completely and then pass a graph of objects to the application. LDX+ XML Generator passes objects to the application during the parsing process. In particular, it calls the application at the start and end of each complex element and passes an object containing the data for that element. This allows the application to process and discard objects immediately, which frees memory for other objects.
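
The general technique of handing objects to the application during parsing can be sketched with Python's standard library. This is not LDX+ code: the stream_items function and its handle callback are hypothetical, plain dictionaries stand in for generated classes, and the callback fires only at the end of each element rather than at both start and end.

```python
import xml.etree.ElementTree as ET


def stream_items(source, tag, handle):
    """Call handle(data) once per <tag> element while the document is still being parsed."""
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            # Build a small object (here a dict) from the element's simple children.
            handle({child.tag: child.text for child in elem})
            # Drop the element's children and text so that memory can be reclaimed.
            elem.clear()
```

The source argument can be a file name or a binary file object. The application sees each Item as soon as its end tag is read and never holds the whole document as a graph of objects.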

The application can also minimize memory usage with a run-time configuration file. This specifies which objects should be created, allowing the application to process only the data it needs. It also specifies whether child objects should be "detached" from their parent object -- that is, whether the child objects should be stored separately from their parent object, allowing them to be discarded before the parent object.

LDX+ XML Generator supports two types of validation. It can validate the entire document before processing or it can allow the application to decide how to handle invalid sections of the document. For example, the application might ignore invalid sections and process valid sections.

LDX+ XML Generator uses XML Schema documents to generate classes for each complex element at design time.

Unlike most other XML data binding frameworks, LDX+ XML Generator cannot serialize object data as XML.

HydraExpress (formerly LEIF)

Developer: Rogue Wave
URL: http://www.roguewave.com/products/hydra/hydraexpress.php
License: Commercial
Entry last updated: April, 2005

[Ed. -- HydraExpress is the "next generation of Rogue Wave LEIF". The following description is for LEIF and has not been updated for HydraExpress.]

LEIF is a GUI-based tool for creating C++ Web Services. In addition to being able to create Web services and Web services clients, it contains an XML data binding tool (XML Object Link) that may be used while creating a Web service or as a standalone tool.

XML Object Link (apparently) generates a schema-specific validating parser from an XML Schema document. This can be done automatically or based on a custom mapping. The marshaling and unmarshaling behavior can be customized as well.

Liquid XML Studio

Developer: Liquid Technologies
URL: http://www.liquid-technologies.com/xml-data-binding.aspx
License: Commercial
Entry last updated: March, 2011

The Developer Edition of Liquid XML Studio includes an XML data binding utility, the Liquid XML Data Binder. The Liquid XML Data Binder can generate code from XML Schemas, DTDs, and XML-Data Reduced schemas. Code can be generated in C++, Java, C#, VB.Net, and Visual Basic 6 (COM objects).

The Liquid XML Data Binder allows users to choose which elements will be used to generate classes. Users can override generated names, as well as the default value of properties. The generated code has methods for populating objects from XML and serializing objects as XML. (Supported XML formats are XML documents, DOM objects, and Fast Infoset documents.) The reading and writing methods both validate XML documents. In addition, applications can control how to handle validation, such as whether to ignore unknown or missing elements. Users can modify the generated code using "hand-coded blocks" that are not overwritten when code is regenerated, such as when the XML schema changes.

XML Schema support includes the usual support for complex types, child elements, and so on. It also includes support for complex type extension, complex type restriction, substitution groups, and wildcards.

The Liquid XML Data Binder can be run from the Liquid XML Studio or the command line. In addition to generating code, it can also create HTML documentation for the generated classes.

mel

Developer: Alan Linton
URL: http://xmel.sourceforge.net/
License: Open Source
Entry last updated: July, 2003

mel is an XML data binding tool for C. It takes a DTD and creates corresponding .c and .h files. These contain structures that can hold the data from the XML document, as well as functions to marshall and unmarshall data, free the structures, and resolve ID references -- that is, return the structure to which the value of an IDREF attribute points. These functions call the mel library, which must be linked to the user's application.

mel also has partial support for XML Schemas.

O2XMapper

Developer: Shelly Mujtaba
URL: http://o2xmapper.sourceforge.net/
License: Open Source
Entry last updated: October, 2002

O2XMapper is not actually an XML data binding product. Instead, it is a GUI-based tool for creating and editing Castor mapping files. As input, O2XMapper takes either the classpath for the Java classes to be mapped or a Castor mapping file. As output, O2XMapper generates a Castor mapping file. When the input is Java classes, the tool automatically generates most of the mapping information; this can then be edited. The tool also performs various checks (such as type checks) to ensure that the specified mapping is valid.

Oracle XML Class Generator for Java (AKA Oracle JAXB Class Generator)

Developer: Oracle
URL: http://www.oracle.com/technology/tech/xml/xdkhome.html
License: Commercial (free with registration)
Entry last updated: December, 2006

The Oracle XML Class Generator for Java is part of the Oracle XML Developer's Kit. It generates a set of Java classes from a DTD or XML Schema. The generated classes support JAXB and can serialize themselves as XML or be loaded from an XML document.

OSS XSD Tools for C/C++ and Java

Developer: OSS Nokalva, Inc.
URL: http://www.oss.com/xml/products/xml.html, http://www.oss.com/xml/products/xmljava.html
License: Commercial
Entry last updated: March, 2007

From the company:

"The OSS XSD Tools are a family of XML Schema binding products for C/C++ and Java. Both products consist of a schema compiler and runtime libraries. The products use a built-in XML parser and serializer. In addition to XML, serialization and parsing of standardized binary formats are supported, making the tools suitable for use in bandwidth- and resource-constrained environments, such as embedded systems, as well as in environments where high performance is a requirement."

"The compiler generates target language data types for application use (C/C++ type definitions or Java classes), and supplementary control-table information used by the runtime libraries. Generated types represent data in native form, for example, integers for xsd:integer, binary arrays for xsd:base64Binary, native enumerations for XSD enumeration facet etc. Mixed content model, element and attribute wildcards, list and union types, and element substitution groups are supported. Information about the order of child elements within xsd:all groups is preserved."

"The runtime libraries implement serialization, parsing, and validation of XML and binary documents. The XML serializer/parser is not dependent on any third party open source or commercial component. The following formats are supported:"

"C/C++: XML, ASN.1 Binary encodings, and Fast Infoset
Java: XML, ASN.1 Binary encodings"

"In addition, the OSS C/C++ Tools provide a streaming C/C++ SAX API for parsing XML and binary documents. Both the C/C++ and Java products can also be used for the development and deployment of XML Web Services and Fast Web Services using XML and Fast SOAP."

Quick

Developer: JXML
URL: http://jxquick.sourceforge.net/quick3/
License: Open Source
Entry last updated: March, 2002

From the Web page:

"Quick is a tool for generating and processing XML. Quick converts arbitrary object structures into trees of XML elements. Converts Cross-linked XML documents into structures of objects."

"Quick is a data modeling system for transforming XML into Java objects and Java objects into XML. Quick builds on QJML, a binding schema that connects XML elements to Java classes. Quick fully supports Java inheritance, including abstract and interface elements. The developer is given fine-grained control over code generation, so the generated code can extend and interoperate with pre-existing classes."

"Quick works with Java Beans and Bean Property Editors. Developer-provided Bean Property Editors allow the use of custom data types (Java classes) when processing XML attributes and simple elements with text content. Quick provides a thread-safe framework (the ocm package) for simple and complex data transformations."

"Quick provides utilities for transforming DTDs into QJML, QJML into marshaling logic, QJML into documentation (HTML), and QJML into data classes that are based on its MVC framework."

Ruby Objects to XML Mapping Library (ROXML)

Developer: Anders Engstrom
URL: http://roxml.rubyforge.org/
License: Open Source
Entry last updated: January, 2009

From the Web site:

"ROXML is a Ruby library designed to make it easier for Ruby developers to work with XML. Using simple annotations, it enables Ruby classes to be custom-mapped to XML. ROXML ... provides the following capabilities:

  • Read Ruby objects from XML (unmarshal)
  • Write Ruby objects to XML (marshal)
  • Smart defaults for XML mapping
  • Annotation-style methods (also known as macros) for XML mapping
  • One-to-one (composition) Ruby to XML
  • One-to-many (aggregation) Ruby with array to XML
  • UTF-8 support for multi-lingual documents
  • Handling text elements with attributes
  • Support for mapped Ruby objects in modules"

Sample Code Generator

Developer: Microsoft
URL: http://www.microsoft.com/downloads/details.aspx?FamilyID=89e6b1e5-f66c-4a4d-933b-46222bb01eb0&DisplayLang=en
License: Commercial
Entry last updated: November, 2008

From the Web page:

"The Sample Code Generator (XSDObjectGen) tool takes an XSD schema as input and generates sample code showing how to mark up C# and VB.Net classes so that when serialized with the XML serializer, the resulting XML will be valid according to the original schema. This update fixes some documentation changes and corrects a problem where the wizard did not generate code in some environments."

Schema2Java Compiler

Developer: Creative Science Systems, Inc.
URL: http://www1.creativescience.com/Products/schema2java.shtml
License: Commercial
Entry last updated: May, 2002

Generates a set of Java classes based on an XML Schema. (This can be done through a command line or a GUI.) The generated classes run within a framework that can create objects in these classes from an XML document and serialize these objects as an XML document. Of interest, the generated classes include code to ensure that all data values are valid with respect to the XML Schema.

Skyron

Developer: John Wilson
URL: http://www.wilson.co.uk/skyron/skyron.html
License: Open Source
Entry last updated: May, 2004

Skyron is a Python-based XML document processor. Like Jakarta Digester, it is also an XML data binding tool -- sort of. It differs from other data binding products in two important ways. First, it can create objects from XML, but cannot serialize objects as XML. Second, it does not generate code (like other design-time products) or use functionality like Reflection to automatically map XML to/from classes (like run-time products). It is listed as a design-time product because of the amount of code that must be written to use it.

Applications use Skyron by specifying a "recipe". A recipe identifies individual elements and attributes in a document and specifies what to do when they start or end. For example, a recipe might instruct Skyron to construct Python objects from the data in a document -- in other words, perform traditional XML data binding. It might also do something that has nothing to do with data binding, such as calling Python functions to store data in a database. In particular, Skyron supports the following operations: constructing an object, calling a function, executing inline Python code, and storing and retrieving values from variables.

As a general rule, Skyron appears to be a hard way to perform normal XML data binding (where the XML document matches the objects to be created), but a relatively easy way to perform more complex operations, such as dispatching or data binding where the XML document does not match the objects to be created.

Versant Object Database

Developer: Versant Corp.
URL: http://www.versant.com/en_US/products/objectdatabase
License: Commercial
Entry last updated: December, 2008

From the documentation:

"The Versant XML Toolkit (VXML) adds XML/object mapping support to the Versant Object Database product. Via command line tools or Java APIs users can generate XML from defined object graphs and likewise generate objects from XML."

XBinder

Developer: Objective Systems, Inc.
URL: http://www.obj-sys.com/xbinder.shtml
License: Commercial
Entry last updated: July, 2007

From the company:

"XBinder is an XML Schema to C/C++ Data Binding Tool. XML data binding is a process in which XML schema information items are transformed into type definitions and functions in a computer language."

"The source code produced by the XBinder compiler is C or C++ code that consists of type definitions and encode/decode functions. This provides a complete Application Programming Interface (API) for working with all of the message definitions contained within an XML schema specification."

"In addition to the compiler, a run-time library of common encode/decode functions is also part of the package. This library contains routines to encode and decode the base XML schema simple types (integer, string, hexBinary, etc.). The XBinder compiler assembles a series of calls to these functions to accomplish the encoding or decoding of more complex message types."

XML2Java

Developer: Patrick Ohl
URL: http://www.jNerd.de/xml2java.html
License: Open Source
Entry last updated: November, 2001

Generates Java classes from a DTD, as well as Reader and Writer classes to transfer data between those classes and XML documents.

XMLBeans

Developer: Apache (donated by BEA)
URL: http://xmlbeans.apache.org/
License: Open Source
Entry last updated: May, 2004

XMLBeans is an XML data binding tool that provides complete(!) support for XML Schemas and XML documents. That is, all XML Schema constructs are supported, including wild cards, substitution groups, and complex type restriction, and the entire InfoSet is supported, including document order, mixed content, white space, comments, and processing instructions.

XMLBeans provides a schema compiler that creates a set of interfaces from an XML Schema. One interface is generated for each complex type, with accessor and mutator methods for attributes and child elements. Included in each interface is a static factory class for creating objects that implement the interface. These objects may be empty (representing an empty document) or may be populated from the contents of an XML document. The interfaces also include methods for validating the current state of the subtree represented by the object against the complex type associated with the object, and for serializing its contents as XML.

XMLBeans provides two other capabilities not found in most other XML data binding products: cursors and schema objects. Cursors provide several different ways to navigate through the objects that represent an XML document. The first way is DOM-like, allowing applications to move the current position of the cursor through the document, such as to the first attribute, the first child, the next sibling, or simply the next "token", where a token is the start or end of an element, an attribute, a namespace attribute, child text, a comment, or a processing instruction. The second way is to execute a query and to move from one query result to the next. (Apache's version of XMLBeans supports XPath; BEA's version of XMLBeans supports XQuery.) The third way is to either bookmark locations in the document or to push locations onto a location stack. The cursor can then return to a bookmarked location or pop the previous location off the stack. Cursors also allow applications to modify documents in DOM-like fashion, such as by inserting attributes, deleting elements, or moving the contents of an element to a new location.
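
The token view and the location stack can be illustrated with a small sketch. The TokenCursor class below is hypothetical Python, not the XMLBeans cursor interface; it flattens a document into start, attribute, text, and end tokens and ignores namespaces, comments, and processing instructions.

```python
import xml.etree.ElementTree as ET


class TokenCursor:
    """Walks a document as a flat sequence of tokens and remembers saved locations."""

    def __init__(self, xml_text):
        self.tokens = []
        self._flatten(ET.fromstring(xml_text))
        self.pos = 0
        self._stack = []

    def _flatten(self, elem):
        self.tokens.append(("START", elem.tag))
        for name, value in elem.attrib.items():
            self.tokens.append(("ATTR", name, value))
        if elem.text and elem.text.strip():
            self.tokens.append(("TEXT", elem.text.strip()))
        for child in elem:
            self._flatten(child)
            if child.tail and child.tail.strip():
                self.tokens.append(("TEXT", child.tail.strip()))
        self.tokens.append(("END", elem.tag))

    def current(self):
        return self.tokens[self.pos]

    def to_next_token(self):
        """Advance by one token; return False at the end of the document."""
        if self.pos + 1 >= len(self.tokens):
            return False
        self.pos += 1
        return True

    def push(self):
        """Save the current location on the location stack."""
        self._stack.append(self.pos)

    def pop(self):
        """Return to the most recently saved location, if any."""
        if not self._stack:
            return False
        self.pos = self._stack.pop()
        return True
```

A typical use is to push the current location, scan ahead token by token to inspect some content, and then pop to resume from where the scan began.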

Schema objects are an object model for XML Schemas -- imagine that a set of classes was generated from the XML Schema document that defines the XML Schema language -- and allow applications to explore the schema associated with a document. A schema object can be retrieved from any object. The schema object contains information about the complex type associated with the object, such as its content model, whether it extends another type, and whether it has mixed content.

XMLFoundation

Developer: Brian Aberle
URL: http://www.codeproject.com/KB/XML/XMLFoundation.aspx
License: Open Source
Entry last updated: March, 2010

XMLFoundation is an XML data binding framework that can be used with C/C++ and Java objects, as well as COM, DCOM, and CORBA objects. It includes its own non-validating parser.

Objects that are bound to XML must derive from the XMLObject class and implement the MapXMLTagsToMembers method. This method maps XML attributes and elements to member variables, primarily by calling the MapAttribute and MapMember functions. MapAttribute is used to map XML attributes to member variables. MapMember is used to map XML elements to member variables and to link objects to sub-objects. The latter functionality is necessary when an XML element contains a child element that is mapped to a sub-object, rather than a member variable.
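
The shape of this mapping style can be sketched in Python. The code below is not XMLFoundation (a C++ framework) and does not reproduce its signatures; XMLObjectSketch, map_attribute, map_member, and from_xml are hypothetical stand-ins that show how a class declares its own mapping and how sub-objects are linked.

```python
import xml.etree.ElementTree as ET


class XMLObjectSketch:
    """Hypothetical base class; each subclass declares its own mapping."""

    def __init__(self):
        self._attributes = {}   # XML attribute name -> member name
        self._members = {}      # XML element name -> (member name, sub-object class or None)
        self.map_xml_tags_to_members()

    def map_xml_tags_to_members(self):
        raise NotImplementedError

    def map_attribute(self, member, xml_name):
        self._attributes[xml_name] = member

    def map_member(self, member, xml_name, cls=None):
        self._members[xml_name] = (member, cls)

    def from_xml(self, text):
        self._load(ET.fromstring(text))
        return self

    def _load(self, elem):
        for xml_name, member in self._attributes.items():
            if xml_name in elem.attrib:
                setattr(self, member, elem.attrib[xml_name])
        for child in elem:
            if child.tag not in self._members:
                continue   # unmapped elements are simply skipped in this sketch
            member, cls = self._members[child.tag]
            if cls is None:
                setattr(self, member, child.text)   # element -> member variable
            else:
                sub = cls()                          # element -> sub-object
                sub._load(child)
                setattr(self, member, sub)


class Customer(XMLObjectSketch):
    def map_xml_tags_to_members(self):
        self.map_attribute("number", "CustNumber")
        self.map_member("name", "CustName")


class SalesOrder(XMLObjectSketch):
    def map_xml_tags_to_members(self):
        self.map_attribute("number", "SONumber")
        self.map_member("customer", "Customer", cls=Customer)
```

Calling SalesOrder().from_xml(text) on a sales order document fills in the order number and creates a linked Customer object for the Customer child element.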

XML elements and attributes that are not mapped in MapXMLTagsToMembers can still be handled through a virtual method that is implemented by the application. The application passes a pointer to this method to XMLFoundation's object factory. When the object factory is reading XML and encounters an element or attribute that has not been mapped, it calls the method.

XMLObject keeps state data for each member variable, such as whether the variable's value has been changed, whether the variable's value has been set, and whether the variable's value should be serialized to XML. XMLObject maintains this state data when its methods (such as SetMember) are called. However, applications must maintain the state data themselves (such as by calling setMemberDirty) when setting member variables directly.

Applications transfer data between XML (as a string) and objects with the FromXML and ToXML methods in XMLObject. XMLFoundation has a number of options for serializing data as XML, including ordering child elements alphabetically or in their original order, only serializing data that has been updated by the application, excluding mapped or unmapped attributes, and including a DOCTYPE declaration.

XMLFoundation can cache objects that have an ID, which can be set directly by the application or from an oid attribute in XML. It also contains implementations of common classes (such as lists, hashtables, stacks, and trees) and utilities (such as sorts, performance timers, and INI profiles).

XmlObjects

Developer: Luis M. Pena
URL: http://www.byteslooser.com/csharp/xmlobjects/
License: Open Source
Entry last updated: January, 2009

XmlObjects is a C# class for marshalling C# objects as XML and unmarshalling XML to C# objects. Users decorate C# classes with attributes that specify how to serialize those classes as XML. For example, they can specify whether to serialize fields as attributes or child elements, the attribute or element names to use, and the order in which to serialize child elements. In addition, users can add methods to classes that are called before a class is serialized as XML, after a class is serialized as XML, and after a class is created from XML.
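
The decoration approach can be approximated in Python with dataclass field metadata. This sketch does not use XmlObjects or any C# attribute names; xml_attr, xml_elem, Part, and serialize are hypothetical, and only serialization is shown.

```python
from dataclasses import dataclass, field, fields
import xml.etree.ElementTree as ET


def xml_attr(name):
    """Mark a field as an XML attribute with the given name."""
    return field(metadata={"xml": "attribute", "name": name})


def xml_elem(name, order=0):
    """Mark a field as a child element with the given name and position."""
    return field(metadata={"xml": "element", "name": name, "order": order})


@dataclass
class Part:
    number: str = xml_attr("PartNumber")
    price: str = xml_elem("Price", order=2)
    description: str = xml_elem("Description", order=1)


def serialize(obj, tag):
    """Serialize a decorated dataclass instance according to its field metadata."""
    root = ET.Element(tag)
    children = []
    for f in fields(obj):
        meta = f.metadata
        value = str(getattr(obj, f.name))
        if meta["xml"] == "attribute":
            root.set(meta["name"], value)
        else:
            children.append((meta["order"], meta["name"], value))
    # Child elements are written in the declared order, not in field order.
    for _, name, value in sorted(children):
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")
```

Here the price field is declared before description, but the order values place Description first in the output, which mirrors the kind of control over names, placement, and ordering described above.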

See also C++ XML Objects, XMLObject, and xmlobjects.

XML Schema Definition Tool (.NET Framework)

Developer: Microsoft
URL: http://msdn.microsoft.com/en-us/library/x6c1kb0s.aspx
License: Commercial
Entry last updated: March, 2002

At design time, the XML Schema Definition Tool allows you to either generate a set of C# or Visual Basic classes from an XML Schema or generate an XML Schema from a set of classes (DLL or .EXE).

At run time, the XmlSerializer class allows you to serialize graphs of objects to XML and deserialize graphs of objects from XML. You can control which properties are serialized/deserialized, as well as how they are represented in XML (as elements, attributes, etc.), through .NET "attributes". (.NET attributes are "keyword-like descriptive declarations" that you add to your code to "annotate programming elements such as types, fields, methods, and properties.") These attributes can be overridden at run time, giving additional flexibility.

You can also serialize/deserialize object graphs to SOAP documents. Again, .NET attributes can be used to control the output SOAP document.

XML-Serializer

Developer: Adaptinet
URL: None
License: Shareware
Entry last updated: April, 2002

[Ed: Although the Adaptinet Web site no longer exists, the product appears to still be available through various download sites, such as http://www.download.com/XML-Serializer/3000-7241_4-10102850.html.]

XML-Serializer can generate Java classes from XML Schemas and DTDs. These classes can validate both data content and structure. At run time, they use an XML-Serializer component that populates them from XML as well as serializing them to XML. XML-Serializer has built-in support for JBuilder, as well as its own GUI design tool.

XML Spy

Developer: Altova
URL: http://www.altova.com/features_code.html
License: Commercial
Entry last updated: April, 2005

XML Spy is an XML IDE and editor that includes facilities for generating Java, C++, and C# classes from an XML Schema. The generated code can support Microsoft MSXML, JAXP, or Microsoft System.XML and users can "replace the underlying parsing and validating engine". Users can also customize the mappings from XML Schema built-in data types to primitive data types in the target language. The code generator also generates project files for a variety of IDEs.

XML Thunder

Developer: Canam Software Labs, Inc.
URL: http://www.xmlthunder.com/
License: Commercial
Entry last updated: October, 2009

XML Thunder generates COBOL or C code to transfer data between XML documents and COBOL copybooks or C data structures. The code, known as a handler, is based on a mapping between the data structures and the XML schema.

Users create mappings with a GUI mapping tool. This can be used to manually map existing data structures to existing XML schemas. It can also create maps automatically. In this case, it starts with an XML Schema, DTD, or sample XML document and generates COBOL or C data structures and a map; or it starts with a COBOL or C data structure and generates an XML Schema and a map.

Applications call handlers to transfer data. There are two types of handlers: XML Readers populate application data structures from an XML document, and XML Writers create an XML document from the data in application data structures. In addition, handlers can be either document-level handlers or node-level handlers.

Document-level handlers only need to be called once. They transfer data between an entire XML document and a single data structure. Their only limitations are that applications must specify the maximum number of times a repeating element can occur and allocate a buffer large enough to hold the entire XML document.

Node-level handlers are used when the maximum number of times a repeating element occurs is unknown, or the XML document is too large to fit in a single buffer. Each node-level handler reads or writes a particular element (node) and its descendants. (In this sense, it is similar to a document-level handler except that it handles a fragment of an XML document instead of the entire document.) Applications make separate calls to node-level handlers to handle each part of the XML document.

Zeus

Developer: Object Web
URL: http://zeus.objectweb.org/
License: Open Source
Entry last updated: July, 2001

From the Web page:

"... includes complete code generation from any legal XML DTD, marshalling and unmarshalling of XML documents and Java objects, and an extensible means of plugging in your own classes for conversion to Java."

Zeus is designed so that its code generation facilities can be easily extended in the future to include XML Schemas, RELAX, and other XML schema languages.

Zope

Developer: Digital Creations
URL: http://www.zope.org/
License: Open Source
Entry last updated: July, 2001

Zope is an Open Source Web application server. Unlike most Web application servers, Zope is really an object-oriented programming system, based on the Zope Object Database (ZODB). Developers can create and manipulate objects both through Python and through Document Template Markup Language (DTML). Each DTML document is therefore a separate object and DTML itself is closer to a programming language than it is to a simple tag set. It provides control-of-flow tags (if-then-else, loops, etc.) and allows users to create their own methods.

Zope uses XML data binding in two places. First, XML documents can be mapped to a tree of Zope objects according to an XSLT stylesheet. Second, ZDOM apparently allows any tree of objects in ZODB to be viewed/manipulated with a subset of the DOM.

2.2 Run-Time Products


Amara Bindery (formerly Anobind)

Developer: Uche Ogbuji
URL: http://wiki.xml3k.org/Amara
License: Open Source
Entry last updated: April, 2005

Amara Bindery provides XML data binding capabilities for the Amara XML Toolkit, a set of tools for working with XML in Python. Amara Bindery generates Python objects directly from XML documents, not XML schemas.

Amara Bindery generates a separate Python object for each element in a document. The generated objects have properties for each of the element's attributes, as well as for the element's child elements. The latter are list-valued properties when multiple child elements have the same name. In addition, a predefined property (xml_children) can be used to access all of the element's children (elements, text, processing instructions, and comments) as a list; this can be used to access mixed content. Finally, a predefined method can be used to access the text content of an element.

Bindings can be customized through a set of rules, which correspond to classes executed by the Bindery. Amara Bindery comes with predefined rules (classes) for creating properties (instead of objects) from simple elements, ignoring elements, and stripping insignificant whitespace from the document. These rules use a user-specified XPattern expression to determine where they should be applied. Users can also create their own rule classes to further customize bindings.

Finally, Amara Bindery supports XPath-based navigation through a document. That is, users can execute an XPath expression on a node to retrieve other nodes relative to that node.
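
The general idea of generating objects directly from a document, rather than from a schema, can be sketched with the Python standard library alone. The sketch below is not Amara's API: the Bound class, the bind function, and the xml_name and xml_text properties are hypothetical names, and mixed content, namespaces, and name collisions are ignored.

```python
import xml.etree.ElementTree as ET


class Bound:
    """Hypothetical bound object: one instance per XML element."""

    def __init__(self, elem):
        self.xml_name = elem.tag
        self.xml_text = (elem.text or "").strip()
        # One property per XML attribute.
        for name, value in elem.attrib.items():
            setattr(self, name, value)
        # One property per child element name; repeated names become lists.
        groups = {}
        for child in elem:
            groups.setdefault(child.tag, []).append(Bound(child))
        for name, objs in groups.items():
            setattr(self, name, objs if len(objs) > 1 else objs[0])


def bind(xml_text):
    """Parse a document and return the object bound to its root element."""
    return Bound(ET.fromstring(xml_text))


order = bind(
    '<SalesOrder SONumber="12345">'
    "<Customer><CustName>ABC Industries</CustName></Customer>"
    "<Item ItemNumber='1'/><Item ItemNumber='2'/>"
    "</SalesOrder>"
)

customer_name = order.Customer.CustName.xml_text
item_numbers = [item.ItemNumber for item in order.Item]
```

Because no schema is consulted, a property is list-valued only when the particular document being read happens to repeat a child element.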

Betwixt

Developer: James Strachan
URL: http://commons.apache.org/betwixt/
License: Open Source
Entry last updated: April, 2002

From the Web page:

"Betwixt is a library which maps Java Beans to XML. It provides an XMLIntrospector in a similar manner to the Introspector in the java.beans package which defines how a bean appears as XML. There are many ways of encoding beans as XML. Betwixt provides a default representation based on introspection which can be customized to taste to get nicer looking XML or to map your beans to some external XML schema. In addition to the mapping API Betwixt also provides objects to turn beans into XML and to parse XML and turn it into beans. Namely BeanReader and BeanWriter."

JOX

Developer: Wutka Consulting, Inc.
URL: http://www.wutka.com/jox.html
License: Open Source
Entry last updated: July, 2001

From the Web page:

"JOX is a set of Java libraries that make it easy to transfer data between XML documents and Java Beans. You can think of JOX as a special form of Java Object Serialization, using XML as the serialization format."

JOX attempts to use a modest amount of intelligence to match element type and attribute names to property names, giving it some flexibility. It can also use a DTD to help format the XML and in some cases requires one.

KBML (Koala Bean Markup Language), KOML (Koala Object Markup Language)

Developer: Koala
URL: http://koala.ilog.fr/kbml/
http://koala.ilog.fr/koml/
License: Open Source
Entry last updated: July, 2001

KBML and KOML are packages for serializing/deserializing Java Beans/Java objects to/from XML documents. Both use proprietary XML languages for serialization.

Skaringa

Developer: The Skaringa Team
URL: http://skaringa.sourceforge.net/
License: Open Source
Entry last updated: May, 2004

Skaringa is an XML data binding tool that uses Reflection. Its main component is an Object Transformer, which creates Java objects from XML and serializes Java objects as XML. Users may add special, Skaringa-specific methods to their classes that Skaringa will call after creating objects or before serializing them. In addition, they may specify XSLT transformations to be applied to XML documents before they are used to create objects or after they have been created by serializing objects. They may also control some aspects of generated XML, such as pretty-printing and encoding.

The Object Transformer can also generate an XML Schema document from a Java object and transform one Java object into another Java object according to an XSLT document(!). (Presumably, a tree of objects, headed by a single object, can be transformed into another tree of objects by transforming the head object. However, this is not clear.)

TreeBind

Developer: Eric van der Vlist, et al.
URL: http://savannah.nongnu.org/projects/treebind/
License: Open Source
Entry last updated: March, 2006

TreeBind is a data binding product designed to transfer data between XML, Java, RDF, and LDAP. It is based on an information set that consists of names, complex properties, and leaf properties. Names have two parts, which allows them to model such things as Java class names (domain and class name), XML element names (optional namespace and local name), and LDAP property names (local name only). Complex properties are named and have a set of child properties; these can be leaf properties or other complex properties. Leaf properties are also named and have a value; in other words, they carry data.

Data is transferred between sources and sinks using a SAX-like interface. An application links the source to the sink and the source streams properties to the sink. One or more filters between the source and sink transform the data as necessary, such as transforming Java classes and properties into XML elements and attributes. TreeBind comes with filters for transforming Java to XML, XML to Java, RDF to Java, LDAP to Java, and LDAP to XML. (Of note, the filters for Java classes use Reflection to allow TreeBind to be used with existing classes without additional configuration.)

In addition to transferring data between sources and sinks, users can access TreeBind hierarchies through a DOM-like interface. They can also extend TreeBind by writing their own filters as well as writing sources, sinks, and filters for other hierarchical data sources, such as relational databases.
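
The source, filter, and sink arrangement described above can be sketched as follows. TreeBind itself is a Java product, so this Python sketch illustrates only the pattern: the event names (start_complex, leaf, end_complex) and the classes XmlSink and RenameFilter are assumptions and are not taken from TreeBind's interfaces.

```python
import xml.etree.ElementTree as ET


class XmlSink:
    """Sink: builds an XML tree from the property events it receives."""

    def __init__(self):
        self.root = None
        self._stack = []

    def start_complex(self, name):
        if self._stack:
            elem = ET.SubElement(self._stack[-1], name)
        else:
            elem = self.root = ET.Element(name)
        self._stack.append(elem)

    def leaf(self, name, value):
        ET.SubElement(self._stack[-1], name).text = str(value)

    def end_complex(self, name):
        self._stack.pop()


class RenameFilter:
    """Filter: rewrites names, then forwards each event to the next stage."""

    def __init__(self, target, rename):
        self._target = target
        self._rename = rename

    def start_complex(self, name):
        self._target.start_complex(self._rename(name))

    def leaf(self, name, value):
        self._target.leaf(self._rename(name), value)

    def end_complex(self, name):
        self._target.end_complex(self._rename(name))


def stream_object(obj, name, sink):
    """Source: uses reflection (vars) to stream an object's properties."""
    sink.start_complex(name)
    for prop, value in vars(obj).items():
        if hasattr(value, "__dict__"):
            stream_object(value, prop, sink)  # complex property
        else:
            sink.leaf(prop, value)  # leaf property carrying data
    sink.end_complex(name)


class Customer:
    def __init__(self, name, city):
        self.name = name
        self.city = city


class Order:
    def __init__(self, number, customer):
        self.number = number
        self.customer = customer


sink = XmlSink()
pipeline = RenameFilter(sink, str.capitalize)
stream_object(Order(12345, Customer("ABC Industries", "Chicago")), "order", pipeline)
xml_text = ET.tostring(sink.root, encoding="unicode")
```

Because the filter exposes the same three methods as the sink, any number of filters could be chained between a source and a sink without either end knowing about them.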

java.beans.XMLEncoder, java.beans.XMLDecoder

Developer: Sun Microsystems
URL: http://java.sun.com/j2se/1.4/docs/api/java/beans/XMLEncoder.html
License: Commercial (free with registration)
Entry last updated: July, 2001

Simple classes for serializing and deserializing Java objects in XML. These use a proprietary XML language. These classes were introduced in the Java 2 Platform, Standard Edition, version 1.4.

xml_pickle, xml_objectify

Developer: David Mertz
URL: http://gnosis.cx/download/Gnosis_Utils-current.tar.gz
License: Open Source
Entry last updated: July, 2001

Classes for transferring data between XML documents and Python objects. xml_pickle uses a proprietary XML serialization syntax. xml_objectify attempts to create an object from any XML document, but cannot serialize the object back to XML.

XMLObject

Developer: Jordi Bunster
URL: http://xml-object.rubyforge.org/
License: Open Source
Entry last updated: January, 2009

XMLObject is a Ruby library for reading (but not writing) XML. Opening an XML document creates an object whose structure is the same as that of the XML document. Child elements and attributes are both mapped to instance variables of the class. (An alternate syntax is provided for accessing instance variables that correspond to XML attributes in the case that a child element and an attribute share the same name.) Multiple child elements with the same name are treated as collections. XMLObject supports adaptors for different XML parsers.

See also C++ XML Objects, XmlObjects, and xmlobjects.

xmlobjects

Developer: Mikeal Rogers
URL: http://code.google.com/p/xmlobjects/
License: Open Source
Entry last updated: January, 2009

xmlobjects is a Python library for working with XML documents. It features a single class (XMLObject) that contains basic methods for working with XML documents, such as marshalling objects as XML, unmarshalling XML to objects, and setting namespaces. The root element name is set as an argument on the XMLObject constructor. Child elements are set by simply using the element name as a data attribute of an XMLObject instance. This is possible because Python does not require data attributes to be declared before using them. For example:

   part = xmlobjects.XMLObject('Part')
   part.price = 10.99

Attributes are set using Python's dictionary syntax. (A Python dictionary is an associative array that uses keys -- immutable types such as strings or numbers -- as indices.) The attribute name is used as the key. For example:

   part.price['currency'] = 'USD'

See also C++ XML Objects, XmlObjects, and XMLObject.
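
How undeclared data attributes and dictionary syntax can be turned into elements and XML attributes may be easier to see in a small sketch. The Node class below is a hypothetical stand-in written for this page. It is not the xmlobjects implementation, and it makes no attempt to handle repeated assignments, nested objects, or namespaces.

```python
import xml.etree.ElementTree as ET


class Node:
    """Hypothetical stand-in for an XMLObject-like class."""

    def __init__(self, name):
        # Bypass __setattr__ so bookkeeping is not treated as child elements.
        object.__setattr__(self, "_elem", ET.Element(name))
        object.__setattr__(self, "_children", {})

    def __setattr__(self, name, value):
        # Assigning an undeclared data attribute creates a child element.
        child = Node(name)
        child._elem.text = str(value)
        self._elem.append(child._elem)
        self._children[name] = child

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. for child elements.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._children[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setitem__(self, key, value):
        # Dictionary syntax sets an XML attribute on this element.
        self._elem.set(key, str(value))

    def to_xml(self):
        return ET.tostring(self._elem, encoding="unicode")


part = Node("Part")
part.price = 10.99
part.price["currency"] = "USD"
xml_text = part.to_xml()
```

The sketch relies on the same property of Python that the description mentions: because attributes need not be declared, the class can intercept assignment to any name and treat it as a child element.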

XStream

Developer: Joe Walnes
URL: http://xstream.codehaus.org/
License: Open Source
Entry last updated: September, 2008

XStream is a Java XML data binding tool that (apparently) uses Reflection to create objects from XML and serialize objects as XML. It can serialize object graphs, keeping duplicate references intact using IDs or XPath references. Users have several customization options. First, they can specify element type names to be used for each class. Second, they can write their own code for how Java types are converted to/from XML values. (Default converters are provided for many types, such as numeric data types, dates, binary data (using Base64), arrays, Collections, and Maps.) Third, they can serialize to/from any tree structure -- not just XML -- by implementing an interface that works with that tree structure.
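
Keeping duplicate references intact with IDs can be sketched as follows. XStream is a Java library, so this Python sketch shows only the technique: the to_element function, the Person and Address classes, and the choice of "id" and "reference" as attribute names are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def to_element(obj, name, seen=None):
    """Serialize obj reflectively, writing each shared object only once."""
    seen = {} if seen is None else seen
    elem = ET.Element(name)
    if not hasattr(obj, "__dict__"):
        elem.text = str(obj)  # simple value: store it as text
    elif id(obj) in seen:
        # Object already serialized: point back to it instead of copying.
        elem.set("reference", seen[id(obj)])
    else:
        seen[id(obj)] = str(len(seen) + 1)
        elem.set("id", seen[id(obj)])
        for field, value in vars(obj).items():
            elem.append(to_element(value, field, seen))
    return elem


class Address:
    def __init__(self, city):
        self.city = city


class Person:
    def __init__(self, name, home, work):
        self.name = name
        self.home = home
        self.work = work


shared = Address("Chicago")
root = to_element(Person("Ann", shared, shared), "person")
xml_text = ET.tostring(root, encoding="unicode")
```

Here the home and work fields refer to the same Address object, so the second occurrence is written as an empty element that refers back to the first by ID. Registering an object before descending into its fields also keeps cyclic graphs from recursing forever.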

2.3 Test Suites and Benchmarks


bindmark

Developer: Kirill Grouchnikov
URL: https://bindmark.dev.java.net/
License: Open Source
Entry last updated: February, 2005

From the Web site:

"The goal of this project is to provide a comparison of the existing open-source and commercial (when available for free evaluation download) libraries for binding XML data to Java classes. The libraries are evaluated in several areas, including ease of use (the amount of effort needed to invest to the first successful run), the size of the accompanying jar files and the performance. In this project, the main emphasis is put into providing the performance comparisons, both in time and in memory."

The Web site contains detailed results for more than 15 data binding tools.

3.0 Articles, Books, etc.

3.1 Articles: General

NOTE: For other introductory articles about XML data binding, see Articles: JAXB and Castor and Articles: Other Products.

3.2 Articles: JAXB and Castor

WARNING: The JAXB specification changed significantly in summer, 2002. According to an article by Dennis Sosnoski, the main changes are that:

  • Code is generated from XML Schemas, not DTDs.
  • The code generator generates both interfaces and classes that implement those interfaces, instead of just generating classes. This allows applications to write to the interfaces (which are portable across JAXB implementations), while different JAXB implementations can generate different classes that implement those interfaces.
  • Validation is optional.

Because of this, the technical details in articles written before summer, 2002, are out of date. However, the general ideas used by JAXB are the same in both versions, so the older articles still provide a useful introduction.

  • XML Data Binding with Castor by Dion Almaer
    Discusses how to use Castor with both implicit and explicit mappings. Includes code examples.

  • Professional XML - Chapter 15: Data Binding by Louise Barr
    A detailed introduction to XML data binding, with an in-depth look at Castor. Includes lots of code examples.

  • XML Data-Binding: Comparing Castor to .NET by Niel Bornstein
    Compares the XML data binding capabilities of Castor and .NET by generating sample code from an XML Schema with both products.

  • Use XML data binding to do your laundry by Sam Brodkin
    A discussion of XML data binding, with code examples for JAXB and Castor.

  • XML Data Binding (PDF) by Peter Gerstbach
    An introduction to XML data binding, in particular JAXB. The paper contains sample JAXB applications and explains the mapping rules (XML Schema <=> Java) and the customization options for JAXB. Language: German.

  • The JAXB API by Kohsuke Kawaguchi
    A discussion of the JAXB API, which is used to marshall Java objects to XML documents, unmarshall XML documents to Java objects, and validate Java objects against an XML Schema.

  • Java Architecture for XML Binding (JAXB) by Ed Ort and Bhakti Mehta
    A nice overview of JAXB from the folks at Sun, including a brief description of how to customize bindings.

3.3 Articles: Other Products

3.4 Articles: Other Techniques

3.5 Articles: Mapping XML <=> UML, Objects, etc.

3.6 Books

  • Java & XML Data Binding by Brett McLaughlin
    A book describing JAXB and Open Source data binding products (Zeus, Castor, and Quick). The book can be bought from the linked-to page.

3.7 Specifications

3.8 Links

4.0 Comments and Feedback

Please send comments and feedback to Ronald Bourret at rpbourret@rpbourret.com.

4.1 Suggested Products

If there is a product that you would like to see listed on this page, please send me information at the above email address. You are welcome to include a description of the product. This will help get your product listed faster, as I am always behind in researching/writing such descriptions. Your description will be clearly labeled as "From the company:" and I reserve the right to edit it. Any changes I make will be returned to you for technical review/approval.

Here are some guidelines to help you out:

  • Audience. The intended audience is programmers.

  • Technical detail. The purpose of the description is to give a concise technical description of the product: what it does, how it does it, and what features it has.

  • Focus on XML. Focus on the XML aspects of the product, even if these took less time to write than that cool GUI tool. Mention the GUI tool in one sentence or less. ("Other features include ...")

  • User tasks. Be sure to describe what the user must do. Writing code is different from filling in forms, both in terms of flexibility and difficulty.

  • No vaporware. The features you describe must be in a released version of the product, not just in the specification or "under development".

  • No marketing lingo. All marketing lingo will be deleted. Obvious examples of marketing terms are "enterprise-wide solution" and "business critical". Less obvious examples are such unverifiable terms as "fast", "scalable", and "robust".

  • Still not sure what to write? Look at the descriptions of similar products in the list.


Copyright (c) 2010, Ronald Bourret