java.lang.Object
   |
   +----org.xmlmiddleware.xmlutils.DOMNormalizer
The methods in this class behave as if the DOM tree consisted only of element, attribute, and fully-normalized text nodes. Except for serialize(), they modify the DOM tree as necessary, replacing entity references with their children, discarding comment and processing instruction nodes, merging adjacent text and CDATA nodes, and so on.
For example, suppose we have the following DOM tree:
                    ELEMENT(A)
                         |
     -----------------------------------------
     |             |            |            |
ELEMENT(B)    TEXT("foo")   ENTITYREF   TEXT("bar")
                                |
                    -------------------------
                    |           |           |
              CDATA("asdf")    PI      ELEMENT(C)
This class behaves as if the first child of element node A is element node B, the next sibling of element node B is the text node "fooasdf", and the next sibling of the text node "fooasdf" is the element node C. That is, it normalizes the tree to the following:
                      ELEMENT(A)
                           |
     ---------------------------------------------
     |              |              |             |
ELEMENT(B)   TEXT("fooasdf")  ELEMENT(C)    TEXT("bar")
The serialized form of this is:
<A><B/>fooasdf<C/>bar</A>
The code assumes that the tree will be traversed in depth-first, width-second order, with the methods in this class, such as getFirstChild, replacing the corresponding methods in DOM's Node interface. This is done so that the tree can be processed in a single pass, rather than a normalization pass and a reading pass. The result of using methods in this class when traversing the tree in any other order is undefined.
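The traversal order described above can be sketched as a small recursive walker. This is only an illustration: SinglePassWalk, walk, and newDocument are hypothetical names that do not appear in this package, and the child and sibling accessors are passed in as functions so that the sketch compiles against the JDK alone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical helper for illustration; not part of DOMNormalizer.
class SinglePassWalk {

    // Visits root and its descendants depth-first and returns one label per
    // visited node (element name, or node value for non-element nodes).
    static List<String> walk(Node root,
                             UnaryOperator<Node> firstChild,
                             UnaryOperator<Node> nextSibling) {
        List<String> visited = new ArrayList<>();
        visit(root, firstChild, nextSibling, visited);
        return visited;
    }

    private static void visit(Node node,
                              UnaryOperator<Node> firstChild,
                              UnaryOperator<Node> nextSibling,
                              List<String> visited) {
        boolean isElement = node.getNodeType() == Node.ELEMENT_NODE;
        visited.add(isElement ? node.getNodeName() : node.getNodeValue());
        // Finish each child's whole subtree before asking for its next sibling.
        for (Node child = firstChild.apply(node);
             child != null;
             child = nextSibling.apply(child)) {
            visit(child, firstChild, nextSibling, visited);
        }
    }

    // Convenience for building test trees; wraps the checked exception.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Passing Node::getFirstChild and Node::getNextSibling gives an ordinary DOM walk. With this class on the classpath, the same walker could presumably be called as walk(root, DOMNormalizer::getFirstChild, DOMNormalizer::getNextSibling), since both methods take and return a Node.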
public DOMNormalizer()
public static Node getFirstChild(Node node)
public static Node getNextSibling(Node node)
public static Node normalizeNode(Node node)
This method expands entity references in place, discards processing instruction and comment nodes, and concatenates adjacent text and CDATA nodes.
public static Node normalizeText(Node node)
If the input node is a text or CDATA node, concatenate its value with the values of all immediately following text or CDATA nodes. A following text or CDATA node is considered to be immediately following if the only nodes between it and the input node are text, CDATA, comment, or processing instruction nodes. (Comment and processing instruction nodes are discarded.)
If the input node is not a text or CDATA node, or if the input node has no parent, then no normalization takes place and the input node is returned.
If the input node is a CDATA node, it is converted to a text node.
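The merging rule above can be approximated with the standard org.w3c.dom API. The following is a minimal sketch, not the source of this class: TextMergeSketch and its helpers are hypothetical names, entity references are not handled, and the sketch assumes that comment and processing instruction nodes trailing the last merged text node are discarded as well.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

// Hypothetical stand-in for illustration; not the DOMNormalizer source.
class TextMergeSketch {

    static Node normalizeText(Node node) {
        Node parent = node.getParentNode();
        if (!isText(node) || parent == null) {
            return node; // nothing to normalize
        }
        StringBuilder value = new StringBuilder(node.getNodeValue());
        Node next = node.getNextSibling();
        // Absorb following text/CDATA nodes; drop comments and PIs in the run.
        while (next != null && (isText(next) || isIgnorable(next))) {
            if (isText(next)) {
                value.append(next.getNodeValue());
            }
            Node following = next.getNextSibling();
            parent.removeChild(next);
            next = following;
        }
        if (node.getNodeType() == Node.CDATA_SECTION_NODE) {
            // A CDATA input node is replaced by an equivalent text node.
            Node text = node.getOwnerDocument().createTextNode(value.toString());
            parent.replaceChild(text, node);
            return text;
        }
        node.setNodeValue(value.toString());
        return node;
    }

    private static boolean isText(Node n) {
        short type = n.getNodeType();
        return type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE;
    }

    private static boolean isIgnorable(Node n) {
        short type = n.getNodeType();
        return type == Node.COMMENT_NODE || type == Node.PROCESSING_INSTRUCTION_NODE;
    }

    // Convenience for building test trees; wraps the checked exception.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For a parent whose children are CDATA("asdf"), a comment, TEXT("foo"), a PI, TEXT("bar"), and an element, calling the sketch on the CDATA node leaves two children: the text node "asdffoobar" and the element.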
public static Node expandEntityRef(Node node)
This method replaces the entity reference node with its children and returns the first of those children, which is now at the same level that the entity reference node was at. If the input node is not an entity reference node, or if the input node does not have a parent, null is returned.
This method does not attempt to normalize text, nor does it expand entity reference nodes that are children of the input node. It is normally called only by other methods, which do normalize text and expand nested entity reference nodes.
public static String serialize(Node node, boolean childrenOnly, boolean escapeMarkup)
This method behaves as if entity references were expanded, CDATA sections replaced with text, and comments and PIs didn't exist. It does not actually modify the DOM tree.
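A non-modifying serializer with this behavior might look like the sketch below. It is an assumption-laden illustration rather than the actual implementation: SerializeSketch and its helpers are hypothetical names, and the sketch assumes that escapeMarkup means replacing &, <, >, and " in text and attribute values with the corresponding entity references.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

// Hypothetical stand-in for illustration; not the DOMNormalizer source.
class SerializeSketch {

    // Reads the tree only; no node is added, removed, or changed.
    static String serialize(Node node, boolean childrenOnly, boolean escapeMarkup) {
        StringBuilder out = new StringBuilder();
        if (childrenOnly) {
            writeChildren(node, out, escapeMarkup);
        } else {
            write(node, out, escapeMarkup);
        }
        return out.toString();
    }

    private static void write(Node node, StringBuilder out, boolean escapeMarkup) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE: {
                out.append('<').append(node.getNodeName());
                NamedNodeMap attrs = node.getAttributes();
                for (int i = 0; i < attrs.getLength(); i++) {
                    Node attr = attrs.item(i);
                    out.append(' ').append(attr.getNodeName()).append("=\"")
                       .append(text(attr.getNodeValue(), escapeMarkup)).append('"');
                }
                StringBuilder content = new StringBuilder();
                writeChildren(node, content, escapeMarkup);
                if (content.length() == 0) {
                    out.append("/>"); // nothing significant inside
                } else {
                    out.append('>').append(content)
                       .append("</").append(node.getNodeName()).append('>');
                }
                break;
            }
            case Node.TEXT_NODE:
            case Node.CDATA_SECTION_NODE:
                out.append(text(node.getNodeValue(), escapeMarkup)); // CDATA treated as text
                break;
            case Node.ENTITY_REFERENCE_NODE:
            case Node.DOCUMENT_NODE:
            case Node.DOCUMENT_FRAGMENT_NODE:
                writeChildren(node, out, escapeMarkup); // look through to the children
                break;
            default:
                break; // comments, PIs, and other node types are skipped
        }
    }

    private static void writeChildren(Node node, StringBuilder out, boolean escapeMarkup) {
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            write(c, out, escapeMarkup);
        }
    }

    // Assumed meaning of escapeMarkup: replace markup characters with entities.
    private static String text(String s, boolean escapeMarkup) {
        if (!escapeMarkup) {
            return s;
        }
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // Convenience for building test trees; wraps the checked exception.
    static Document newDocument() {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Applied to an element A whose children are ELEMENT(B), TEXT("foo"), CDATA("asdf"), a PI, ELEMENT(C), and TEXT("bar"), the sketch yields <A><B/>fooasdf<C/>bar</A> and leaves all six children in place.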