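Holger studied mathematics at the University of Bonn in Germany. Since 1996 he has worked for IBM Global Business Services, developing web solutions for customers. Because he uses these technologies regularly in his own projects, Holger has gained extensive experience in XPath and Java programming.
If you want to use namespaces in XPath expressions, you must provide a link from each prefix used to its namespace URI. This article describes three different ways to supply the prefix-to-namespace mapping. It also includes sample code to help you write your own NamespaceContext.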
*** Zero example - no namespaces provided ***
First try asking without namespace prefix:
--> booklist/book
Result is of length 0
Then try asking with namespace prefix:
--> books:booklist/science:book
Result is of length 0
The expression does not work in both cases.
In both cases the XPath evaluation returns no nodes, and no exception is raised. XPath cannot find the nodes because the mapping from prefix to URI is missing.
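The unprefixed case can be reproduced with a few lines of JAXP. The sketch below is not the article's test program: the class name NoNamespaceDemo and the inline XML are assumptions, with the XML reconstructed from the output shown in this article. It evaluates the unprefixed expression and, for contrast, a wildcard expression that ignores element names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NoNamespaceDemo {
    // Assumed sample document, reconstructed from the article's output.
    static final String XML =
          "<books:booklist xmlns:books='http://univNaSpResolver/booklist'"
        + " xmlns='http://univNaSpResolver/book'"
        + " xmlns:fiction='http://univNaSpResolver/fictionbook'>"
        + "<science:book xmlns:science='http://univNaSpResolver/sciencebook'>"
        + "<title>Learning XPath</title><author>Michael Schmidt</author>"
        + "</science:book>"
        + "<fiction:book><title>Faust I</title>"
        + "<author>Johann Wolfgang von Goethe</author></fiction:book>"
        + "<fiction:book><title>Faust II</title>"
        + "<author>Johann Wolfgang von Goethe</author></fiction:book>"
        + "</books:booklist>";

    // Parses the sample with namespace awareness switched on.
    static Document parse() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
    }

    // Evaluates the expression without any NamespaceContext and counts the hits.
    static int count(String expression) throws Exception {
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, parse(), XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Unprefixed names only match elements in no namespace.
        System.out.println("booklist/book: " + count("booklist/book"));
        // The wildcard shows that the book elements are present.
        System.out.println("/*/*: " + count("/*/*"));
    }
}
```

The elements are there, as the wildcard query shows, but an unprefixed name test never matches an element that lives in a namespace.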
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;

public class HardcodedNamespaceResolver implements NamespaceContext {

    /**
     * This method returns the uri for all prefixes needed. Wherever possible
     * it uses XMLConstants.
     *
     * @param prefix
     * @return uri
     */
    public String getNamespaceURI(String prefix) {
        if (prefix == null) {
            throw new IllegalArgumentException("No prefix provided!");
        } else if (prefix.equals(XMLConstants.DEFAULT_NS_PREFIX)) {
            return "http://univNaSpResolver/book";
        } else if (prefix.equals("books")) {
            return "http://univNaSpResolver/booklist";
        } else if (prefix.equals("fiction")) {
            return "http://univNaSpResolver/fictionbook";
        } else if (prefix.equals("technical")) {
            // Our own prefix for the namespace the document calls "science".
            return "http://univNaSpResolver/sciencebook";
        } else {
            return XMLConstants.NULL_NS_URI;
        }
    }

    public String getPrefix(String namespaceURI) {
        // Not needed in this context.
        return null;
    }

    public Iterator getPrefixes(String namespaceURI) {
        // Not needed in this context.
        return null;
    }
}
*** First example - namespacelookup hardcoded ***
Using any namespaces results in a NodeList:
--> books:booklist/technical:book
Number of Nodes: 1
<?xml version="1.0" encoding="UTF-8"?>
<science:book xmlns:science="http://univNaSpResolver/sciencebook">
<title xmlns="http://univNaSpResolver/book">Learning XPath</title>
<author xmlns="http://univNaSpResolver/book">Michael Schmidt</author>
</science:book>
--> books:booklist/fiction:book
Number of Nodes: 2
<?xml version="1.0" encoding="UTF-8"?>
<fiction:book xmlns:fiction="http://univNaSpResolver/fictionbook">
<title xmlns="http://univNaSpResolver/book">Faust I</title>
<author xmlns="http://univNaSpResolver/book">Johann Wolfgang von Goethe</author>
</fiction:book>
<?xml version="1.0" encoding="UTF-8"?>
<fiction:book xmlns:fiction="http://univNaSpResolver/fictionbook">
<title xmlns="http://univNaSpResolver/book">Faust II</title>
<author xmlns="http://univNaSpResolver/book">Johann Wolfgang von Goethe</author>
</fiction:book>
The default namespace works also:
--> books:booklist/technical:book/:author
Michael Schmidt
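To show how such a resolver is attached to an XPath object, here is a minimal sketch. The class HardcodedUsageDemo and its nested MapResolver are hypothetical stand-ins for HardcodedNamespaceResolver, and the inline XML is an assumption reconstructed from the output above. The call that matters is XPath.setNamespaceContext.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class HardcodedUsageDemo {
    // Assumed sample document, reconstructed from the article's output.
    static final String XML =
          "<books:booklist xmlns:books='http://univNaSpResolver/booklist'"
        + " xmlns='http://univNaSpResolver/book'"
        + " xmlns:fiction='http://univNaSpResolver/fictionbook'>"
        + "<science:book xmlns:science='http://univNaSpResolver/sciencebook'>"
        + "<title>Learning XPath</title><author>Michael Schmidt</author>"
        + "</science:book>"
        + "<fiction:book><title>Faust I</title>"
        + "<author>Johann Wolfgang von Goethe</author></fiction:book>"
        + "<fiction:book><title>Faust II</title>"
        + "<author>Johann Wolfgang von Goethe</author></fiction:book>"
        + "</books:booklist>";

    /** Hypothetical map-backed equivalent of the hardcoded resolver. */
    static class MapResolver implements NamespaceContext {
        private final Map<String, String> uris = new HashMap<>();

        MapResolver() {
            uris.put("books", "http://univNaSpResolver/booklist");
            uris.put("fiction", "http://univNaSpResolver/fictionbook");
            // "technical" is our own prefix; the document says "science".
            uris.put("technical", "http://univNaSpResolver/sciencebook");
        }

        public String getNamespaceURI(String prefix) {
            if (prefix == null) {
                throw new IllegalArgumentException("No prefix provided!");
            }
            String uri = uris.get(prefix);
            return uri != null ? uri : XMLConstants.NULL_NS_URI;
        }

        public String getPrefix(String namespaceURI) {
            return null; // not needed for evaluating expressions
        }

        public Iterator<String> getPrefixes(String namespaceURI) {
            return null; // not needed for evaluating expressions
        }
    }

    // Counts the nodes selected by the expression, using the resolver above.
    static int count(String expression) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document document = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        xpath.setNamespaceContext(new MapResolver());
        NodeList nodes = (NodeList) xpath.evaluate(
                expression, document, XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(count("books:booklist/technical:book"));
        System.out.println(count("books:booklist/fiction:book"));
    }
}
```

Only the URI has to match the document. The prefix in the expression is resolved through the NamespaceContext, which is why technical finds the element that the document writes as science:book.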
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import org.w3c.dom.Document;

public class UniversalNamespaceResolver implements NamespaceContext {
    // the delegate
    private Document sourceDocument;

    /**
     * This constructor stores the source document to search the namespaces in
     * it.
     *
     * @param document
     *            source document
     */
    public UniversalNamespaceResolver(Document document) {
        sourceDocument = document;
    }

    /**
     * The lookup for the namespace uris is delegated to the stored document.
     *
     * @param prefix
     *            to search for
     * @return uri
     */
    public String getNamespaceURI(String prefix) {
        if (prefix.equals(XMLConstants.DEFAULT_NS_PREFIX)) {
            // DOM expects null, not "", when asking for the default namespace.
            return sourceDocument.lookupNamespaceURI(null);
        } else {
            return sourceDocument.lookupNamespaceURI(prefix);
        }
    }

    /**
     * This method is not needed in this context, but can be implemented in a
     * similar way.
     */
    public String getPrefix(String namespaceURI) {
        return sourceDocument.lookupPrefix(namespaceURI);
    }

    public Iterator getPrefixes(String namespaceURI) {
        // not implemented yet
        return null;
    }
}
*** Second example - namespacelookup delegated to document ***
Try to use the science prefix: no result
--> books:booklist/science:book
The resolver only knows namespaces of the first level!
To be precise: Only namespaces above the node, passed in the constructor.
The fiction namespace is such a namespace:
--> books:booklist/fiction:book
Number of Nodes: 2
<?xml version="1.0" encoding="UTF-8"?>
<fiction:book xmlns:fiction="http://univNaSpResolver/fictionbook">
<title xmlns="http://univNaSpResolver/book">Faust I</title>
<author xmlns="http://univNaSpResolver/book">Johann Wolfgang von Goethe</author>
</fiction:book>
<?xml version="1.0" encoding="UTF-8"?>
<fiction:book xmlns:fiction="http://univNaSpResolver/fictionbook">
<title xmlns="http://univNaSpResolver/book">Faust II</title>
<author xmlns="http://univNaSpResolver/book">Johann Wolfgang von Goethe</author>
</fiction:book>
The default namespace works also:
--> books:booklist/fiction:book[1]/:author
Johann Wolfgang von Goethe
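As the output shows, the namespace with the prefix science, which is declared on the book element, is not resolved. The evaluate method throws an XPathExpressionException. To work around this, you would have to extract the science:book node from the document and use that node as the delegate. That, however, means additional parsing of the document and is not elegant.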
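The difference between the two possible delegates can be observed directly with the DOM method lookupNamespaceURI. The following sketch is an illustration only: the class DelegateNodeDemo, its helper methods, and the inline XML (reconstructed from the output above) are assumptions. It asks the document and the science:book element to resolve the same prefixes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DelegateNodeDemo {
    // Assumed sample document, reconstructed from the article's output.
    static final String XML =
          "<books:booklist xmlns:books='http://univNaSpResolver/booklist'"
        + " xmlns='http://univNaSpResolver/book'"
        + " xmlns:fiction='http://univNaSpResolver/fictionbook'>"
        + "<science:book xmlns:science='http://univNaSpResolver/sciencebook'>"
        + "<title>Learning XPath</title><author>Michael Schmidt</author>"
        + "</science:book>"
        + "<fiction:book><title>Faust I</title>"
        + "<author>Johann Wolfgang von Goethe</author></fiction:book>"
        + "</books:booklist>";

    // Parses the sample with namespace awareness switched on.
    static Document parse() throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
    }

    // Returns the science:book element, located by namespace URI and local name.
    static Node scienceBook(Document document) {
        return document.getDocumentElement()
                .getElementsByTagNameNS("http://univNaSpResolver/sciencebook", "book")
                .item(0);
    }

    public static void main(String[] args) throws Exception {
        Document document = parse();
        Node book = scienceBook(document);

        // The document only sees declarations on the root element.
        System.out.println("document, science: " + document.lookupNamespaceURI("science"));
        System.out.println("document, fiction: " + document.lookupNamespaceURI("fiction"));

        // The element sees its own declarations and those of its ancestors.
        System.out.println("element, science: " + book.lookupNamespaceURI("science"));
        System.out.println("element, fiction: " + book.lookupNamespaceURI("fiction"));
    }
}
```

A lookup starts at the given node and walks upward, so a resolver that delegates to the document never reaches a declaration made further down the tree.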
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class UniversalNamespaceCache implements NamespaceContext {
    private static final String DEFAULT_NS = "DEFAULT";
    private Map<String, String> prefix2Uri = new HashMap<String, String>();
    private Map<String, String> uri2Prefix = new HashMap<String, String>();

    /**
     * This constructor parses the document and stores all namespaces it can
     * find. If toplevelOnly is true, only namespaces in the root are used.
     *
     * @param document
     *            source document
     * @param toplevelOnly
     *            restriction of the search to enhance performance
     */
    public UniversalNamespaceCache(Document document, boolean toplevelOnly) {
        // Assumes the root element is the first child of the document node.
        examineNode(document.getFirstChild(), toplevelOnly);
        System.out.println("The list of the cached namespaces:");
        for (String key : prefix2Uri.keySet()) {
            System.out
                    .println("prefix " + key + ": uri " + prefix2Uri.get(key));
        }
    }

    /**
     * A single node is read, the namespace attributes are extracted and stored.
     *
     * @param node
     *            to examine
     * @param attributesOnly,
     *            if true no recursion happens
     */
    private void examineNode(Node node, boolean attributesOnly) {
        NamedNodeMap attributes = node.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            storeAttribute((Attr) attribute);
        }

        if (!attributesOnly) {
            NodeList children = node.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Only elements can carry namespace declarations.
                if (child.getNodeType() == Node.ELEMENT_NODE)
                    examineNode(child, false);
            }
        }
    }

    /**
     * This method looks at an attribute and stores it, if it is a namespace
     * attribute.
     *
     * @param attribute
     *            to examine
     */
    private void storeAttribute(Attr attribute) {
        // examine the attributes in namespace xmlns
        if (attribute.getNamespaceURI() != null
                && attribute.getNamespaceURI().equals(
                        XMLConstants.XMLNS_ATTRIBUTE_NS_URI)) {
            // Default namespace xmlns="uri goes here"
            if (attribute.getNodeName().equals(XMLConstants.XMLNS_ATTRIBUTE)) {
                putInCache(DEFAULT_NS, attribute.getNodeValue());
            } else {
                // The defined prefixes are stored here
                putInCache(attribute.getLocalName(), attribute.getNodeValue());
            }
        }
    }

    private void putInCache(String prefix, String uri) {
        prefix2Uri.put(prefix, uri);
        uri2Prefix.put(uri, prefix);
    }

    /**
     * This method is called by XPath. It returns the default namespace, if the
     * prefix is null or "".
     *
     * @param prefix
     *            to search for
     * @return uri
     */
    public String getNamespaceURI(String prefix) {
        if (prefix == null || prefix.equals(XMLConstants.DEFAULT_NS_PREFIX)) {
            return prefix2Uri.get(DEFAULT_NS);
        } else {
            return prefix2Uri.get(prefix);
        }
    }

    /**
     * This method is not needed in this context, but can be implemented in a
     * similar way.
     */
    public String getPrefix(String namespaceURI) {
        return uri2Prefix.get(namespaceURI);
    }

    public Iterator getPrefixes(String namespaceURI) {
        // Not implemented
        return null;
    }
}
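Note that the code contains a debug output. The attributes of each examined node are checked and stored. The child nodes are not examined when the boolean toplevelOnly in the constructor is set to true. If the boolean is set to false, the examination of the child nodes starts after the attributes have been stored. One point about the code deserves attention: in the DOM, the first node represents the document as a whole, so to reach the element whose namespace declarations are to be read, you must step down to the child nodes exactly once.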
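The effect of the toplevelOnly flag can be seen in isolation with a condensed version of the traversal. The sketch below is not the listing above: the class NamespaceScanDemo and the collect method are hypothetical, the inline XML is an assumption reconstructed from the output in this article, and the walk starts at getDocumentElement() rather than getFirstChild(), which is a substitution made here so that a leading comment in the document cannot be mistaken for the root.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamespaceScanDemo {
    // Assumed sample document, reconstructed from the article's output.
    static final String XML =
          "<books:booklist xmlns:books='http://univNaSpResolver/booklist'"
        + " xmlns='http://univNaSpResolver/book'"
        + " xmlns:fiction='http://univNaSpResolver/fictionbook'>"
        + "<science:book xmlns:science='http://univNaSpResolver/sciencebook'>"
        + "<title>Learning XPath</title><author>Michael Schmidt</author>"
        + "</science:book>"
        + "<fiction:book><title>Faust I</title>"
        + "<author>Johann Wolfgang von Goethe</author></fiction:book>"
        + "</books:booklist>";

    // Collects prefix-to-URI pairs from the root only, or from every element.
    static Map<String, String> collect(boolean toplevelOnly) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document document = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        Map<String, String> found = new TreeMap<>();
        // Step from the document node down to the root element exactly once.
        examine(document.getDocumentElement(), toplevelOnly, found);
        return found;
    }

    private static void examine(Element element, boolean attributesOnly,
                                Map<String, String> found) {
        NamedNodeMap attributes = element.getAttributes();
        for (int i = 0; i < attributes.getLength(); i++) {
            Node attribute = attributes.item(i);
            // Namespace declarations live in the reserved xmlns namespace.
            if (XMLConstants.XMLNS_ATTRIBUTE_NS_URI.equals(attribute.getNamespaceURI())) {
                boolean isDefault =
                        XMLConstants.XMLNS_ATTRIBUTE.equals(attribute.getNodeName());
                found.put(isDefault ? "DEFAULT" : attribute.getLocalName(),
                        attribute.getNodeValue());
            }
        }
        if (!attributesOnly) {
            NodeList children = element.getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    examine((Element) child, false, found);
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("root only:  " + collect(true).keySet());
        System.out.println("all levels: " + collect(false).keySet());
    }
}
```

With the flag set to true, only the declarations on the root element are collected. With false, the science prefix declared on the nested book element is picked up as well, at the cost of walking the whole tree.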
*** Third example - namespaces of toplevel node cached ***
The list of the cached namespaces:
prefix DEFAULT: uri http://univNaSpResolver/book
prefix fiction: uri http://univNaSpResolver/fictionbook
prefix books: uri http://univNaSpResolver/booklist
Try to use the science prefix:
--> books:booklist/science:book
The cache only knows namespaces of the first level!
The fiction namespace is such a namespace:
--> books:booklist/fiction:book
Number of Nodes: 2
<?xml version="1.0" encoding="UTF-8"?>
<fiction:book xmlns:fiction="http://univNaSpResolver/fictionbook">
<title xmlns="http://univNaSpResolver/book">Faust I</title>
<author xmlns="http://univNaSpResolver/book">Johann Wolfgang von Goethe</author>
</fiction:book>
<?xml version="1.0" encoding="UTF-8"?>
<fiction:book xmlns:fiction="http://univNaSpResolver/fictionbook">
<title xmlns="http://univNaSpResolver/book">Faust II</title>
<author xmlns="http://univNaSpResolver/book">Johann Wolfgang von Goethe</author>
</fiction:book>
The default namespace works also:
--> books:booklist/fiction:book[1]/:author
Johann Wolfgang von Goethe
*** Fourth example - namespaces all levels cached ***
The list of the cached namespaces:
prefix science: uri http://univNaSpResolver/sciencebook
prefix DEFAULT: uri http://univNaSpResolver/book
prefix fiction: uri http://univNaSpResolver/fictionbook
prefix books: uri http://univNaSpResolver/booklist
Now the use of the science prefix works as well:
--> books:booklist/science:book
Number of Nodes: 1
<?xml version="1.0" encoding="UTF-8"?>
<science:book xmlns:science="http://univNaSpResolver/sciencebook">
<title xmlns="http://univNaSpResolver/book">Learning XPath</title>
<author xmlns="http://univNaSpResolver/book">Michael Schmidt</author>
</science:book>
The fiction namespace is resolved:
--> books:booklist/fiction:book
Number of Nodes: 2
<?xml version="1.0" encoding="UTF-8"?>
<fiction:book xmlns:fiction="http://univNaSpResolver/fictionbook">
<title xmlns="http://univNaSpResolver/book">Faust I</title>
<author xmlns="http://univNaSpResolver/book">Johann Wolfgang von Goethe</author>
</fiction:book>
<?xml version="1.0" encoding="UTF-8"?>
<fiction:book xmlns:fiction="http://univNaSpResolver/fictionbook">
<title xmlns="http://univNaSpResolver/book">Faust II</title>
<author xmlns="http://univNaSpResolver/book">Johann Wolfgang von Goethe</author>
</fiction:book>
The default namespace works also:
--> books:booklist/fiction:book[1]/:author
Johann Wolfgang von Goethe
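But if you do not control the XML file and others can send you any prefix they like, it is better to be independent of their choices. You can code your own namespace resolution, as shown in example 1 (HardcodedNamespaceResolver), and use your own prefixes in your XPath expressions.
In the cases described above, a NamespaceContext derived from the XML file makes your code shorter and more generic.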
Using the Java NamespaceContext object with XPath
Author: Holger
Published: 2009-06-23