Problem description
- Does WebMagic not support standard XPath expressions? Why does my code throw an error? Any pointers would be appreciated.
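I want to get the link of a node's parent node. I wrote the following expression based on the official documentation, so why does it throw an exception at runtime? page.getHtml().xpath("//img/parent::a").links().all();
The exception is as follows: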
org.jsoup.select.Selector$SelectorParseException: Could not parse query 'parent::*': unexpected token at '::*'
at us.codecraft.xsoup.XPathParser.findElements(XPathParser.java:141)
at us.codecraft.xsoup.XPathParser.parse(XPathParser.java:51)
at us.codecraft.xsoup.XPathParser.parse(XPathParser.java:375)
at us.codecraft.xsoup.XPathParser.combinator(XPathParser.java:85)
at us.codecraft.xsoup.XPathParser.parse(XPathParser.java:49)
at us.codecraft.xsoup.XPathParser.parse(XPathParser.java:375)
at us.codecraft.xsoup.Xsoup.compile(Xsoup.java:20)
at us.codecraft.webmagic.selector.XpathSelector.<init>(XpathSelector.java:21)
at us.codecraft.webmagic.selector.Selectors.xpath(Selectors.java:32)
at us.codecraft.webmagic.selector.HtmlNode.xpath(HtmlNode.java:42)
at com.zg.crawler.amazon.core.url.ProductUrlProcessor.process(ProductUrlProcessor.java:60)
at us.codecraft.webmagic.Spider.processRequest(Spider.java:420)
at us.codecraft.webmagic.Spider$1.run(Spider.java:322)
at us.codecraft.webmagic.thread.CountableThreadPool$1.run(CountableThreadPool.java:74)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Solution
http://forums.codeguru.com/showthread.php?546883-JSoup-and-XSoup-quot-Couldn-t-parse-query-quot
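The stack trace shows the failure comes from Xsoup's own XPathParser, which rejects the `::` axis syntax, so the expression itself may be fine as standard XPath. The sketch below uses only the JDK's built-in XPath engine (not WebMagic or Xsoup) to check that `//img/parent::a` is valid XPath 1.0 and that the axis-free form `//a[img]` selects the same nodes. The class name ParentAxisDemo, the helper hrefs, and the sample markup are hypothetical and exist only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ParentAxisDemo {
    // Hypothetical sample markup; it must be well-formed XML for the JDK parser.
    static final String SAMPLE =
        "<div>"
      + "<a href=\"/p/1\"><img src=\"1.png\"/></a>"
      + "<a href=\"/p/2\">text only</a>"
      + "<span><img src=\"2.png\"/></span>"
      + "<a href=\"/p/3\"><img src=\"3.png\"/></a>"
      + "</div>";

    // Evaluates the expression with the JDK XPath engine and collects
    // the href attribute of every matched element.
    static List<String> hrefs(String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder()
            .parse(new ByteArrayInputStream(SAMPLE.getBytes(StandardCharsets.UTF_8)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            result.add(((Element) nodes.item(i)).getAttribute("href"));
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Parent axis: <a> elements that are the parent of an <img>.
        System.out.println(hrefs("//img/parent::a"));
        // Axis-free form: <a> elements that have an <img> child.
        System.out.println(hrefs("//a[img]"));
    }
}
```

Whether Xsoup accepts the `//a[img]` predicate form is an assumption that should be verified against the Xsoup version in use. Another option that avoids XPath entirely is a jsoup CSS selector through WebMagic's css method, for example page.getHtml().css("a:has(img)").links().all(), since jsoup supports the :has() pseudo-selector.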