Get only a part of text() with XPath
I'm trying to get an array of the authors from this website:
http://www.intechopen.com/books/latest/1/list
using this XPath:
response.xpath("//div[@id='sizer']/div[@id='content']/div[@class='grid']/div[@class='main-content']/div[@id='tc']/div/ul[@class='book-listing entity-listing']/li/dl/dd[@class='meta']/text()[count(preceding-sibling::br) = 0]").extract()
But I only want the names, without the "Editor" label. How can I do that?
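After selecting the text nodes, call the selector's regular-expression method re() with a capture group instead of extract(). Only the captured part is returned, so the unwanted "Editor" prefix is left out:
response.xpath("//div[@id='sizer']/div[@id='content']/div[@class='grid']/div[@class='main-content']/div[@id='tc']/div/ul[@class='book-listing entity-listing']/li/dl/dd[@class='meta']/text()[count(preceding-sibling::br) = 0]").re(r'Editor\s*(.*)')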
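To see what the capture group keeps, here is a minimal sketch that uses only Python's standard `re` module, so it runs without Scrapy or network access. The sample strings are made up for illustration and are not taken from the actual page. The helper `extract_names` is a hypothetical stand-in that only roughly imitates what the selector's `re()` does (the `.strip()` call is an extra of this sketch).

```python
import re

# Hypothetical text nodes, standing in for what the XPath might select.
# These are illustrative strings, not real data from the site.
samples = [
    "Editor Jane Doe",
    "\n    Editor   John Smith and Mary Major",
    "No label here",
]

# Same pattern as in the answer: match the literal "Editor", skip any
# whitespace, and capture the rest of the line.
pattern = re.compile(r"Editor\s*(.*)")


def extract_names(texts):
    """Return the captured group for every match, skipping non-matching strings."""
    names = []
    for text in texts:
        # findall() returns only the captured group when the pattern has one.
        names.extend(captured.strip() for captured in pattern.findall(text))
    return names


print(extract_names(samples))
```

Note that the pattern is case-sensitive, so a lowercase "editor" would not match. If the page also uses a plural label such as "Editors" (an assumption, not checked against the site), a pattern like `r'Editors?\s*(.*)'` would keep the trailing "s" out of the captured name.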