How to select this element with Scrapy XPath?
The only requirement: the selector needs to reference the thread-navigation class, because the page has many other pagination elements.
<section id="thread-navigation" class="group">
<div class="float-left">
<div class="pagination talign-mleft">
<span class="pages">Pages (6):</span>
<span class="pagination_current">1</span>
<a href="I want this text?page=2" class="pagination_page">2</a>
<a href="I want this text?page=3" class="pagination_page">3</a>
<a href="I want this text?page=4" class="pagination_page">4</a>
<a href="I want this text?page=5" class="pagination_page">5</a>
<a href="I want this text?page=6" class="pagination_last">6</a>
<a href="I want this text?page=2" class="pagination_next">Next »</a> <!-- this one -->
</div>
</div>
</section>
I am trying something like this:
r.xpath('//*[@class="thread-navigation" and contains (., "Next")]').get()
But it always returns None.
Thanks.
This XPath:
'//section[@id="thread-navigation"]//a/@href'
What you are referring to is not a @class attribute, but the @id attribute with the value thread-navigation. So try this XPath 1.0 expression:
r.xpath('//a[ancestor::*/@id="thread-navigation" and contains (text(), "Next")]/@href').get()
The result is:
I want this text?page=2