Get text from second child in DIV using scrapy, python

Question

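I'm trying to extract some text values from a website, but I've run into a small problem.

The site contains real-estate data, and there are some features I'm struggling to retrieve. I have no problem with the 'main' features such as price. The code I use is below.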

def get_offer_details(self, response):
    
    offer_item = ItemLoader(item=estateItem(), selector=response)

    offer_item.add_xpath('tittle', "//h1[@class='css-11t1qm5']/text()")
    offer_item.add_xpath('price', '//strong[@class="css-1mojccp"]/text()')

    yield offer_item.load_item()

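I can use a 'class' selector in the example above.

How can I get the text value from the second div in this structure (in this case "2")? There are a few features with exactly the same structure, and the only difference is the aria-label (bedrooms, market, etc.), so I can't use a 'class' selector.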

<div role="region" aria-label="bedrooms" class="css-11ic80g">
    <div title="bedrooms" class="css-152vbi8">bedrooms<!-- -->:</div>
    <div title="2" class="css-1s5nyln">2</div>
</div>

This is what my attempt looks like:

# I don't know... maybe something like this? But it doesn't work.
offer_item.add_xpath('bedrooms', "//div[@aria-label='bedrooms'][1]/text()")

Thanks in advance.

Answer 1

Try this XPath to get the text from the first child:

"//div[@aria-label='bedrooms']/div[1]/text()"

or this one to get the text from the second child:

"//div[@aria-label='bedrooms']/div[2]/text()"

Answer 2

Another way, using the following-sibling:: axis:

offer_item.add_xpath('bedrooms', "//div[@title='bedrooms']/following-sibling::div[1]/text()")


python

scrapy

web-scraping