Unable to extract text from href tag using XPath

Question

I am trying to extract the trend names from this page using the following XPath:

//div[@class ='table-responsive']/table[@class = 'table table-striped table-hover dataTable no-footer']/tbody/tr/th/a/text()

It returns 50 results when I try it in a web browser. But with the following code:

import requests
import lxml.html

html = requests.get('https://twitter-trends.iamrohit.in/')
doc = lxml.html.fromstring(html.content)
trends_name = doc.xpath("//div[@class = 'table-responsive']/table[@class = 'table table-striped table-hover dataTable no-footer']/tbody/tr/th/a/text()")

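I get nothing in the trends_name variable. I tried printing html.content, and it does contain the raw HTML. I also tried the same XPath in an online XPath selector, using the source code of the same page, and it returned 50 trends. I am not sure what I am doing wrong in the code, since the same approach has worked for me on other sites with different XPaths. Please help. Thanks.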

Answer 1

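Just remove the "dataTable" and "no-footer" class names from the table predicate. These class names are added when the table is rendered in the browser, but they are not present in the page source: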

trends_name = doc.xpath("//div[@class = 'table-responsive']/table[@class = 'table table-striped table-hover']/tbody/tr/th/a/text()")
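To see why the original expression returns nothing, here is a minimal offline sketch. The SOURCE_HTML string is a hypothetical, trimmed-down stand-in for the page source (the real markup and trend names are not reproduced here), and token_xpath is an optional alternative that is not part of the answer above: it tests for a single class token instead of comparing the whole class attribute string, so it should keep working whether or not extra classes are present.

```python
import lxml.html

# Hypothetical stand-in for the raw page source: the table carries only the
# classes present before any JavaScript runs (no "dataTable no-footer").
SOURCE_HTML = """
<div class="table-responsive">
  <table class="table table-striped table-hover">
    <tbody>
      <tr><th><a href="/trend/1">#TrendOne</a></th></tr>
      <tr><th><a href="/trend/2">#TrendTwo</a></th></tr>
    </tbody>
  </table>
</div>
"""

doc = lxml.html.fromstring(SOURCE_HTML)

# XPath copied from the browser: @class = '...' is an exact string comparison,
# so the two extra class names make it fail against the raw source.
browser_xpath = (
    "//div[@class='table-responsive']"
    "/table[@class='table table-striped table-hover dataTable no-footer']"
    "/tbody/tr/th/a/text()"
)

# XPath from the answer: matches the class attribute as it appears in the source.
source_xpath = (
    "//div[@class='table-responsive']"
    "/table[@class='table table-striped table-hover']"
    "/tbody/tr/th/a/text()"
)

# Optional alternative (an assumption, not from the answer): match one class
# token, padded with spaces so 'table-striped' cannot match a longer name.
token_xpath = (
    "//div[@class='table-responsive']"
    "/table[contains(concat(' ', normalize-space(@class), ' '), ' table-striped ')]"
    "/tbody/tr/th/a/text()"
)

from_browser_xpath = doc.xpath(browser_xpath)
from_source_xpath = doc.xpath(source_xpath)
from_token_xpath = doc.xpath(token_xpath)

print(from_browser_xpath)
print(from_source_xpath)
print(from_token_xpath)
```

A quick way to spot this kind of mismatch is to compare the class attribute in the browser's "View page source" output with what the element inspector shows, since the inspector reflects the DOM after scripts have run.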

python

xpath

lxml