This XPath is giving no results. Any reason why?
import requests
from lxml import html
page = requests.get(url="http://www.cia.gov/library/publications/the-world-factbook/geos/ch.html")
tree = html.fromstring(page.content)
bordering = tree.xpath('//*[@id="wfb_data"]/table/tr[4]/td/ul[3]/li[4]/div[17]/span[2]/text()')
print(bordering)
I copied the XPath from Chrome's developer tools, but it still gives me an empty "bordering" variable. I have no idea what could be wrong.
First, you need to use https instead of http:
https://www.cia.gov/library/publications/the-world-factbook/geos/ch.html
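If the URL comes from somewhere else and you want to make sure it always uses https, here is a minimal sketch using only the standard library. The helper name `force_https` is made up for illustration; it is not part of requests or lxml.

```python
from urllib.parse import urlsplit, urlunsplit


def force_https(url):
    """Return url with an http scheme swapped for https; other URLs pass through unchanged."""
    parts = urlsplit(url)
    if parts.scheme == "http":
        # SplitResult is a namedtuple, so _replace returns a modified copy.
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)


url = force_https("http://www.cia.gov/library/publications/the-world-factbook/geos/ch.html")
print(url)
```

The resulting `url` can then be passed to `requests.get` as in the question.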
Also, there is a simpler way to get the border data: find the span whose text starts with "border countries" and take the text of its next sibling:
bordering = tree.xpath('//*[@id="wfb_data"]//span[starts-with(., "border countries")]/following-sibling::span')[0]
print(bordering.text_content())
Prints:
Afghanistan 91 km, Bhutan 477 km, Burma 2,129 km, India 2,659 km, Kazakhstan 1,765 km, North Korea 1,352 km, Kyrgyzstan 1,063 km, Laos 475 km, Mongolia 4,630 km, Nepal 1,389 km, Pakistan 438 km, Russia (northeast) 4,133 km, Russia (northwest) 46 km, Tajikistan 477 km, Vietnam 1,297 km
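To try the sibling lookup without hitting the site, the same XPath can be run against an inline snippet. This is only a sketch: `SAMPLE` is a made-up, heavily simplified stand-in for the real page markup, and `border_countries` is a hypothetical helper name. It also guards against the empty-list case instead of indexing `[0]` directly.

```python
from lxml import html

# Made-up stand-in for the real page; only the span/sibling layout matters here.
SAMPLE = """
<div id="wfb_data">
  <div>
    <span>some other field: </span>
    <span>ignored</span>
  </div>
  <div>
    <span>border countries: </span>
    <span>Mongolia 4,630 km, Nepal 1,389 km</span>
  </div>
</div>
"""


def border_countries(tree):
    """Return the text of the span following the "border countries" label, or None."""
    matches = tree.xpath(
        '//*[@id="wfb_data"]//span[starts-with(., "border countries")]'
        '/following-sibling::span'
    )
    # xpath() returns a list; an empty list means the label was not found.
    return matches[0].text_content().strip() if matches else None


tree = html.fromstring(SAMPLE)
print(border_countries(tree))
```

Anchoring on the label text rather than on positions like `tr[4]` or `div[17]` means the lookup should keep working if rows are added or reordered.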
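Try sending a User-Agent header with the request:
url = "https://www.cia.gov/library/publications/the-world-factbook/geos/ch.html"
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0'}
# Note: verify=False turns off TLS certificate verification.
page = requests.get(url, headers=headers, timeout=5, verify=False)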
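If you want to confirm the header is really attached before making the call, one option is to prepare the request without sending it. This is a sketch, not something from the original post; `build_request` is a hypothetical helper name, and nothing here touches the network.

```python
import requests

URL = "https://www.cia.gov/library/publications/the-world-factbook/geos/ch.html"
HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0) Gecko/20100101 Firefox/24.0"
}


def build_request(url, headers):
    """Build a PreparedRequest so the outgoing headers can be inspected offline."""
    return requests.Request("GET", url, headers=headers).prepare()


prepared = build_request(URL, HEADERS)
print(prepared.url)
print(prepared.headers["User-Agent"])
```

To actually send it, something like `requests.Session().send(prepared, timeout=5)` should work, and calling `raise_for_status()` on the response before parsing makes an error status visible instead of silently yielding an empty XPath result.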
Let me know if that works.
Thanks.