Scrapy returns an empty list when scraping Yahoo Finance
When I run this code against Yahoo Finance it returns an empty list, but it works fine on another site. The XPath itself is not the problem.
import pandas as pd
import requests
from scrapy import Selector
html = requests.get('https://finance.yahoo.com/quote/SM/key-statistics?p=SM').content
sel = Selector(text=html)
# Naming Sheet
ticker = sel.xpath('//*[@id="quote-header-info"]/div[2]/div[1]/div[1]/h1/text()').getall()
print(ticker)
Sending a browser-like User-Agent header with the request returns the data:
import pandas as pd
import requests
#from scrapy import Selector
headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'}
html = requests.get('https://finance.yahoo.com/quote/SM/key-statistics?p=SM', headers = headers).content
sel = pd.read_html(html)
print(sel)
# Naming Sheet
# ticker = sel.xpath('//*[@id="quote-header-info"]/div[2]/div[1]/div[1]/h1/text()').get()
# print(ticker)
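The working version differs from the failing one mainly in the User-Agent header, which suggests the site treats the default client string differently from a browser. Below is a minimal sketch of the same idea, using the standard library's urllib in place of requests so it has no dependencies. The names `BROWSER_UA`, `build_request` and `fetch_html` are hypothetical helpers, not part of any library, and the sketch assumes that a browser-like User-Agent is all the site checks.

```python
import urllib.request

# Assumption: any mainstream browser User-Agent works; this one is copied
# from the answer above.
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
)


def build_request(url, user_agent=BROWSER_UA):
    """Return a Request that identifies itself with a browser-like User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_html(url, timeout=10.0):
    """Download url and return the decoded body.

    urlopen raises urllib.error.HTTPError on a 4xx/5xx status, so a blocked
    request fails loudly instead of yielding a page that parses to nothing.
    """
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")
```

The string returned by `fetch_html(url)` can be passed to `Selector(text=...)` as in the question. Whether the original XPath then matches still depends on the page's current markup.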