Crawl real-time Google Finance prices
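I want to build a small Excel sheet, something like Bloomberg's Launchpad, to monitor real-time stock market prices. Of all the free data sources available, Google Finance is the only one I have found that provides real-time prices for the list of exchanges I need. The problem is that Google has shut down its Finance API. I am looking for a way to programmatically retrieve the actual price I circled in the image below, so that it updates in real time in my Excel sheet.
I have been searching around, but so far to no avail. I read some posts here, such as How does Google Finance update stock prices?, but the approach suggested in the answers retrieves the time-series data behind the chart, not the live-updating price that I need. I have been inspecting the page's network traffic in Chrome's developer tools and have not found any request that returns the real-time price. Any help is greatly appreciated. Some sample code (a language other than VBA is fine) would be very helpful. Thanks, everyone!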
There are many ways to do this: VBA, VB, C#, R, Python, and so on. Below is one way to download statistics from Yahoo Finance.
Sub DownloadData()
    Set ie = CreateObject("InternetExplorer.application")
    With ie
        .Visible = True
        .navigate "https://finance.yahoo.com/quote/AAPL/key-statistics?p=AAPL"
        ' Wait for the page to fully load; you can't do anything if the page is not fully loaded
        Do While .Busy Or .readyState <> 4
            DoEvents
        Loop
        ' Set a reference to the data elements that will be downloaded. We can download either 'td' data elements or 'tr' data elements. This site happens to use 'tr' data elements.
        Set Links = ie.document.getElementsByTagName("tr")
        RowCount = 1
        ' Scrape out the innerText of each 'tr' element.
        With Sheets("DataSheet")
            For Each lnk In Links
                .Range("A" & RowCount) = lnk.innerText
                RowCount = RowCount + 1
            Next
        End With
    End With
    MsgBox ("Done!!")
End Sub
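I will leave it to you to look into other technologies that do the same thing. For example, R and Python can do exactly the same job, although the scripts differ somewhat from the VBA script shown here.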
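As a rough illustration of the same idea in Python, the sketch below collects the text of every 'tr' element from an HTML string using only the standard library. The names `RowTextParser` and `extract_rows` are hypothetical, and the sample table holds made-up values rather than real Yahoo Finance data; fetching the live page is left out on purpose.

```python
from html.parser import HTMLParser


class RowTextParser(HTMLParser):
    """Collects the visible text of every <tr> element, one string per row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows
        self._cells = None  # text fragments of the row being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._cells = []

    def handle_data(self, data):
        # Keep only non-empty text that appears inside a <tr>.
        if self._cells is not None and data.strip():
            self._cells.append(data.strip())

    def handle_endtag(self, tag):
        if tag == "tr" and self._cells is not None:
            self.rows.append(" ".join(self._cells))
            self._cells = None


def extract_rows(html: str) -> list[str]:
    parser = RowTextParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample markup standing in for a downloaded statistics page.
sample_html = """
<table>
  <tr><td>Market Cap</td><td>2.5T</td></tr>
  <tr><td>Trailing P/E</td><td>28.1</td></tr>
</table>
"""
rows = extract_rows(sample_html)
print(rows)
```

Each entry in `rows` plays the role of one `lnk.innerText` value in the VBA loop, so it could be written to one spreadsheet row in the same way.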
Getting this to work in Python is fairly easy. You will need a few libraries:
| Library | Purpose |
|---|---|
| requests | to make a request to Google Finance and return the HTML. |
| bs4 | to process the returned HTML. |
| pandas | to easily save to CSV/Excel. |
Code and full example in the online IDE:
import pandas as pd
import requests
from bs4 import BeautifulSoup


def scrape_google_finance(ticker: str):
    # https://docs.python-requests.org/en/master/user/quickstart/#passing-parameters-in-urls
    params = {
        "hl": "en",  # language
    }

    # https://docs.python-requests.org/en/master/user/quickstart/#custom-headers
    # https://www.whatismybrowser.com/detect/what-is-my-user-agent
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.60 Safari/537.36",
    }

    html = requests.get(f"https://www.google.com/finance/quote/{ticker}", params=params, headers=headers, timeout=30)
    # the "lxml" parser requires the lxml package to be installed
    soup = BeautifulSoup(html.text, "lxml")

    ticker_data = {"right_panel_data": {},
                   "ticker_info": {}}

    # these CSS class names come from the page markup and may change over time
    ticker_data["ticker_info"]["title"] = soup.select_one(".zzDege").text
    ticker_data["ticker_info"]["current_price"] = soup.select_one(".AHmHk .fxKbKc").text

    right_panel_keys = soup.select(".gyFHrc .mfs7Fc")
    right_panel_values = soup.select(".gyFHrc .P6K39c")

    # zip() stops at the shorter list, so an unmatched key or value is skipped instead of raising
    for key, value in zip(right_panel_keys, right_panel_values):
        key_value = key.text.lower().replace(" ", "_")
        ticker_data["right_panel_data"][key_value] = value.text

    return ticker_data


# tickers to iterate over
tickers = ["DIS:NYSE", "TSLA:NASDAQ", "AAPL:NASDAQ", "AMZN:NASDAQ", "NFLX:NASDAQ"]

# temporarily store the data before saving to the file
tickers_prices = []

for ticker in tickers:
    # extract ticker data
    ticker_data = scrape_google_finance(ticker=ticker)

    # append to temporary list
    tickers_prices.append({
        "ticker": ticker_data["ticker_info"]["title"],
        "price": ticker_data["ticker_info"]["current_price"]
    })

# create dataframe and save to csv/excel
df = pd.DataFrame(data=tickers_prices)
# to save to excel use to_excel()
df.to_csv("google_finance_live_stock.csv", index=False)
Output:
ticker,price
Walt Disney Co,7.06
Tesla Inc,",131.21"
Apple Inc,6.99
"Amazon.com, Inc.",",321.61"
Netflix Inc,4.93
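The price column above is scraped text, and values that contain a thousands separator end up quoted in the CSV. For a sheet that does arithmetic on prices, a numeric value is usually more convenient. A minimal sketch of such a conversion follows; `parse_price` is a hypothetical helper, and it assumes US-style formatting (comma for thousands, dot for decimals), which matches the `hl=en` request parameter.

```python
import re


def parse_price(text: str) -> float:
    """Convert a scraped price string such as "$1,131.21" to a float."""
    # Drop everything except digits, the decimal point and a minus sign.
    cleaned = re.sub(r"[^0-9.\-]", "", text)
    if not cleaned:
        raise ValueError(f"no numeric value found in {text!r}")
    return float(cleaned)


print(parse_price("$1,131.21"))  # 1131.21
print(parse_price("176.99"))     # 176.99
```

Applying it before building the DataFrame, for example `"price": parse_price(...)`, would store numbers rather than text in the output file.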
Data returned in ticker_data:
{
  "right_panel_data": {
    "previous_close": "8.61",
    "day_range": "6.66 - 9.20",
    "year_range": "8.38 - 1.67",
    "market_cap": "248.81B USD",
    "volume": "9.98M",
    "p/e_ratio": "81.10",
    "dividend_yield": "-",
    "primary_exchange": "NYSE",
    "ceo": "Bob Chapek",
    "founded": "Oct 16, 1923",
    "headquarters": "Burbank, CaliforniaUnited States",
    "website": "thewaltdisneycompany.com",
    "employees": "166,250"
  },
  "ticker_info": {
    "title": "Walt Disney Co",
    "current_price": "6.66"
  }
}
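Because the returned dictionary is nested, it does not map directly onto a single spreadsheet row. One possible way to flatten it is sketched below; `flatten_ticker_data` and the `panel_` prefix are illustrative choices, not part of the original answer.

```python
def flatten_ticker_data(ticker_data: dict) -> dict:
    """Merge "ticker_info" and "right_panel_data" into one flat row."""
    row = dict(ticker_data.get("ticker_info", {}))
    for key, value in ticker_data.get("right_panel_data", {}).items():
        # Prefix panel keys so they cannot clash with "title" or "current_price".
        row[f"panel_{key}"] = value
    return row


# A trimmed-down example using field names from the structure above.
example = {
    "ticker_info": {"title": "Walt Disney Co"},
    "right_panel_data": {"primary_exchange": "NYSE", "ceo": "Bob Chapek"},
}
flat = flatten_ticker_data(example)
print(flat)
```

A list of such rows can be passed to `pd.DataFrame(data=...)` in the same way as `tickers_prices`, which gives one column per field.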
If you want to scrape more data with a line-by-line explanation, my blog post Scrape Google Finance Ticker Quote Data in Python also covers scraping the time-series chart data.
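Since the question asks for prices that keep updating, the scraper has to be called repeatedly. The sketch below shows one way that could be arranged. `poll_prices` is a hypothetical helper, the interval and number of rounds are arbitrary, and a stand-in fetch function is used so that the example runs without network access. In practice `fetch` could be something like `lambda t: scrape_google_finance(t)["ticker_info"]["current_price"]`.

```python
import time
from typing import Callable, Iterable


def poll_prices(
    fetch: Callable[[str], str],
    tickers: Iterable[str],
    rounds: int,
    interval_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
) -> list[dict]:
    """Call fetch(ticker) for every ticker, `rounds` times, pausing in between."""
    tickers = list(tickers)
    snapshots = []
    for round_number in range(rounds):
        snapshot = {}
        for ticker in tickers:
            try:
                snapshot[ticker] = fetch(ticker)
            except Exception as error:  # one failed ticker should not stop the loop
                snapshot[ticker] = f"error: {error}"
        snapshots.append(snapshot)
        # No pause is needed after the final round.
        if round_number < rounds - 1:
            sleep(interval_seconds)
    return snapshots


# Stand-in fetcher (an assumption for this sketch) instead of a real HTTP request.
def fake_fetch(ticker: str) -> str:
    return f"price-of-{ticker}"


result = poll_prices(fake_fetch, ["DIS:NYSE", "TSLA:NASDAQ"], rounds=2, interval_seconds=0)
print(result)
```

Each snapshot could be written to the CSV or Excel file after every round, so the sheet shows the most recent values.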