Web scraping pages with multiple tabs but one URL
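I have been trying to scrape the earnings data under the "This Week" tab of the TradingView earnings page using the Beautiful Soup and requests libraries, but I can't get the data with the basic methods I know. Unfortunately, the link above opens the "Today" tab by default, and I'm not familiar with navigating between tabs that share the same link.
How would I go about this?
Below is what I tried, but it returns an empty list for t: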
import requests
from bs4 import BeautifulSoup

headers = {
    'User-Agent': 'Chrome/39.0.2171.95'
}
page = requests.get(
    'https://www.tradingview.com/markets/stocks-usa/earnings/', headers=headers
)
soup = BeautifulSoup(page.content, 'html.parser')
results = soup.find(id='js-screener-container')
# t comes back as an empty list
t = soup.find_all('tr', {'class': 'tv-data-table__row tv-data-table__stroke tv-screener-table__result-row'})
The data you see is loaded via JavaScript, so BeautifulSoup doesn't see it. You can simulate this Ajax request with the requests module and then load the data into a pandas DataFrame.
Example:
import requests
import pandas as pd

api_url = "https://scanner.tradingview.com/america/scan"

payload = {
    "filter": [
        {"left": "market_cap_basic", "operation": "nempty"},
        {
            "left": "earnings_release_date,earnings_release_next_date",
            "operation": "in_range",
            "right": [1643000400, 1643605200],  # <-- you probably need to tweak these values
        },
        {
            "left": "earnings_release_date,earnings_release_next_date",
            "operation": "nequal",
            "right": 1643605200,  # <-- and this value too
        },
    ],
    "options": {"lang": "en"},
    "markets": ["america"],
    "symbols": {"query": {"types": []}, "tickers": []},
    "columns": [
        "logoid",
        "name",
        "market_cap_basic",
        "earnings_per_share_forecast_next_fq",
        "earnings_per_share_fq",
        "eps_surprise_fq",
        "eps_surprise_percent_fq",
        "revenue_forecast_next_fq",
        "revenue_fq",
        "earnings_release_next_date",
        "earnings_release_next_calendar_date",
        "earnings_release_next_time",
        "description",
        "type",
        "subtype",
        "update_mode",
        "earnings_per_share_forecast_fq",
        "revenue_forecast_fq",
        "earnings_release_date",
        "earnings_release_calendar_date",
        "earnings_release_time",
        "currency",
        "fundamental_currency_code",
    ],
    "sort": {"sortBy": "market_cap_basic", "sortOrder": "desc"},
    "range": [0, 150],
}

# POST the same JSON body the page sends and parse the JSON response
result = requests.post(api_url, json=payload).json()

# Each row's "d" list is labeled with the column names requested in the payload
df = pd.DataFrame(
    [r["d"] for r in result["data"]],
    dtype="str",
    columns=payload["columns"],
)
print(df)
Prints:
logoid name market_cap_basic earnings_per_share_forecast_next_fq earnings_per_share_fq eps_surprise_fq eps_surprise_percent_fq revenue_forecast_next_fq revenue_fq earnings_release_next_date earnings_release_next_calendar_date earnings_release_next_time description type subtype update_mode earnings_per_share_forecast_fq revenue_forecast_fq earnings_release_date earnings_release_calendar_date earnings_release_time currency fundamental_currency_code
0 apple AAPL 2779690484608.0 1.420761 2.1 0.200885 10.57782177 94209101391.0 123945000000.0 1651579200 1648684800 0 Apple Inc. stock common delayed_streaming_900 1.899115 119002502550.0 1643319000 1640908800 1 USD USD
1 microsoft MSFT 2310984201913.0 2.192646 2.48 0.161037 6.94435401 48971705236.0 51728000000.0 1651147200 1648684800 0 Microsoft Corporation stock common delayed_streaming_900 2.318963 50710806541.0 1643144880 1640908800 1 USD USD
2 tesla TSLA 849959567314.9999 2.238503 2.54 0.177177 7.49853036 18134248643.0 17719000000.0 1651665600 1648684800 0 Tesla, Inc. stock common delayed_streaming_900 2.362823 17131882446.0 1643231460 1640908800 1 USD USD
3 visa V 479940487395.0001 1.661801 1.81 0.107021 6.28434056 6868400149.0 7059000000.0 1651060800 1648684800 0 Visa Inc. stock common delayed_streaming_900 1.702979 6792172282.0 1643320980 1640908800 1 USD USD
4 johnson-and-johnson JNJ 452253807870.0 2.556371 2.13 0.013477 0.63675188 23810389493.0 24804000000.0 1650369600 1648684800 0 Johnson & Johnson stock common delayed_streaming_900 2.116523 25275614293.0 1643110380 1640908800 -1 USD USD
...and so on.
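As for tweaking the timestamps: the two values in the in_range filter above decode to Monday 2022-01-24 00:00 and Monday 2022-01-31 00:00 at UTC-5, exactly seven days apart. A minimal sketch for computing the same kind of range for any week follows. The helper name week_bounds and the fixed UTC-5 offset are my assumptions, not something the site documents; the page may use your browser's local timezone instead, and a fixed offset ignores daylight saving time.

```python
from datetime import date, datetime, time, timedelta, timezone


def week_bounds(day, utc_offset_hours=-5):
    """Return (start, end) epoch seconds for the Monday-to-Monday week containing `day`.

    Hypothetical helper: the UTC-5 default is an assumption inferred from
    the sample timestamps, not a documented TradingView setting.
    """
    tz = timezone(timedelta(hours=utc_offset_hours))
    # date.weekday() is 0 for Monday, so this steps back to the week's Monday
    monday = day - timedelta(days=day.weekday())
    start = datetime.combine(monday, time.min, tzinfo=tz)
    end = start + timedelta(days=7)
    return int(start.timestamp()), int(end.timestamp())


start, end = week_bounds(date(2022, 1, 26))
print(start, end)  # 1643000400 1643605200
```

With the payload from the example, you would then set payload["filter"][1]["right"] = [start, end] and payload["filter"][2]["right"] = end before sending the request. Use week_bounds(date.today()) for the current week.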