Getting table content using BeautifulSoup
I am trying to retrieve the table content from this website using the following Python code: https://whalewisdom.com/filer/hillhouse-capital-advisors-ltd#tabholdings_tab_link
stat_table = soup.find_all('table', id_ = 'current_holdings_table', class_ = "table table-bordered table-striped table-hover")
However, len(stat_table) returns zero, which means nothing was retrieved from the site. Does anyone know where I went wrong? Thanks for your help.
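A side note on the call above, separate from how the page loads its data: BeautifulSoup's find_all only special-cases class_ (because class is a reserved word in Python), while id is passed as plain id. Written as id_, the filter looks for an attribute literally named id_ and matches nothing. The sketch below uses made-up markup, not the site's real HTML, to show the difference.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; not copied from the real page.
html = """
<div id="holdings">
  <table id="current_holdings_table" class="table table-bordered"></table>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# id_ is treated as a filter on an attribute named "id_", which no tag has.
wrong = soup.find_all("table", id_="current_holdings_table")

# id is not a Python keyword, so it needs no trailing underscore.
right = soup.find_all("table", id="current_holdings_table")

print(len(wrong), len(right))
```

Even with the keyword corrected, a table that is filled in by JavaScript after the page loads would still come back without its rows.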
The data you see is loaded via JavaScript from a different URL. To load the data, you can use this example:
import json
import requests
url = 'https://whalewisdom.com/filer/holdings?id=hillhouse-capital-advisors-ltd&q1=-1&type_filter=1,2,3,4&symbol=&change_filter=&minimum_ranking=&minimum_shares=&is_etf=0&sc=true&sort=current_mv&order=desc&offset=0&limit=25'
data = json.loads(requests.get(url).text)
# uncomment this to print all data:
# print(json.dumps(data, indent=4))
for row in data['rows']:
    print('{:<5} {:<50} {:<15} {:<15}'.format(row['symbol'], row['name'], row['current_shares'], row['current_mv']))
Prints:
BGNE BeiGene Ltd ADR 147035258.0 28823321625.74
ZM Zoom Video Communications Inc 6856980.0 1738519000.0
IQ iQIYI Inc 46694629.0 1082848000.0
BABA Alibaba Group Holding Ltd ADR 3930086.0 847720000.0
PDD Pinduoduo Inc 9863866.0 846714000.0
UBER Uber Technologies Inc 19260700.0 598623000.0
TAL TAL Education Group American Depositary ADR 7906041.0 540615000.0
JD JD.com Inc ADR 7810402.0 470030000.0
BILI Bilibili Inc 9102063.0 421608000.0
CBPO China Biologic Products Holdings Inc 2962076.0 302665000.0
ESGR Enstar Group Ltd 1747840.0 267018000.0
ALGN Align Technology Inc 790365.0 216908000.0
APLS Apellis Pharmaceuticals Inc 5028289.0 164224000.0
FGEN FibroGen Inc 3955787.0 160328000.0
BBIO BridgeBio Pharma Inc 4711604.0 153645000.0
TSLA Tesla Inc 130378.0 140783000.0
CRM Salesforce.com Inc. 709495.0 132910000.0
ZTO ZTO Express Cayman Inc ADR 3433592.0 126047000.0
MDLZ Mondelez International Inc. (Kraft Foods) 2431164.0 124305000.0
VIE Viela Bio, Inc. 2815868.0 121983000.0
VIPS Vipshop Holdings Ltd ADR 5477392.0 109055000.0
BPMC Blueprint Medicines Corp 1364631.0 106441000.0
ARGX Argenx SE ADS ADR 470000.0 105858000.0
GOSS Gossamer Bio Inc 7420974.0 96473000.0
BEAM Beam Therapeutics Inc. 2966403.0 83059000.0
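The URL in the example ends with offset=0&limit=25, and the output has 25 rows, which suggests the endpoint returns results in pages. The following is a minimal sketch under that assumption; I have not verified how the endpoint actually pages. It rebuilds the same query string from a dict so individual parameters are easier to change, and walks through pages until a short one comes back. The helper names build_holdings_url, iter_rows and fetch_with_requests are made up for this illustration.

```python
from urllib.parse import urlencode

BASE_URL = "https://whalewisdom.com/filer/holdings"


def build_holdings_url(filer_id, offset=0, limit=25):
    """Rebuild the query string from the example, one parameter per entry."""
    params = {
        "id": filer_id,
        "q1": -1,
        "type_filter": "1,2,3,4",
        "symbol": "",
        "change_filter": "",
        "minimum_ranking": "",
        "minimum_shares": "",
        "is_etf": 0,
        "sc": "true",
        "sort": "current_mv",
        "order": "desc",
        "offset": offset,
        "limit": limit,
    }
    # safe="," keeps the commas in type_filter unescaped, as in the original URL.
    return BASE_URL + "?" + urlencode(params, safe=",")


def iter_rows(fetch_page, limit=25):
    """Yield rows page by page until a short or empty page comes back.

    fetch_page(offset, limit) must return the decoded JSON as a dict.
    ASSUMPTION: offset/limit page through the results and each response
    carries its rows under the 'rows' key, as in the example above.
    """
    offset = 0
    while True:
        rows = fetch_page(offset, limit).get("rows", [])
        yield from rows
        if len(rows) < limit:
            break
        offset += limit


def fetch_with_requests(offset, limit):
    """Fetch one page over HTTP; only this function needs the network."""
    import requests  # imported here so the helpers above work without it

    url = build_holdings_url("hillhouse-capital-advisors-ltd", offset, limit)
    response = requests.get(url, timeout=30)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()
```

To try it against the site, loop over iter_rows(fetch_with_requests) and print the fields you need, for example row['symbol'] and row['current_mv']. Because the fetch function is passed in, the paging loop can also be exercised with a fake function that returns canned dicts.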