Download with export button through Python

Question

I'm interested in downloading financial statements from the Morningstar website. Here is an example page:

http://financials.morningstar.com/cash-flow/cf.html?t=PIRC&region=ita&culture=en-US

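In the top right corner there is an export-to-CSV button, and I would like to click it with Python. Inspecting the element, I get this HTML tag: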

<div class="exportButton">
    <span class="icon_1_span">
       <a href="javascript:SRT_stocFund.Export()" class="rf_export">
       </a>

My idea was to use bs4 (BeautifulSoup) to parse the page (I'm not at all sure parsing is needed) and find the button in order to click it. Something like:

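from urllib.request import urlopen
from bs4 import BeautifulSoup

quote_page = "http://financials.morningstar.com/cash-flow/cf.html?t=PIRC&region=ita&culture=en-US"
page = urlopen(quote_page)
soup = BeautifulSoup(page, "html.parser")
# Look up the export link by its href and class
bs = soup.find(href="javascript:SRT_stocFund.Export()", attrs={"class": "rf_export"})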

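Obviously, this returns nothing. Do you have any suggestions on how I can tell Python to export the data in the table? That is, automate downloading the CSV file instead of going to the web page and doing it myself.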

Thank you very much!!

Answer 1

I would use Selenium WebDriver in "headless" mode. Give Selenium a try; it is easy to understand and use. :)
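A minimal sketch of what this could look like, assuming Chrome and the `selenium` package are installed. The CSS selector `a.rf_export` comes from the tag quoted in the question; the helper names (`chrome_prefs`, `wait_for_csv`, `export_csv`), the `--headless=new` flag, the timeouts, and the download preferences are my own assumptions and may need adjusting for your browser version or for the current state of the page.

```python
import time
from pathlib import Path

PAGE_URL = "http://financials.morningstar.com/cash-flow/cf.html?t=PIRC&region=ita&culture=en-US"


def chrome_prefs(download_dir):
    """Chrome preferences that save downloads to download_dir without a prompt."""
    return {
        "download.default_directory": str(Path(download_dir).resolve()),
        "download.prompt_for_download": False,
    }


def wait_for_csv(download_dir, timeout=30.0):
    """Poll download_dir until a .csv file shows up; return its path or None.

    Assumes the directory holds no other .csv files when the download starts.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        found = sorted(Path(download_dir).glob("*.csv"))
        if found:
            return found[0]
        time.sleep(0.5)
    return None


def export_csv(page_url, download_dir):
    """Open page_url in headless Chrome, click the export link, return the CSV path."""
    # Third-party imports are kept local so the helpers above work without Selenium.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    options = webdriver.ChromeOptions()
    options.add_argument("--headless=new")  # assumption: a recent Chrome build
    options.add_experimental_option("prefs", chrome_prefs(download_dir))
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(page_url)
        # Selector taken from the <a class="rf_export"> tag quoted in the question.
        link = WebDriverWait(driver, 20).until(
            EC.element_to_be_clickable((By.CSS_SELECTOR, "a.rf_export"))
        )
        link.click()
        return wait_for_csv(download_dir)
    finally:
        driver.quit()
```

Calling `export_csv(PAGE_URL, "downloads")` should then leave the exported file in the `downloads` folder, provided the page still exposes the same export link.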

Answer 2

Using a Google Chrome "http trace" extension, you can see that the export button is just a link:

Export

You can fetch it with the requests library.

Example

I think this is the simplest way (and I think that if you modify the URL parameters, you can generate the Excel file you need).
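As a rough sketch of that idea: once you have copied the export request URL from the network trace, you can change its query parameters and fetch the result with `requests`. The export URL itself is not reproduced here, so it must be supplied by you; the helper names `with_params` and `download`, the `User-Agent` header, and the timeout are illustrative assumptions, not something the site documents.

```python
from pathlib import Path
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_params(url, **overrides):
    """Return url with the given query parameters added or replaced."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.update(overrides)
    return urlunsplit(parts._replace(query=urlencode(query)))


def download(url, dest):
    """Fetch url with requests and write the response body to dest."""
    # Third-party import kept local so with_params works without requests installed.
    import requests

    # Assumption: some sites reject the default requests User-Agent.
    response = requests.get(url, headers={"User-Agent": "Mozilla/5.0"}, timeout=30)
    response.raise_for_status()
    dest = Path(dest)
    dest.write_bytes(response.content)
    return dest
```

For instance, if `export_url` holds the URL captured from the trace, `download(with_params(export_url, t="PIRC"), "cash_flow.csv")` would request the export for ticker PIRC, assuming the export endpoint accepts a `t` parameter the way the page URL in the question does.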

Regards!!!


html

python

automation

export-to-csv