How to save each iteration of a loop to a single .csv file using pandas
I am using the code below to scrape the latest daily price for a number of funds:
import requests
import pandas as pd

urls = ['https://markets.ft.com/data/funds/tearsheet/historical?s=LU0526609390:EUR',
        'https://markets.ft.com/data/funds/tearsheet/historical?s=IE00BHBX0Z19:EUR',
        'https://markets.ft.com/data/funds/tearsheet/historical?s=LU1076093779:EUR']

def format_date(date):
    date = date.split(',')[-2][1:] + date.split(',')[-1]
    return pd.Series({'Date': date})

for url in urls:
    ISIN = url.split('=')[-1].replace(':', '_')
    ISIN = ISIN[:-4]
    ISIN = ISIN + ".OTHER"
    html = requests.get(url).content
    df_list = pd.read_html(html)
    df = df_list[-1]
    df['Date'] = df['Date'].apply(format_date)
    del df['Open']
    del df['High']
    del df['Low']
    del df['Volume']
    df = df.rename(columns={'Close': 'last_traded_price'})
    df = df.rename(columns={'Date': 'last_traded_on'})
    df.insert(2, "id", ISIN)
    df = df.head(1)
    print(df)
    df.to_csv(r'/Users/.../Testdata.csv', index=False)
At the moment, Testdata.csv is overwritten on every iteration of the loop. I would like to save all of the data to one .csv file in the following format:
Col 1           Col 2              Col 3
last_traded_on  last_traded_price  id
Oct 07 2021     78.83              LU0526609390.OTHER
Oct 07 2021     11.1               IE00BHBX0Z19.OTHER
Oct 07 2021     155.56             LU1076093779.OTHER
I need some way to write the data to the .csv file outside the loop, but I am really struggling to work out how to do it.
Thanks
Use a file handle, opened once before the loop:
with open(r'/Users/.../Testdata.csv', 'w', newline='') as csvfile:
    # Write the header row once, before the loop:
    # csvfile.write("header1,header2,header3\n")
    for url in urls:
        ISIN = url.split('=')[-1].replace(':', '_')
        ...  # The rest of your code
        df.to_csv(csvfile, index=False, header=False)
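To make the file-handle idea concrete without hitting the network, here is a minimal sketch. The `write_rows` helper is hypothetical, the three one-row frames are hard-coded from the desired output in the question, and `io.StringIO` stands in for the real file opened with `open(..., 'w', newline='')`. Instead of writing the header by hand, it lets pandas emit it on the first frame only.

```python
import io

import pandas as pd

# Stand-ins for the one-row DataFrames built inside the scraping loop
# (values copied from the desired output in the question).
frames = [
    pd.DataFrame({"last_traded_on": ["Oct 07 2021"],
                  "last_traded_price": [78.83],
                  "id": ["LU0526609390.OTHER"]}),
    pd.DataFrame({"last_traded_on": ["Oct 07 2021"],
                  "last_traded_price": [11.1],
                  "id": ["IE00BHBX0Z19.OTHER"]}),
    pd.DataFrame({"last_traded_on": ["Oct 07 2021"],
                  "last_traded_price": [155.56],
                  "id": ["LU1076093779.OTHER"]}),
]


def write_rows(frames, handle):
    """Write every frame to one open handle; only the first writes a header."""
    for i, frame in enumerate(frames):
        frame.to_csv(handle, index=False, header=(i == 0))


# StringIO is used here only so the sketch runs anywhere; in the real
# script this would be the file object from the `with open(...)` block.
buffer = io.StringIO()
write_rows(frames, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Because the handle stays open for the whole loop, each `to_csv` call continues where the previous one stopped instead of truncating the file.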
Or, better, collect each DataFrame in a list and use pd.concat to combine them all and save the result to one file:
dfs = []
for url in urls:
    ISIN = url.split('=')[-1].replace(':', '_')
    ...  # The rest of your code
    dfs.append(df)

pd.concat(dfs).to_csv(r'/Users/.../Testdata.csv', index=False)
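A runnable sketch of the collect-then-concat pattern follows. It makes a few assumptions: `latest_row` is a hypothetical helper that condenses the per-URL clean-up into one function, and the `tables` dictionary holds made-up frames standing in for `pd.read_html(html)[-1]` (the `Open` values are invented, and the `Close` values come from the question's desired output). The result is rendered to a string here; passing a path to `to_csv` would write the file instead.

```python
import pandas as pd


def latest_row(table: pd.DataFrame, isin: str) -> pd.DataFrame:
    """Reduce one price table to a single row with the three target columns."""
    row = table.head(1)[["Date", "Close"]].rename(
        columns={"Date": "last_traded_on", "Close": "last_traded_price"}
    )
    return row.assign(id=isin)


# Illustrative stand-ins for the tables scraped from each URL.
tables = {
    "LU0526609390.OTHER": pd.DataFrame(
        {"Date": ["Oct 07 2021"], "Open": [78.0], "Close": [78.83]}),
    "IE00BHBX0Z19.OTHER": pd.DataFrame(
        {"Date": ["Oct 07 2021"], "Open": [11.0], "Close": [11.1]}),
    "LU1076093779.OTHER": pd.DataFrame(
        {"Date": ["Oct 07 2021"], "Open": [155.0], "Close": [155.56]}),
}

# Collect one row per fund, then combine and serialise once, after the loop.
dfs = [latest_row(table, isin) for isin, table in tables.items()]
combined = pd.concat(dfs, ignore_index=True)
csv_text = combined.to_csv(index=False)
print(csv_text)
```

Writing once at the end keeps the header handling inside pandas and leaves the combined frame available for further checks before anything touches the disk.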
Note: the output you show looks like the output of df.to_string() rather than df.to_csv.
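The difference is easy to see on a single row. This is only an illustration, using one row taken from the question's desired output: `to_string` pads columns with spaces for reading on screen, while `to_csv` separates fields with commas.

```python
import pandas as pd

df = pd.DataFrame({"last_traded_on": ["Oct 07 2021"],
                   "last_traded_price": [78.83],
                   "id": ["LU0526609390.OTHER"]})

as_text = df.to_string(index=False)  # space-aligned, meant for display
as_csv = df.to_csv(index=False)      # comma-separated, meant for files

print(as_text)
print(as_csv)
```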