Adding a pandas.DataFrame to an Existing Excel File
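I have a web scraper that creates an Excel file for the current month's scrapes. Each time it runs, I want to add that day's scrape to the file as a new sheet, so the file collects every scrape for the month. My problem is that the new sheet overwrites the existing sheets instead of being added as a separate sheet. I have tried doing this with xlrd, xlwt, pandas, and openpyxl.
I'm still brand new to Python, so simplicity is greatly appreciated!
Below is just the code that handles writing the Excel file.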
import datetime
import time
import pandas as pd
# My relevant time variables
ts = time.time()
date_time = datetime.datetime.fromtimestamp(ts).strftime('%y-%m-%d %H_%M_%S')
HourMinuteSecond = datetime.datetime.fromtimestamp(ts).strftime('%H_%M_%S')
month = datetime.datetime.now().strftime('%m-%y')
# Creates a writer for this month and year
# (raw string so the backslashes in the Windows path are not read as escapes)
writer = pd.ExcelWriter(
    r'C:\Users\G\Desktop\KickstarterLinks(%s).xlsx' % (month),
    engine='xlsxwriter')
# Creates dataframe from my data, d
df = pd.DataFrame(d)
# Writes to the excel file
df.to_excel(writer, sheet_name='%s' % (HourMinuteSecond))
writer.save()
Update:
This functionality was added in pandas 0.24.0:
ExcelWriter now accepts mode as a keyword argument, enabling append to existing workbooks when using the openpyxl engine (GH3441)
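The append mode quoted above can be sketched roughly as follows, assuming pandas 0.24.0 or later with openpyxl installed. The helper name append_sheet, the temporary file, and the sheet names are hypothetical and exist only for illustration.

```python
import os
import tempfile

import pandas as pd
from openpyxl import load_workbook


def append_sheet(path, sheet_name, dataframe):
    """Write dataframe to a new sheet, keeping any sheets already in path.

    Hypothetical helper for illustration; it is not part of pandas.
    """
    # Append mode expects the workbook to exist already, so the first
    # write for a given file falls back to plain write mode.
    mode = 'a' if os.path.exists(path) else 'w'
    with pd.ExcelWriter(path, engine='openpyxl', mode=mode) as writer:
        dataframe.to_excel(writer, sheet_name=sheet_name)


# Usage: two calls on the same file leave two sheets behind
demo_path = os.path.join(tempfile.mkdtemp(), 'demo.xlsx')
demo_df = pd.DataFrame({'Close': [2128.00, 2128.75]})
append_sheet(demo_path, 'first', demo_df)
append_sheet(demo_path, 'second', demo_df)
demo_sheets = load_workbook(demo_path).sheetnames
print(demo_sheets)
```

Using the writer as a context manager saves and closes the workbook on exit, so no explicit save call is needed in this sketch.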
Earlier versions:
Pandas has an open feature request for this.
In the meantime, here is a function that adds a pandas.DataFrame to an existing workbook:
Code:
def add_frame_to_workbook(filename, tabname, dataframe, timestamp):
    """
    Save a dataframe to a workbook tab with the filename and tabname
    coded to timestamp

    :param filename: filename to create, can use strftime formatting
    :param tabname: tabname to create, can use strftime formatting
    :param dataframe: dataframe to save to workbook
    :param timestamp: timestamp associated with dataframe
    :return: None
    """
    filename = timestamp.strftime(filename)
    sheet_name = timestamp.strftime(tabname)
    # create a writer for this month and year
    writer = pd.ExcelWriter(filename, engine='openpyxl')
    try:
        # try to open an existing workbook
        writer.book = load_workbook(filename)
        # copy existing sheets
        writer.sheets = dict(
            (ws.title, ws) for ws in writer.book.worksheets)
    except IOError:
        # file does not exist yet, we will create it
        pass
    # write out the new sheet
    dataframe.to_excel(writer, sheet_name=sheet_name)
    # save the workbook
    writer.save()
Test code:
import datetime as dt
import pandas as pd
from openpyxl import load_workbook
data = [x.strip().split() for x in """
Date Close
2016-10-18T13:44:59 2128.00
2016-10-18T13:59:59 2128.75
""".split('\n')[1:-1]]
df = pd.DataFrame(data=data[1:], columns=data[0])
name_template = './sample-%m-%y.xlsx'
tab_template = '%d_%H_%M'
now = dt.datetime.now()
in_an_hour = now + dt.timedelta(hours=1)
add_frame_to_workbook(name_template, tab_template, df, now)
add_frame_to_workbook(name_template, tab_template, df, in_an_hour)
(Source)
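To confirm that the two calls in the test code each added a sheet rather than overwriting one, the workbook's sheet names can be read back. This is a minimal sketch using openpyxl; the helper name list_sheet_names and the demo workbook are hypothetical stand-ins for the sample file the test code writes.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook


def list_sheet_names(path):
    """Return the sheet titles of the workbook at path, in order."""
    workbook = load_workbook(path, read_only=True)
    try:
        return workbook.sheetnames
    finally:
        # read-only workbooks keep the file open until closed
        workbook.close()


# Stand-in workbook with two sheets named like the '%d_%H_%M' tab template
check_path = os.path.join(tempfile.mkdtemp(), 'check.xlsx')
wb = Workbook()
wb.active.title = '18_13_44'
wb.create_sheet('18_14_44')
wb.save(check_path)

sheet_names = list_sheet_names(check_path)
print(sheet_names)
```

Pointing list_sheet_names at the real monthly file after a few scraper runs should show one entry per run if the sheets are being appended.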