Write DataFrame rows to an Excel sheet using Pandas
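How do I save the rows returned from a DataFrame to an Excel sheet?
Background: I'm working with a large txt file (1.7 million rows) containing Canadian postal codes. I created a DataFrame and extracted the values I need into it. One of its columns is the province ID (df['PID']). I built a list of the unique values found in that PID column and successfully created 13 sheets in a new Excel spreadsheet, each named after a unique PID.
Problem: each sheet contains only the headers, not the row values.
I'm having trouble writing the matching rows into each sheet. Here is my code: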
import pandas as pd

# parse text file into dataframe
path = 'the_file.txt'
df = pd.read_csv(path, sep='\t', header=None, names=['ORIG', 'PID', 'PCODE'], encoding='iso-8859-1')

# extract characters to fill values
df['ORIG'] = df['ORIG']
df['PID'] = df['ORIG'].str[11:13].astype(int)
df['PCODE'] = df['ORIG'].str[:6]

# create list of unique province ID's
prov_ids = df['PID'].unique().tolist()
prov_ids_string = map(str, prov_ids)

# create new excel file
writer = pd.ExcelWriter('CanData.xlsx', engine='xlsxwriter')
for id in prov_ids_string:
    mydf = df.loc[df.PID==id]
    # NEED TO WRITE VALUES FROM ROW INTO SHEET HERE*
    mydf.to_excel(writer, sheet_name=id)
writer.save()
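I know where the write should happen, but I'm not getting the right result. How can I write only the rows with a matching PID into their respective sheets?
Thanks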
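The following should work. The PID column holds integers, but your loop compares it against strings, so the filter never matches and only the headers get written. Compare against the integer IDs and convert to a string only for the sheet name: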
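import pandas as pd

# parse text file into dataframe (same call as in the question)
path = 'the_file.txt'
df = pd.read_csv(path, sep='\t', header=None, names=['ORIG', 'PID', 'PCODE'], encoding='iso-8859-1')

# extract characters to fill values
df['PID'] = df['ORIG'].str[11:13].astype(int)
df['PCODE'] = df['ORIG'].str[:6]

# create list of unique province ID's, kept as integers so they match df.PID
prov_ids = df['PID'].unique().tolist()

# create new excel file (the xlsxwriter package must be installed)
writer = pd.ExcelWriter('./CanData.xlsx', engine='xlsxwriter')
for idx in prov_ids:
    # integer-to-integer comparison selects the rows for this province
    mydf = df.loc[df.PID == idx]
    # sheet names must be strings, so convert only here
    mydf.to_excel(writer, sheet_name=str(idx))
# close() saves the file; writer.save() was removed in pandas 2.0
writer.close()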
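The same split can also be expressed with groupby, which avoids building the list of unique IDs by hand. This is only a sketch of an alternative, not part of the original answer: the helper names split_by_pid and write_sheets are made up for illustration, and write_sheets assumes an Excel engine such as xlsxwriter or openpyxl is installed.

```python
import pandas as pd


def split_by_pid(df: pd.DataFrame) -> dict:
    """Map each sheet name (the PID as a string) to the rows with that PID."""
    # groupby yields (key, sub-frame) pairs; the key keeps the column's integer type.
    return {str(pid): group for pid, group in df.groupby('PID')}


def write_sheets(df: pd.DataFrame, path: str) -> None:
    """Write one sheet per PID. Hypothetical helper; needs an Excel engine installed."""
    # The context manager saves and closes the workbook on exit.
    with pd.ExcelWriter(path) as writer:
        for sheet_name, group in split_by_pid(df).items():
            group.to_excel(writer, sheet_name=sheet_name, index=False)


# Sample frame in the same shape as the data used in this thread.
sample = pd.DataFrame({'ORIG': ['aaaaaa111111111111111111111',
                                'bbbbbb2222222222222222222222']})
sample['PID'] = sample['ORIG'].str[11:13].astype(int)
sample['PCODE'] = sample['ORIG'].str[:6]

sheets = split_by_pid(sample)
print(sorted(sheets))
```

Calling write_sheets(sample, 'CanData.xlsx') would then produce one sheet per key in that dictionary.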
For example, with this data:
df = pd.DataFrame()
df['ORIG'] = ['aaaaaa111111111111111111111',
'bbbbbb2222222222222222222222']
df['ORIG'] = df['ORIG']
df['PID'] = df['ORIG'].str[11:13].astype(int)
df['PCODE'] = df['ORIG'].str[:6]
print(df)
In my sheet named 11, I have:
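To see which rows end up in that sheet without opening Excel, the filter can be run directly on the sample data. A minimal sketch, rebuilding the sample frame from this thread; the variable names rows_for_11 and rows_for_str are illustrative only. It also shows why comparing the integer column against the string '11' selects nothing:

```python
import pandas as pd

df = pd.DataFrame({'ORIG': ['aaaaaa111111111111111111111',
                            'bbbbbb2222222222222222222222']})
df['PID'] = df['ORIG'].str[11:13].astype(int)
df['PCODE'] = df['ORIG'].str[:6]

# Integer comparison: selects the row whose PID is 11.
rows_for_11 = df.loc[df['PID'] == 11]

# String comparison against an integer column: no element compares equal,
# so the selection is empty and only the headers would be written.
rows_for_str = df.loc[df['PID'] == '11']

print(rows_for_11)
print(len(rows_for_11), len(rows_for_str))
```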