How do I loop through each source file and copy a specific column into a new workbook with each new "paste" shifting to the adjacent column?
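I have 3 Excel files, each a workbook containing 1 sheet. I would like to copy data from the source cells into a new workbook, but the data must shift into a new column each time.
For example:
- the source cells in File 1 must be copied to cells A1 to A10 in the new workbook;
- the source cells in File 2 must be copied to cells B1 to B10 in the new workbook; and
- the source cells in File 3 must be copied to cells C1 to C10 in the new workbook.
I'm struggling to figure out the best way to adjust "j" in my code on each iteration. I'm also not sure what the cleanest way is to run each function for the different source files.
Any advice on how to make this code cleaner would also be appreciated, because I admit it is quite messy right now!
Thanks in advance!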
import openpyxl as xl

# Raw strings keep the backslashes literal ("\f" and "\n" would otherwise be escapes).
filename_1 = r"C:\workspace\scripts\file1.xlsx"
filename_2 = r"C:\workspace\scripts\file2.xlsx"
filename_3 = r"C:\workspace\scripts\file3.xlsx"
destination_filename = r"C:\workspace\scripts\new_file.xlsx"
num_rows = 10
num_columns = 1

def open_source_workbook(path):
    '''Open the workbook and worksheet in the source Excel file'''
    workbook = xl.load_workbook(path)
    worksheet = workbook.worksheets[0]
    return worksheet

def open_destination_workbook(path):
    '''Open the destination workbook I want to copy the data to.'''
    new_workbook = xl.load_workbook(path)
    return new_workbook

def open_destination_worksheet(workbook):
    '''Open the worksheet of the destination workbook I want to copy the data to.'''
    # Use the workbook passed in rather than relying on a global variable.
    new_worksheet = workbook.active
    return new_worksheet

def copy_to_new_file(worksheet, new_worksheet):
    for i in range(1, num_rows + 1):
        for j in range(1, num_columns + 1):
            c = worksheet.cell(row=i, column=j)
            new_worksheet.cell(row=i, column=j).value = c.value

worksheet = open_source_workbook(filename_1)
new_workbook = open_destination_workbook(destination_filename)
new_worksheet = open_destination_worksheet(new_workbook)
copy_to_new_file(worksheet, new_worksheet)
new_workbook.save(str(destination_filename))
Question: Loop files, copy a specific column, with each new “paste” shifting to the adjacent column?
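This approach first aggregates the column cell values from all files.
It then rearranges them so that they can be passed to the openpyxl worksheet append(...) method.
Therefore, no knowledge of the target column is needed.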
Reference:
class collections.OrderedDict([items])
Ordered dictionaries are just like regular dictionaries but have some extra capabilities relating to ordering operations.
openpyxl.utils.cell.coordinate_to_tuple(coordinate)
Convert an Excel style coordinate to (row, column) tuple
iter_rows(min_row=None, max_row=None, min_col=None, max_col=None, values_only=False)
Produces cells from the worksheet, by row. Specify the iteration range using indices of rows and columns.
map(function, iterable, ...)
Return an iterator that applies function to every item of iterable, yielding the results.
zip(*iterables)
Make an iterator that aggregates elements from each of the iterables.
Used imports
import openpyxl as opxl
from collections import OrderedDict
Define the files in an OrderedDict to retain the file <=> column order
file = OrderedDict.fromkeys(('file1', 'file2', 'file3'))
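As a small illustration of why an ordered mapping is used at this step, the sketch below reuses the same placeholder names ('file1' to 'file3') and shows that OrderedDict.fromkeys keeps the keys in the order given and initializes every value to None. The variable names are only illustrative. On Python 3.7 and later a plain dict also preserves insertion order, so dict.fromkeys would probably serve equally well.

```python
from collections import OrderedDict

# Placeholder names, as in the answer; real code would use actual paths.
names = ('file1', 'file2', 'file3')

files = OrderedDict.fromkeys(names)

# Keys come back in the order they were given, so the first file maps to
# the first output column, the second file to the second column, and so on.
print(list(files.keys()))    # ['file1', 'file2', 'file3']
print(list(files.values()))  # [None, None, None]
```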
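Define the range as index values.
Convert the Excel A1 notation to index values
# range_to_tuple returns (sheet name, (min_col, min_row, max_col, max_row))
min_col, min_row, max_col, max_row = opxl.utils.cell.range_to_tuple('DUMMY!A1:A10')[1]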
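To make the return value of range_to_tuple easier to see, here is a minimal sketch. The sheet name 'DUMMY' is only a placeholder that the function requires, and the second range (B2:D5) is an assumed example chosen so that the order of the bounds is visible.

```python
from openpyxl.utils.cell import range_to_tuple

# The string must have the form "sheetname!range"; 'DUMMY' is a placeholder.
sheet_name, bounds = range_to_tuple('DUMMY!A1:A10')
print(sheet_name)  # DUMMY
print(bounds)      # (1, 1, 1, 10)

# The bounds are ordered (min_col, min_row, max_col, max_row).
min_col, min_row, max_col, max_row = bounds

# A range whose columns and rows differ makes the order easier to check.
other_bounds = range_to_tuple('DUMMY!B2:D5')[1]
print(other_bounds)  # (2, 2, 4, 5)
```

For a single-column range starting in row 1, such as A1:A10, swapping the two middle values happens to give the same numbers, so a wrong unpacking order would only show up with other ranges.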
Loop over the defined files,
load each workbook and get a reference to the default worksheet,
then get the cell values from the defined range:
min_col=1, max_col=1, min_row=1, max_row=10
for fname in file.keys():
    wb = opxl.load_workbook(fname)
    ws = wb.active
    file[fname] = ws.iter_rows(min_row=min_row,
                               max_row=max_row,
                               min_col=min_col,
                               max_col=max_col,
                               values_only=True)
Define a new workbook and get a reference to the default worksheet
wb2 = opxl.Workbook()
ws2 = wb2.active
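Zip the values from all files, row by row.
Use a lambda to map each zipped tuple of tuples to one flat tuple of row values.
Append the row values to the new worksheet
for row_value in map(lambda r: tuple(v for c in r for v in c),
                     zip(*(file[k] for k in file))):
    ws2.append(row_value)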
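The zip and flatten step can be tried without any Excel files. The sketch below uses made-up lists that stand in for what iter_rows(values_only=True) yields for a one-column range (one 1-tuple per row); the variable names and values are assumptions for illustration only.

```python
# Stand-ins for the per-file row iterators: one 1-tuple per row.
file1_rows = [('a1',), ('a2',), ('a3',)]
file2_rows = [('b1',), ('b2',), ('b3',)]
file3_rows = [('c1',), ('c2',), ('c3',)]

rows = []
for zipped in zip(file1_rows, file2_rows, file3_rows):
    # zipped looks like (('a1',), ('b1',), ('c1',)); flatten it into one row.
    rows.append(tuple(v for cells in zipped for v in cells))

print(rows)
# [('a1', 'b1', 'c1'), ('a2', 'b2', 'c2'), ('a3', 'b3', 'c3')]
```

Each resulting tuple holds one value per file, which is the shape that a worksheet append call expects for a single row.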
Save the new workbook
# wb2.save(...)
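For comparison, the "j" offset from the question can also be handled directly, without the aggregate-then-append approach described above. The following is only a sketch under assumptions: copy_columns is a hypothetical helper name, the source column and row count are parameters invented here, and the demo writes throwaway workbooks to a temporary directory instead of using the paths from the question.

```python
import os
import tempfile

import openpyxl


def copy_columns(source_paths, destination_path, num_rows=10, source_col=1):
    """Copy one column from each source workbook into adjacent columns."""
    new_wb = openpyxl.Workbook()
    new_ws = new_wb.active
    # enumerate(..., start=1) yields the destination column: 1 -> A, 2 -> B, ...
    for dest_col, path in enumerate(source_paths, start=1):
        ws = openpyxl.load_workbook(path).worksheets[0]
        for row in range(1, num_rows + 1):
            value = ws.cell(row=row, column=source_col).value
            new_ws.cell(row=row, column=dest_col, value=value)
    new_wb.save(destination_path)


# Demo with made-up data in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for n in range(1, 4):
        wb = openpyxl.Workbook()
        for row in range(1, 11):
            wb.active.cell(row=row, column=1, value=f"file{n}-row{row}")
        path = os.path.join(tmp, f"file{n}.xlsx")
        wb.save(path)
        paths.append(path)

    destination = os.path.join(tmp, "new_file.xlsx")
    copy_columns(paths, destination)

    # Read the result back to check where each file's values ended up.
    result = openpyxl.load_workbook(destination).active
    first_row = [result.cell(row=1, column=c).value for c in (1, 2, 3)]
    last_row = [result.cell(row=10, column=c).value for c in (1, 2, 3)]

print(first_row)  # ['file1-row1', 'file2-row1', 'file3-row1']
print(last_row)   # ['file1-row10', 'file2-row10', 'file3-row10']
```

The design choice here is that the position of a file in the list decides its destination column, so adding a fourth file only means adding a fourth path.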