Is there a faster way to delete Excel rows using openpyxl?

I have a list of Excel row numbers, 2138 entries long, that I want to delete using openpyxl. Here is the code:

delete_this_row = [1,2,....,2138]

for delete in delete_this_row:
    worksheet.delete_rows(delete)

But it is too slow. It takes 45 seconds to 1 minute to finish the process.

Is there a faster way to get the task done?

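There's almost always a faster way to do something. Sometimes the cost is too high, but not in this case, I suspect :-)

If you just want to delete a contiguous set of rows, you can use: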

worksheet.delete_rows(1, 2138)

The relevant entry from the openpyxl documentation, copied here for completeness:

delete_rows(idx, amount=1): Delete row or rows from row==idx.

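Your solution is slow because, every time you delete a single row, it has to shift everything below that point up one row and then delete the final row.

Passing in the row count instead makes it do one shift, moving rows 2139..max straight to rows 1..max-2138, and then deleting all rows below max-2138.

This is likely to be about 2,138 times faster than what you have now :-)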
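To make that cost difference concrete, here is a rough back-of-the-envelope model, not openpyxl's actual implementation. It assumes a hypothetical sheet of 10,000 rows with the block to delete at the top, and it only counts how many row moves each strategy needs. The function names are made up for illustration.

```python
def shifts_one_at_a_time(total_rows, rows_to_delete):
    """Count row moves when a leading block is deleted one row per call."""
    moved = 0
    remaining = total_rows
    for _ in range(rows_to_delete):
        # Deleting the top row moves every row beneath it up by one.
        moved += remaining - 1
        remaining -= 1
    return moved


def shifts_single_call(total_rows, rows_to_delete):
    """Count row moves when the whole leading block is deleted in one call."""
    # Only the surviving rows move, and each of them moves exactly once.
    return total_rows - rows_to_delete


# Hypothetical sheet size; 2138 is the number of rows from the question.
TOTAL_ROWS = 10_000
ROWS_TO_DELETE = 2138

looped = shifts_one_at_a_time(TOTAL_ROWS, ROWS_TO_DELETE)
single = shifts_single_call(TOTAL_ROWS, ROWS_TO_DELETE)
print(f"One row per call: {looped} row moves")
print(f"Single call:      {single} row moves")
print(f"Ratio:            {looped / single:.0f}x")
```

In this model the loop moves nearly the whole sheet again on every iteration, while the single call moves only the surviving rows once. The actual speed-up depends on how much data sits below the deleted block.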


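If you have arbitrary row numbers in your list, you can still use this approach to optimise it as much as possible.

The idea here is to first turn your list of rows into a list of tuples, where each tuple has:

  • the starting row; and
  • the number of rows to delete from there.

Ideally, you would also generate this in reverse order (highest rows first) so you can process it as is: deleting from the bottom up means no deletion shifts the rows that are still waiting to be deleted. The following snippet shows how to do this, with the openpyxl calls being printed rather than executed: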

def reverseCombiner(rowList):
    # Don't do anything for empty list. Otherwise,
    # make a sorted copy, dropping duplicates so no
    # row is scheduled for deletion twice.

    if len(rowList) == 0: return []
    sortedList = sorted(set(rowList))

    # Init, empty tuple, use first item for previous and
    # first in this run.

    tupleList = []
    firstItem = sortedList[0]
    prevItem = sortedList[0]

    # Process all other items in order.

    for item in sortedList[1:]:
        # If start of new run, add tuple and use new first-in-run.

        if item != prevItem + 1:
            tupleList = [(firstItem, prevItem + 1 - firstItem)] + tupleList
            firstItem = item

        # Regardless, current becomes previous for next loop.

        prevItem = item

    # Finish off the final run and return tuple list.

    tupleList = [(firstItem, prevItem + 1 - firstItem)] + tupleList
    return tupleList

# Test data, hit me with anything :-)

myList = [1, 70, 71, 72, 98, 21, 22, 23, 24, 25, 99]

# Create tuple list, show original and that list, then process.

tuples = reverseCombiner(myList)
print(f"Original: {myList}")
print(f"Tuples:   {tuples}\n")
for start, count in tuples:
    print(f"Would execute: worksheet.delete_rows({start}, {count})")

The output is:

Original: [1, 70, 71, 72, 98, 21, 22, 23, 24, 25, 99]
Tuples:   [(98, 2), (70, 3), (21, 5), (1, 1)]

Would execute: worksheet.delete_rows(98, 2)
Would execute: worksheet.delete_rows(70, 3)
Would execute: worksheet.delete_rows(21, 5)
Would execute: worksheet.delete_rows(1, 1)
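
For comparison, the same grouping can be sketched more compactly with itertools.groupby and then applied to a sheet. This is only an illustrative variant: reverse_runs, delete_rows_in_runs and RecordingSheet are hypothetical names, and RecordingSheet is a stand-in that merely records calls. With a real openpyxl worksheet you would pass that object instead, relying only on the delete_rows(idx, amount) method quoted above.

```python
from itertools import groupby


def reverse_runs(row_numbers):
    """Return (start, count) runs of consecutive rows, highest start first."""
    rows = sorted(set(row_numbers))  # sorted copy without duplicates
    runs = []
    # Within a run of consecutive numbers, value minus position is constant.
    for _, group in groupby(enumerate(rows), key=lambda pair: pair[1] - pair[0]):
        members = [row for _, row in group]
        runs.append((members[0], len(members)))
    runs.reverse()  # bottom-up order, so earlier deletes don't shift later ones
    return runs


def delete_rows_in_runs(worksheet, row_numbers):
    """Delete the given rows with one delete_rows call per consecutive run."""
    for start, count in reverse_runs(row_numbers):
        worksheet.delete_rows(start, count)


class RecordingSheet:
    """Hypothetical stand-in for a worksheet; it only records the calls."""

    def __init__(self):
        self.calls = []

    def delete_rows(self, idx, amount=1):
        self.calls.append((idx, amount))


sheet = RecordingSheet()
delete_rows_in_runs(sheet, [1, 70, 71, 72, 98, 21, 22, 23, 24, 25, 99])
print(sheet.calls)
```

The recorded calls come out in the same order as the "Would execute" lines above, so eleven row numbers are handled with four delete_rows calls.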