Removing the last blank lines from a file fails when using file.write(line)

Question

I have tried the answers to many Stack Overflow questions to remove the last blank lines from my 2.txt file (input):

2.txt file:

-11
B1
5
B1
-2
B1
7
B1
-11
B1
9
B1
-1
B1
-3
B1
19
B1
-22
B1
2
B1
1
B1
18
B1
-14
B1
0
B1
11
B1
-8
B1
-15

The only one that worked is the one that uses print(line). But when I try to use f.write(line) instead of print(line), my final 2.txt file (output) looks like this:

Final 2.txt file:

-11B15B1-2B17B1-11B19B1-1B1-3B119B1-22B12B11B118B1-14B10B111B1-8B1-15
18
B1
-14
B1
0
B1
11
B1
-8
B1
-15

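However, when I run the code with print(line) instead of f.write(line), my bash terminal shows the output with the last lines removed (see the print(line) result in the bash terminal below), but deformed in the same way as the final 2.txt file; that is, the removal itself works. I have tried to understand what is happening, but have made no progress.

print(line) result in the bash terminal: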

-11B15B1-2B17B1-11B19B1-1B1-3B119B1-22B12B11B118B1-14B10B111B1-8B1-15
18
B1
-14
B1
0
B1
11
B1
-8
B1
-15

Update:

My script that removes the last lines of the 2.txt file but deforms the first lines in the bash terminal:

for line in open('2.txt'):
  line = line.rstrip()
  if line != '':
    print (line)

My script that deforms the first lines of the 2.txt file and also does not remove the required last lines in the output file 3.txt:

with open("2.txt",'r+') as f:
  for line in open('3.txt'):
    line = line.rstrip()
    if line != '':
        f.write(line)

Answer 1

Fixing the existing approach

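rstrip() removes the trailing newline along with any other trailing whitespace, so when you write the result, the cursor is left at the end of the same line and the next write continues right after it.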
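A minimal sketch of that difference, using in-memory io.StringIO buffers as stand-ins for the output file (the buffers and the three sample lines are illustrative, not taken from the question's script): write() emits exactly the string it is given, while print() appends a newline.

```python
import io

lines = ["-11\n", "B1\n", "5\n"]  # illustrative sample, shaped like the start of 2.txt

written = io.StringIO()  # stand-in for a file filled with f.write(line)
printed = io.StringIO()  # stand-in for a file filled with print(line, file=f)

for line in lines:
    stripped = line.rstrip()       # the trailing "\n" is gone after this
    written.write(stripped)        # write() adds nothing, so lines run together
    print(stripped, file=printed)  # print() appends "\n" after each line

print(repr(written.getvalue()))  # '-11B15'
print(repr(printed.getvalue()))  # '-11\nB1\n5\n'
```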

One way to fix it that makes clear what needs to change (all of the code is unmodified, except for one added final line):

with open("2.txt",'r+') as f:
  for line in open('3.txt'):
    line = line.rstrip()
    if line != '':
        f.write(line)
        f.write('\n')  # one extra line; in text mode '\n' is translated to the platform's line ending

Alternatively, you can change f.write(line) to print(line, file=f).
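Note that skipping every empty line also drops blank lines in the middle of the file. If only the trailing ones should go and the input fits in memory, one option is to read everything, strip the end once, and write the result to a separate file. This is a sketch under those assumptions; the function name strip_trailing_blank_lines and its parameters are hypothetical.

```python
def strip_trailing_blank_lines(src_path, dst_path):
    """Copy src_path to dst_path, dropping blank lines at the end."""
    with open(src_path) as src:
        text = src.read()           # assumes the whole file fits in memory
    text = text.rstrip()            # removes trailing whitespace, blank lines included
    with open(dst_path, "w") as dst:    # "w" truncates any previous content
        if text:                        # an all-blank input becomes an empty file
            dst.write(text + "\n")      # end the last line with a single newline
```

It could be called as strip_trailing_blank_lines('2.txt', '3.txt'). Opening the output with 'w' rather than 'r+' matters here: 'r+' overwrites in place without truncating, which is likely why the final 2.txt in the question still shows old lines after the joined one.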

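Optimized to run quickly on large files

If you need to trim a small number of blank lines from the end of an arbitrarily large file, it makes sense to jump to the end of the file and work backwards; that way you don't care how large the whole file is, only how much content needs to be removed.

That is, something like: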

import os, sys
block_size = 4096 # 4kb blocks; decent chance this is your page size & disk sector size.
filename = sys.argv[1] # or replace this with a hardcoded name if you prefer

with open(filename, 'r+b') as f:   # seeking backwards only supported on files opened binary
    while True:
        f.seek(0, 2)                            # start at the end of the file
        offset = f.tell()                       # figure out where that is
        f.seek(max(0, offset - block_size), 0)  # move up to block_size bytes back
        offset = f.tell()                       # figure out where we are
        trailing_content = f.read()             # read from here to the end
        new_content = trailing_content.rstrip() # remove all whitespace
        if new_content == trailing_content:     # nothing to remove?
            break                               # then we're done.
        if new_content:                         # and if post-strip there's content (bytes, so test truthiness)...
            f.seek(offset + len(new_content))   # jump to its end...
            f.write(os.linesep.encode('utf-8')) # ...write a newline...
            f.truncate()                        # and then delete the rest of the file.
            break
        else:
            f.seek(offset, 0)                   # go to where our block started
            f.truncate()                        # and delete *everything* after it
            # run through the loop again, to see if there's still more trailing whitespace.
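For reuse, the same backwards scan can be wrapped in a function. The following is a variant sketch rather than the code above verbatim: the name trim_trailing_whitespace and its block_size parameter are hypothetical, it truncates only once after locating the last non-whitespace byte, and it always ends a non-empty file with one line separator, even if the file had none before.

```python
import os

def trim_trailing_whitespace(path, block_size=4096):
    """Strip trailing whitespace from a file in place, scanning from the end."""
    with open(path, "r+b") as f:
        end = f.seek(0, os.SEEK_END)       # seek() returns the new position
        while end > 0:
            start = max(0, end - block_size)
            f.seek(start)
            block = f.read(end - start)    # the last unexamined block
            kept = block.rstrip()          # bytes.rstrip() strips ASCII whitespace
            if kept:                       # real content found in this block
                end = start + len(kept)
                break
            end = start                    # block was all whitespace; step back
        f.seek(end)
        f.truncate()                       # cut the file at the current position
        if end > 0:
            f.write(os.linesep.encode())   # finish the last line
```

Only the tail of the file is ever read, so the cost depends on how much whitespace is removed rather than on the file size.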
