Split large text file using keyword delimiter
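I am trying to split a large text file into smaller text files using a word delimiter. I searched, but only found posts about splitting a file after x lines. I am fairly new to programming, but I have made a start. I want to iterate over all the lines, and whenever a line starts with hello, put that line and the ones after it into one file until the next hello is reached. The first word in the file is hello. Ultimately I am trying to get the text into R, but I think it will be easier if I split it up like this first. Any help is appreciated, thanks.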
text_file = open("myfile.txt", "r")
lines = text_file.readlines()
print(len(lines))
for line in lines:
    print(line)
    if line[0:5] == "hello":
        pass  # stuck here: this is where a new output file should start
If you are looking for very simple logic, try this.
text_file = open("myfile.txt", "r")
lines = text_file.readlines()
text_file.close()
print(len(lines))
target = open("filename.txt", "a")  # "a" appends, "w" overwrites
hello1Found = False
hello2Found = False
for line in lines:
    if hello1Found:
        if line[0:5] == "hello":
            # Second hello found: stop copying. break is crude, but it is
            # enough for this simple requirement.
            hello2Found = True
            hello1Found = False
            break
        else:
            print(line)  # echo the line, then write it to the new file
            target.write(line)
    if not hello1Found:
        if line[0:5] == "hello":  # find the first occurrence of hello
            hello1Found = True
            print(line)
            # From the first hello on, this line and the lines after it
            # go to the new file until the second hello appears.
            target.write(line)
target.close()
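Note that the logic above copies only the first section (from the first hello up to the second). If every section should end up in its own file, one possible sketch is below. It assumes Python 3; the function name `split_on_keyword` and the `part_001.txt` naming scheme are made up for illustration and are not from the question. It reads the input line by line, so the whole file never has to fit in memory.

```python
import os


def split_on_keyword(input_path, output_dir, keyword="hello"):
    """Write each section of input_path to its own file in output_dir.

    A new section starts at every line that begins with `keyword`.
    Returns the list of output paths in the order they were written.
    """
    os.makedirs(output_dir, exist_ok=True)
    written = []
    out = None
    try:
        with open(input_path) as infile:
            for line in infile:
                # Start a new file at each keyword line. The `out is None`
                # check also covers lines that come before the first keyword.
                if line.startswith(keyword) or out is None:
                    if out is not None:
                        out.close()
                    # Hypothetical naming scheme: part_001.txt, part_002.txt, ...
                    path = os.path.join(
                        output_dir, "part_%03d.txt" % (len(written) + 1)
                    )
                    out = open(path, "w")
                    written.append(path)
                out.write(line)
    finally:
        if out is not None:
            out.close()
    return written
```

Something like `split_on_keyword("myfile.txt", "parts")` would then leave one file per hello section in a `parts` directory, and the returned list of paths can be handed to whatever reads them next.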
I am new to Python. I just finished a Python for Geographic Information Systems class at Northeastern University. This is what I came up with.
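def files():
    # Generator: each next() call opens the next numbered output file
    n = 0
    while True:
        n += 1
        yield open('/output/dir/%d.txt' % n, 'w')

pattern = 'hello'
fs = files()
outfile = next(fs)
filename = r'C:\output\dir\filename.txt'

with open(filename) as infile:
    for line in infile:
        if pattern not in line:
            outfile.write(line)
        else:
            items = line.split(pattern)
            # Text before the first delimiter still belongs to the current file
            outfile.write(items[0])
            for item in items[1:]:
                # Every later piece starts a new file; close the old one first
                outfile.close()
                outfile = next(fs)
                # split() drops the delimiter; write pattern + item to keep it
                outfile.write(item)

# infile is closed by the with block; only the last output file is still open
outfile.close()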
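One thing to be aware of with `line.split(pattern)`: the delimiter itself is removed from the pieces, so the output files will not contain the word hello unless it is added back. A small sketch of that behaviour follows, using a hypothetical helper `split_keep_delimiter` that is not part of the answer above. It works on a string already in memory, so for a very large file it would need to be applied line by line or chunk by chunk.

```python
def split_keep_delimiter(text, pattern="hello"):
    """Split text at each occurrence of pattern, keeping pattern at the
    start of every chunk that follows it."""
    pieces = text.split(pattern)
    chunks = []
    # Anything before the first delimiter is kept as its own chunk
    if pieces[0]:
        chunks.append(pieces[0])
    # str.split() discards the delimiter, so put it back on each later piece
    chunks.extend(pattern + piece for piece in pieces[1:])
    return chunks
```

Whether the delimiter should be kept depends on what the downstream step expects, so treat this as one option and not the required behaviour.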