Python: extract pattern from stdout and save in csv
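I have log files that are about 20000-30000 lines long. They contain all kinds of data; each line starts with the current timestamp, followed by the file path and line number, then the object's value plus some extra (unnecessary) information.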
2016/08/31 17:27:43/usr/log/data/old/objec: 540: Adjustment Stat
2016/08/31 17:27:43/usr/log/data/old/objec: 570: Position: 1
2016/08/31 17:27:43/usr/log/data/old/object::1150: Adding new object in department xxxx
2016/08/31 17:27:43/usr/log/data/old/file1.java:: 728: object ID: 0
2016/08/31 17:27:43/usr/log/data/old/file2.java:: 729: Start location:1
2016/08/31 17:27:43/usr/log/data/old/file1.java:: 730: End location:55
2016/08/31 17:27:43/usr/log/data/old/: 728: object ID: 1
2016/08/31 17:27:43/usr/log/data/old/: 729: Start location:56
2016/08/31 17:27:43/usr/log/data/old/: 730: End location:67
2016/08/31 17:27:43/usr/log/data/old/: 728: object ID: 2
2016/08/31 17:27:43/usr/log/data/old/: 729: Start location:68
2016/08/31 17:27:43/usr/log/data/old/: 730: End location:110
Timer to Calculate location of object x took 0.004935 seconds
.....
...
...
The same information repeats for each new object.
Each file has 30-40 groups of objects, and they vary (with IDs between 0 and 3).
I want to extract the information (the lines following Adjustment Stat) and save it in a text file like:
Position ObjectID StartLocation EndLocation
0 1 55
1 56 67
2 68 110
...
...
...
(there is no object with ID 0 here)
1 1 50
2 51 109
...
Or maybe store it in a CSV file like:
0,1,55
1,56,67
2,68,110
import csv

with open('out.csv', 'w') as output_file, open('in.txt') as input_file:
    writer = csv.writer(output_file)
    for l in input_file:
        if 'object ID:' in l:
            object_id = l.split(':')[-1].strip()
        elif 'Start location:' in l:
            start_loc = l.split(':')[-1].strip()
        elif 'End location:' in l:
            end_loc = l.split(':')[-1].strip()
            writer.writerow((object_id, start_loc, end_loc))
Python 2.6 version:
import csv
import contextlib

with contextlib.nested(open('out.csv', 'w'), open('in.txt')) as (output_file, input_file):
    writer = csv.writer(output_file)
    for l in input_file:
        if 'object ID:' in l:
            object_id = l.split(':')[-1].strip()
        elif 'Start location:' in l:
            start_loc = l.split(':')[-1].strip()
        elif 'End location:' in l:
            end_loc = l.split(':')[-1].strip()
            writer.writerow((object_id, start_loc, end_loc))
out.csv (with in.txt the same as in the OP):
0,1,55
1,56,67
2,68,110
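The loop above assumes every "End location" line is preceded by an "object ID" and a "Start location" line; if one is missing, it would either raise a NameError or reuse a stale value. Below is a minimal Python 3 sketch of the same idea that skips incomplete groups instead. The names FIELD_RE, extract_rows and write_csv are illustrative, not part of the original answer, and the regex assumes the value is a plain integer at the end of the line, as in the sample log.

```python
import csv
import re

# Assumption: each field of interest ends the line as "<name>: <digits>".
FIELD_RE = re.compile(r"(object ID|Start location|End location):\s*(\d+)\s*$")


def extract_rows(lines):
    """Yield (object_id, start, end) string tuples from an iterable of log lines."""
    current = {}
    for line in lines:
        match = FIELD_RE.search(line)
        if not match:
            continue  # timestamps, timers and other lines are ignored
        name, value = match.groups()
        if name == "object ID":
            current = {"object ID": value}  # a new object group starts here
        else:
            current[name] = value
        # Emit a row only when all three fields of the group were seen.
        if name == "End location" and len(current) == 3:
            yield (current["object ID"],
                   current["Start location"],
                   current["End location"])
            current = {}


def write_csv(in_path, out_path):
    """Read the log at in_path and write one CSV row per complete object."""
    # newline="" is what the csv module documentation recommends on Python 3.
    with open(in_path) as src, open(out_path, "w", newline="") as dst:
        csv.writer(dst).writerows(extract_rows(src))
```

Calling write_csv('in.txt', 'out.csv') would then play the role of the script above; keeping the parsing in a generator also makes it easy to check against a few sample lines without touching the file system.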