Read tabular file in Python

Question

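I want to read a data file in Python using the delimiter '\t', but the fields are not separated. (I tried both '\t' and ' ' as the delimiter.)

What is the condition for a character to be recognized as a tab?
How can I solve this without modifying the data file?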

Data file

 0       303567       3584       Write       0.000000
 1       55590       3072       Write       0.000000
 0       303574       3584       Write       0.026214
 1       240840       3072       Write       0.026214
 1       55596       3072       Read       0.078643
 0       303581       3584       Write       0.117964
 1       55596       3072       Write       0.117964
 0       303588       3584       Write       0.530841
 1       55596       3072       Write       0.530841
 0       303595       3584       Write       0.550502
 1       240840       3072       Write       0.550502
 1       55602       3072       Read       0.602931
 0       303602       3584       Write       0.648806
 1       55602       3072       Write       0.648806
 0       303609       3584       Write       0.910950
 1       55602       3072       Write       0.910950
 0       303616       3584       Write       0.930611
 1       240840       3072       Write       0.930611
 1       55608       3072       Read       0.983040
 0       303623       3584       Write       1.028915
 1       55608       3072       Write       1.028915
 0       303630       3584       Write       1.330380
 1       55608       3072       Write       1.330380

Code

import csv

pieces = []  # parsed rows
x = []       # row indices
count = 0
with open(datafile, 'rt') as f:
    data = csv.reader(f, delimiter=' ')
    for d in data:
        pieces.append(d)
        x.append(count)
        count = count + 1

Printed result (pieces)

['10', '', '', '', '', '', '', '700132', '', '', '', '', '', '', '512', '', '', '', '', '', '', 'Write', '', '', '', '', '', '', '4186.852539'],
['1', '', '', '', '', '', '', '272774', '', '', '', '', '', '', '1024', '', '', '', '', '', '', 'Write', '', '', '', '', '', '', '4186.852539'],
['7', '', '', '', '', '', '', '273776', '', '', '', '', '', '', '1024', '', '', '', '', '', '', 'Write', '', '', '', '', '', '', '4186.852539']

Answer 1

This data format can easily be handled like this:

with open(datafile) as f:
    for line in f:
        # split() with no argument splits on any run of whitespace
        line_data = line.split()
        print(line_data)
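To see why this works where the single-character delimiters did not, here is a small sketch using one line copied from the data file above. It assumes the columns are padded with runs of spaces rather than tabs, which is what the empty strings in the printed result suggest. A tab is only recognized when the file actually contains the '\t' character; spaces that merely look like a tab stop do not count.

```python
# One row copied from the data file in the question (assumed space-padded).
line = " 0       303567       3584       Write       0.000000\n"

# Is there any real tab character in the line?
has_tab = '\t' in line

# Splitting on a single space yields an empty string for every extra space,
# which is the same effect csv.reader shows with delimiter=' '.
by_space = line.split(' ')

# split() with no argument splits on any run of whitespace (spaces or tabs)
# and ignores leading/trailing whitespace, including the newline.
by_whitespace = line.split()

print(has_tab)        # False
print(by_whitespace)  # ['0', '303567', '3584', 'Write', '0.000000']
```

Because split() treats tabs and spaces alike, the same loop keeps working if some files turn out to be tab-separated after all.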

Answer 2

You can also read it with pandas (which may be useful for further processing):

import pandas as pd
data = pd.read_table('foo.tab', header=None, sep=r'\s+')
#     0       1     2      3         4
#0   0  303567  3584  Write  0.000000
#1   1   55590  3072  Write  0.000000
#2   0  303574  3584  Write  0.026214
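If named columns are more convenient for the further processing, something like the following might work. It is only a sketch: the column names are hypothetical guesses, since the question does not say what the fields mean, and the sample is read from an in-memory string instead of 'foo.tab' so that it runs on its own.

```python
import io

import pandas as pd

# Three rows copied from the data file in the question.
sample = """\
 0       303567       3584       Write       0.000000
 0       303574       3584       Write       0.026214
 1       55596       3072       Read       0.078643
"""

# Hypothetical column names; the question does not document the fields.
columns = ["device", "sector", "size", "operation", "time"]

# sep=r'\s+' treats any run of whitespace as a single separator.
df = pd.read_csv(io.StringIO(sample), sep=r"\s+", header=None, names=columns)

# Example of further processing: keep only the write requests.
writes = df[df["operation"] == "Write"]
print(writes)
```

With named columns, filters and aggregations can refer to fields by name rather than by position.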
