Reading data from a txt file with a varying number of columns and saving it as a dataframe

Question

I have a data.txt file that looks like this:

I want to convert it into a dataframe like this:

1000 1 2 3 
1000 4 5 6
2000 11 12 13
2000 14 15 16

I'm new to Python and have tried several approaches, but none of them worked. Any help would be greatly appreciated.

Answer 1

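import pandas as pd

# read each non-blank line as one string, then use `str.split` to get the columns
# (recent pandas versions raise a ValueError for `pd.read_csv(..., sep='\n')`)
with open('data.txt') as f:
    obj = pd.Series([line for line in f if line.strip()])
df = obj.str.split(expand=True)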

# handle the label lines (`1000` or `2000`), where column 1 is null
cond = df[1].isnull()
# column 4 stores the labels `1000` and `2000`
# use `ffill()` to fill missing values with the previous label
df.loc[cond, 4] = df.loc[cond, 0]
df[4] = df[4].ffill()

# reorder the columns and filter out the label rows
df = df.loc[~cond, [4, 0, 1, 2]]
df.to_csv('demo.txt', sep=' ', index=False, header=None)
!cat demo.txt

    # 1000 1 2 3
    # 1000 4 5 6
    # 2000 11 12 13
    # 2000 14 15 16

df:

      4   0   1   2
1  1000   1   2   3
2  1000   4   5   6
4  2000  11  12  13
5  2000  14  15  16
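
As an alternative to the column-shifting approach above, the same grouping can be sketched as a plain Python pass over the lines before any dataframe is built. This is only a sketch under assumptions: the sample input is inferred from the answer's output (the question's original file contents are not shown), and `parse_labeled_rows` is a hypothetical helper name, not part of pandas.

```python
import io

# Assumed sample input, inferred from the output shown in the answer;
# the original data.txt from the question is not available.
SAMPLE = """\
1000
1 2 3
4 5 6
2000
11 12 13
14 15 16
"""


def parse_labeled_rows(lines):
    """Prefix every data row with the most recent single-field label line."""
    rows = []
    label = None
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) == 1:
            label = fields[0]  # a one-field line starts a new group
        else:
            rows.append([label] + fields)
    return rows


rows = parse_labeled_rows(io.StringIO(SAMPLE))
for row in rows:
    print(" ".join(row))
```

With a real file, `parse_labeled_rows(open('data.txt'))` should work the same way, and the resulting list can be passed to `pd.DataFrame(rows)` to get a dataframe. The values stay strings here, as they do after `str.split` in the answer, so a conversion such as `astype(int)` would still be needed for numeric work.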
