Python pandas: converting a text file to a custom column format

I am looking for help in Python with converting the following content into columns.

Data in the text file:

---- [ Job Information : 2926 ] ----
Name                : Run26
User                : abc
Account             : xyz
Partition           : q_24hrs
Nodes               : node3
Cores               : 36
State               : COMPLETED
ExitCode            : 0:0
Submit              : 2020-12-15T10:23:22
Start               : 2020-12-15T10:23:22
End                 : 2020-12-15T14:13:50
Waited              :   00:00:00
Reserved walltime   : 1-00:00:00
Used walltime       :   03:50:28
Used CPU time       :   00:00:00

Desired output (keep this header unchanged):

Job id,Name,User,Account,Partition,Nodes,Cores
2926,abc,xyz,q_24hrs,node3,36

Thanks in advance.

You can use this example of how to parse the text file with the re module:

import re

import pandas as pd

with open("your_file.txt", "r") as f_in:
    data = f_in.read()

# Each findall returns one value per job block, in file order.
job_ids = re.findall(r"Job Information : (\d+)", data)
names = re.findall(r"Name\s*:\s*(.*)", data)
users = re.findall(r"User\s*:\s*(.*)", data)
accounts = re.findall(r"Account\s*:\s*(.*)", data)
partitions = re.findall(r"Partition\s*:\s*(.*)", data)
nodes = re.findall(r"Nodes\s*:\s*(.*)", data)
cores = re.findall(r"Cores\s*:\s*(.*)", data)

df = pd.DataFrame(
    zip(job_ids, names, users, accounts, partitions, nodes, cores),
    columns=[
        "Job id",
        "Name",
        "User",
        "Account",
        "Partition",
        "Nodes",
        "Cores",
    ],
)
print(df)
df.to_csv("data.csv", index=False)

This creates data.csv:

Job id,Name,User,Account,Partition,Nodes,Cores
2926,Run26,abc,xyz,q_24hrs,node3,36
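
One caveat with the approach above: zip() pairs the lists by position, so if any job block in the file is missing a field (say, no Name line), the later values shift into the wrong rows. If that can happen in your data, a block-by-block parse may be safer. The following is only a sketch under assumptions: the function names parse_jobs and write_csv are made up for illustration, and it assumes every block starts with a "---- [ Job Information : N ] ----" line as in the sample.

```python
import re

# Columns to keep, in the order the question asks for.
COLUMNS = ["Job id", "Name", "User", "Account", "Partition", "Nodes", "Cores"]


def parse_jobs(text):
    """Return one dict per job block (hypothetical helper, not a library API)."""
    # Splitting on a pattern with a capturing group yields
    # [preamble, id1, body1, id2, body2, ...].
    parts = re.split(r"-+ \[ Job Information : (\d+) \] -+", text)
    rows = []
    for job_id, body in zip(parts[1::2], parts[2::2]):
        row = {"Job id": job_id}
        for line in body.splitlines():
            # Split on the first colon only, so values such as "0:0"
            # or timestamps stay intact.
            key, sep, value = line.partition(":")
            if sep:
                row[key.strip()] = value.strip()
        rows.append(row)
    return rows


def write_csv(text, path):
    """Write the selected columns to a CSV file (hypothetical helper)."""
    # Imported here so that parse_jobs can be used without pandas installed.
    import pandas as pd

    # reindex keeps only COLUMNS; fields missing from a block become NaN.
    df = pd.DataFrame(parse_jobs(text)).reindex(columns=COLUMNS)
    df.to_csv(path, index=False)
    return df
```

Usage would be something like write_csv(data, "data.csv"), with data read from the file as in the answer above. A block that lacks a field then gets an empty cell in that column instead of shifting the remaining rows.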