Split txt data into columns in Python
I have a .txt dataset in the following format:
01/01/2018 ['cat', 'bear', 'ant']
01/02/2018 ['horse', 'wolf', 'elephant']
01/03/2018 ['parrot', 'bird', 'fish']
I want to use Python to arrange it into 2 columns in the following format:
'Date' 'Animal'
01/01/2018 cat
01/01/2018 bear
01/01/2018 ant
01/02/2018 horse
01/02/2018 wolf
01/02/2018 elephant
01/03/2018 parrot
01/03/2018 bird
01/03/2018 fish
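(The txt file is actually much longer; I simplified it here to make it easier to follow.) I'm not sure how to proceed: should I use read_csv, or open (but that seems to read the whole thing in as a single object)? Should I set a delimiter? I tried several things, but nothing worked.
Thanks in advance.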
Create the table using pandas:
import ast
import pandas as pd

dates = []
animals = []

# Read the file lines
with open('file.txt', 'r') as f:
    lines = f.readlines()

for line in lines:
    # Skip blank lines (e.g. an empty line at the end of the file)
    if not line.strip():
        continue
    # Split the date from the animals list at the first space
    date_string, animals_string = line.split(' ', maxsplit=1)
    # Safely evaluate the animals list
    animals_list = ast.literal_eval(animals_string)
    # Repeat the date once for each animal on that date
    dates.extend([date_string] * len(animals_list))
    # Append the animals
    animals.extend(animals_list)

# Create a dataframe for the dates and animals
df = pd.DataFrame({'Date': dates, 'Animal': animals})

# Print the dataframe
print(df)
Output:
         Date    Animal
0  01/01/2018       cat
1  01/01/2018      bear
2  01/01/2018       ant
3  01/02/2018     horse
4  01/02/2018      wolf
5  01/02/2018  elephant
6  01/03/2018    parrot
7  01/03/2018      bird
8  01/03/2018      fish
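As a possible variation on the approach above (not the method used in this answer), the per-date lists can be kept intact and expanded afterwards with pandas' DataFrame.explode, which is available from pandas 0.25 onward. The sketch below is a minimal example under a few assumptions: the helper name read_animals is hypothetical, an in-memory io.StringIO stands in for file.txt so the snippet runs on its own, and every non-blank line is assumed to hold a date, one space, and a valid Python list literal.

```python
import ast
import io

import pandas as pd

# Hypothetical in-memory stand-in for file.txt, so the sketch is self-contained.
sample = io.StringIO(
    "01/01/2018 ['cat', 'bear', 'ant']\n"
    "01/02/2018 ['horse', 'wolf', 'elephant']\n"
    "01/03/2018 ['parrot', 'bird', 'fish']\n"
)


def read_animals(handle):
    """Return a two-column DataFrame with one row per (date, animal) pair."""
    rows = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split at the first space only, so the list text stays in one piece
        date_string, animals_string = line.split(' ', maxsplit=1)
        rows.append((date_string, ast.literal_eval(animals_string)))
    df = pd.DataFrame(rows, columns=['Date', 'Animal'])
    # explode() gives each list element its own row and repeats the date
    return df.explode('Animal').reset_index(drop=True)


result = read_animals(sample)
print(result)
```

With a real file, the same helper could be called as read_animals(open('file.txt')) inside a with block. Whether this reads better than building the two lists by hand is a matter of taste; both produce the same two-column table.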