How to split a DataFrame column using pandas
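My CSV has a column whose values look like 1ST:[70]2ND:[71]3RD:[71]S1:[71]4TH:[77]5TH:[78]6TH:[78]S2:[78] FIN:[75], and I need to extract each of the merged items into its own column. How can I do this with pandas?
Required output, like:
   1ST  2ND  3RD  S1  4TH  5TH  6TH  S2  FIN
0   70   71   71  71   77   78   78  78   75
Here are a few rows of that column's values: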
1ST:[80]2ND:[79]3RD:[75]S1:[78]4TH:[76]5TH:[80]6TH:[87]S2:[81]FIN:[80]
1ST:[75]2ND:[74]3RD:[81]S1:[77]4TH:[80]5TH:[78]6TH:[87]S2:[82]FIN:[80]
1ST:[58]2ND:[54]3RD:[65]S1:[59]4TH:[80]5TH:[72]6TH:[74]S2:[75]FIN:[67]
1ST:[90]2ND:[91]3RD:[82]S1:[88]4TH:[84]5TH:[88]6TH:[87]S2:[86]FIN:[87]
1ST:[83]2ND:[79]3RD:[82]S1:[81]4TH:[85]5TH:[84]6TH:[90]S2:[86]FIN:[84]
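My DataFrame has one column containing the values above. I need to split it into separate columns, with the values going into the rows.
Your question is a little confusing. What structure do you want the result to have?
Your file contains values like this: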
1ST:[70]2ND:[71]3RD:[71]S1:[71]4TH:[77]5TH:[78]6TH:[78]S2:[78]FIN:[75]
and the output you want should look like this:
   1ST  2ND  3RD  S1  4TH  5TH  6TH  S2  FIN
0   70   71   71  71   77   78   78  78   75
or like this:
     0   1
0  1ST  70
1  2ND  71
2  3RD  71
3   S1  71
4  4TH  77
5  5TH  78
6  6TH  78
7   S2  78
8  FIN  75
Here is one way to get the second form from the given input:
import pandas as pd

# assume the input is a single string (it could equally be read from a CSV)
file_val = "1ST:[70]2ND:[71]3RD:[71]S1:[71]4TH:[77]5TH:[78]6TH:[78]S2:[78]FIN:[75]"

# drop the '[' characters, split on ']' to get "LABEL:VALUE" pieces,
# then split each piece on ':' into a [label, value] pair
df = pd.DataFrame([i.split(':') for i in file_val.replace('[', "").split(']') if i != ""])
print(df)

     0   1
0  1ST  70
1  2ND  71
2  3RD  71
3   S1  71
4  4TH  77
5  5TH  78
6  6TH  78
7   S2  78
8  FIN  75
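The question asks for the other layout: a single row with one column per label. A minimal sketch of that, assuming every item has the form of a label followed by a number in square brackets. The regular expression and the name `wide` are illustrative choices and are not part of the original answer.

```python
import re

import pandas as pd

# Sample value from the question, including the stray space before FIN.
file_val = "1ST:[70]2ND:[71]3RD:[71]S1:[71]4TH:[77]5TH:[78]6TH:[78]S2:[78] FIN:[75]"

# Each match is a (label, value) tuple, e.g. ("1ST", "70").
# Assumption: labels are word characters and values are plain digits.
pairs = re.findall(r"(\w+):\[(\d+)\]", file_val)

# A list holding one dict becomes a one-row DataFrame; the dict keys become
# the column names. astype(int) converts the captured strings to integers.
wide = pd.DataFrame([dict(pairs)]).astype(int)
print(wide)
```

Because the pattern only captures the label and the digits, stray whitespace between items (such as the space before FIN in the question) is ignored.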
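Please share the CSV file or a snapshot of a few rows so that I can produce output that matches your requirements.
Final solution, updated for the format you posted: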
import pandas as pd

# read the whole file
with open('sample.csv') as f:
    dat = f.read()

# split into rows, skipping blank lines
dat1 = [line for line in dat.splitlines() if line.strip()]

# convert one row to a dict of {label: value}
def row_to_dict(row):
    return dict(i.strip().split(":") for i in row.replace('[', "").split(']') if i.strip())

# apply the function to every row and build a single DataFrame from the results;
# that DataFrame is the final output
res = pd.DataFrame([row_to_dict(row) for row in dat1])
print(res)
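  1ST 2ND 3RD 4TH 5TH 6TH FIN  S1  S2
0  80  79  75  76  80  87  80  78  81
1  75  74  81  80  78  87  80  77  82
2  58  54  65  80  72  74  67  59  75
3  90  91  82  84  88  87  87  88  86
4  83  79  82  85  84  90  84  81  86
The columns above are in sorted order, as older pandas versions printed them. Recent versions of Python and pandas keep the order in which the labels appear in each row.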
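If the values are already in a DataFrame column, as the question describes, the same split can be sketched with pandas string methods instead of reading the file by hand. The column name `scores`, the group names `label` and `value`, and the pattern itself are assumptions made for this illustration.

```python
import pandas as pd

# Hypothetical one-column frame holding two of the sample rows from the question.
df = pd.DataFrame({"scores": [
    "1ST:[80]2ND:[79]3RD:[75]S1:[78]4TH:[76]5TH:[80]6TH:[87]S2:[81]FIN:[80]",
    "1ST:[75]2ND:[74]3RD:[81]S1:[77]4TH:[80]5TH:[78]6TH:[87]S2:[82]FIN:[80]",
]})

# extractall returns one row per regex match, indexed by
# (original row label, match number), with one column per named group.
matches = df["scores"].str.extractall(r"(?P<label>\w+):\[(?P<value>\d+)\]")

# Remember the order in which labels first appear, since unstack sorts them.
order = matches["label"].drop_duplicates().tolist()

wide = (
    matches.droplevel("match")          # keep only the original row label
    .set_index("label", append=True)    # index becomes (row, label)
    ["value"]
    .astype(int)                        # captured strings -> integers
    .unstack("label")                   # labels move from the index to columns
)
wide = wide[order]                      # restore the original column order
wide.columns.name = None                # drop the leftover "label" axis name
print(wide)
```

The result keeps the original row index, so it can be joined back to the rest of the frame with `df.join(wide)` if other columns need to be preserved.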
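The same result in R:
a1 = read.csv("c:/Users/Dell/Desktop/NewText.txt", header = FALSE)
a1$V1 = as.character(a1$V1)
g1 = NULL
g2 = NULL
l = list()
for (i in 1:nrow(a1))
{
  # split the row on "]" to get pieces such as "1ST:[80"
  g1 = strsplit(a1$V1[i], "]")
  # split each piece on ":[" (the "[" must be escaped in the regex)
  g1 = strsplit(g1[[1]], ":\\[")
  g2 = data.frame(g1)
  g2[] <- lapply(g2, as.character)
  # the first row holds the labels: use it as the header, then drop it
  colnames(g2) = g2[1,]
  g2 = g2[-1,]
  l[[i]] = g2
}
# stack the one-row data frames into the final result
l = do.call('rbind', l)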