How can I import my .txt data into a data frame if my rows don't all have the same number of columns?
Context:
I have a .txt file containing some data.
My data looks like this:
|field1|field2|field3|field4|field5|
|field1|field2|field3|field4|
|field1|field2|field3|
|field1|field2|field3|field4|field5|
|field1|field2|field3|field4|
|field1|field2|field3|field4|
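The field values can be numbers or strings, and my file doesn't follow a defined pattern.
The rows aren't like "ABCABCABC..."; they are more like "AMASOAUSAHA".
I need a way to import my data into a data frame so that I can get the value of the field at (row[i], col[j]) and replace it.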
Question:
I have a file whose rows don't all have the same number of columns, but
every row uses the same separator. Is there any way to import my data
into a data frame in that case?
df <- read.table(stringsAsFactors = FALSE, fill = TRUE, sep = "|", text = "
|field1|field2|field3|field4|field5|
|field1|field2|field3|field4|
|field1|field2|field3|
|field1|field2|field3|field4|field5|
|field1|field2|field3|field4|
|field1|field2|field3|field4|")
df[2, 2] <- "foo"
df
#   V1     V2     V3     V4     V5     V6 V7
# 1 NA field1 field2 field3 field4 field5 NA
# 2 NA    foo field2 field3 field4        NA
# 3 NA field1 field2 field3               NA
# 4 NA field1 field2 field3 field4 field5 NA
# 5 NA field1 field2 field3 field4        NA
# 6 NA field1 field2 field3 field4        NA
... in R (which you tagged).
Using Python, if the file data looks like
field1|field2|field3|field4
field1|field2|field3
field1|field2|field3|field4|field5
field1|field2|field3|field4
field1|field2|field3|field4
then
import pandas as pd
import csv
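# Python 3: open in text mode with newline='' as the csv module expects
with open('data', newline='') as f:
    # Short rows are padded with missing values, which fillna('') blanks out
    df = pd.DataFrame(list(csv.reader(f, delimiter='|'))).fillna('')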
df.iloc[1, 1] = 'foo'
print(df)
yields
        0       1       2       3       4
0  field1  field2  field3  field4
1  field1     foo  field3
2  field1  field2  field3  field4  field5
3  field1  field2  field3  field4
4  field1  field2  field3  field4
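The Python example above uses rows without the leading and trailing "|" that appear in the question's data. As a minimal sketch, assuming every row is wrapped in "|" as in the question, the same csv-plus-pandas approach could drop those outer empty fields before building the data frame. The names read_ragged and RAW are hypothetical, introduced only for this illustration, and the sample text is read from a string rather than from a file.

```python
import csv
import io

import pandas as pd

# Sample text copied from the question: every row is wrapped in "|".
RAW = """\
|field1|field2|field3|field4|field5|
|field1|field2|field3|field4|
|field1|field2|field3|
|field1|field2|field3|field4|field5|
|field1|field2|field3|field4|
|field1|field2|field3|field4|
"""


def read_ragged(handle, delimiter="|"):
    """Read delimiter-separated rows of unequal length into a DataFrame.

    Hypothetical helper. Assumption: an empty first or last field comes from
    a wrapping delimiter and can be dropped; a row whose real first or last
    value is empty would lose that value too.
    """
    rows = []
    for row in csv.reader(handle, delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        if row[0] == "":
            row = row[1:]  # drop the field before the leading delimiter
        if row and row[-1] == "":
            row = row[:-1]  # drop the field after the trailing delimiter
        rows.append(row)
    # pandas pads the shorter rows with missing values; show them as "".
    return pd.DataFrame(rows).fillna("")


df = read_ragged(io.StringIO(RAW))
df.iloc[1, 1] = "foo"  # replace the value at row 1, column 1 (0-based)
print(df.shape)  # (6, 5)
print(df)
```

To read from disk instead, an open file object can be passed in the same way, for example read_ragged(open('data', newline='')).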