How to check a string in one column and replace it in another column using pandas
I have a very large CSV file. Sample data is shown below:
Name,value,data
jack,X16206,hi this is X16206
Riti,X1620600,I want to change X16206.
Aadii,X16206,New value is X1620600.
jan,abc700134,something new 20600.
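I have a value such as X16206 (alphanumeric). In both the value column and the data column it sometimes appears with a trailing 00 and sometimes without.
I want to check the string in the value column and replace its occurrence in the sentence in the data column with '[exact]'.
Expected output: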
Name,value,data
jack,X16206,hi this is [exact]
Riti,X1620600,I want to change [exact].
Aadii,X16206,New value is [exact].
jan,abc700134,something new 20600.
What I have tried so far:
df1['num'] = np.where(df1['value'].str.len().isin({6,8}), 1, -1)
def myfn2(row):
    if row['num']==1:
        row['New_data']=row['data'].replace(row['value'],'[exact]')
    else:
        row['New_data']=row['data']
    return row
df1=df1.apply(myfn2,axis=1)
The output I get:
Name,value,data,num,New_data
jack,X16206,hi this is X16206,1,hi this is [exact]
Riti,X1620600,I want to change X16206,1,I want to change X16206.
Aadii,X16206,New value is X1620600,1,New value is [exact]00.
jan,abc700134,something new 20600,-1,something new 20600.
Can anyone help me with how to do this?
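Try:
import re

def fn(x):
    # Drop a trailing "00" from the value when it follows at least four digits.
    v = re.sub(r"(?<=\d{4})00$", "", x["value"])
    # Escape the value so it is matched literally, then allow up to two trailing zeros.
    return re.sub("(" + re.escape(v) + "0?0?)", "[exact]", x["data"])

df["data"] = df.apply(fn, axis=1)
print(df)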
Prints:
    Name      value                       data
0   jack     X16206         hi this is [exact]
1   Riti   X1620600  I want to change [exact].
2  Aadii     X16206      New value is [exact].
3    jan  abc700134       something new 20600.
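For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea on the sample data from the question. The helper name `mask_value` is made up for illustration. It also makes two assumptions that go slightly beyond the answer above: the optional suffix is matched as exactly `00` rather than `0?0?`, and a `(?!\d)` lookahead stops the match from cutting into a longer number. Drop either if it does not fit your data.

```python
import re

import pandas as pd


def mask_value(value: str, text: str) -> str:
    """Replace `value` (with or without a trailing "00") in `text` by "[exact]"."""
    # Strip a trailing "00" only when it follows at least four digits.
    base = re.sub(r"(?<=\d{4})00$", "", value)
    # re.escape makes the value match literally even if it contains
    # regex metacharacters. (?:00)? allows the optional "00" suffix and
    # (?!\d) refuses a match that is followed by another digit (assumption).
    pattern = re.escape(base) + r"(?:00)?(?!\d)"
    return re.sub(pattern, "[exact]", text)


# Sample rows copied from the question.
df = pd.DataFrame(
    {
        "Name": ["jack", "Riti", "Aadii", "jan"],
        "value": ["X16206", "X1620600", "X16206", "abc700134"],
        "data": [
            "hi this is X16206",
            "I want to change X16206.",
            "New value is X1620600.",
            "something new 20600.",
        ],
    }
)

# Pair the two columns row by row instead of calling df.apply(axis=1).
df["data"] = [mask_value(v, t) for v, t in zip(df["value"], df["data"])]
print(df["data"].tolist())
```

Zipping the two columns in a list comprehension avoids building a Series for every row, which is often noticeably faster than `df.apply(..., axis=1)` on a large file, though it is worth timing on your own data.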