How to strip each array of the square brackets and add a prefix
I have a data frame containing the top 12 predictions my kNN made for each ID. It looks like this:
customer_id | prediction |
---|---|
00000dbacae5abe5e2 | [677530001, 677515001, 677511001, 677506003, 677501001, 677490001, 677478006, 677478003, 677478002, 677546006, 949551001, 903049003] |
0000423b00ade9141 | [677511001, 677506003, 677501001, 677490001, 677478006, 677478003, 677478002, 677386001, 677385001, 677760003, 949551001, 826674001] |
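Is it possible to remove the square brackets from each row of the data frame (each one is an array) and add a zero prefix to every prediction, like this: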
customer_id | prediction |
---|---|
00000dbacae5abe5e2 | 0677530001, 0677515001, 0677511001..... |
0000423b00ade9141 | 0677511001, 0677506003, 0677501001..... |
The code I used to generate these predictions and the table:
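import numpy as np
import pandas as pd

n = 12
# Class probabilities for the first five rows of X
probas = kNN.predict_proba(X.head())
# Column indices of the n highest probabilities in each row
top_n_idx = np.argsort(probas, axis=1)[:, -n:]
# Map those indices back to class labels
top_n = [kNN.classes_[i] for i in top_n_idx]
# One 1-tuple per row, so each array ends up in a single cell
results = list(zip(top_n))
results = pd.DataFrame(results)
ids_test.reset_index(drop=True, inplace=True)
results.reset_index(drop=True, inplace=True)
y_test.reset_index(drop=True, inplace=True)
knn_table = pd.concat([ids, results], axis=1, ignore_index=True)
knn_table = knn_table.rename(columns={0: 'customer_id', 1: 'prediction'})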
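As a side note on the top-n step in the code above, here is a small NumPy-only sketch of what `np.argsort(...)[:, -n:]` selects. The `classes` and `probas` arrays are made-up stand-ins for `kNN.classes_` and the output of `kNN.predict_proba`, not real model output. It suggests that the selected labels come out ordered from least to most likely within each row, so reversing each row may be wanted if the best prediction should come first.

```python
import numpy as np

# Hypothetical stand-ins for kNN.classes_ and kNN.predict_proba(...).
classes = np.array([677478002, 677490001, 677501001, 677506003, 677511001])
probas = np.array([
    [0.10, 0.40, 0.05, 0.30, 0.15],
    [0.25, 0.05, 0.50, 0.08, 0.12],
])

n = 3
# argsort sorts ascending, so the last n columns hold the n most probable
# classes, ordered from least to most likely.
top_n_idx = np.argsort(probas, axis=1)[:, -n:]
top_n = classes[top_n_idx]

# Reverse each row to put the most likely class first.
top_n_desc = top_n[:, ::-1]
print(top_n_desc)
```

Whether the order matters depends on how the predictions are consumed downstream; the question does not say.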
Try:
df["prediction"] = ("0"+df["prediction"].explode().astype(str)).groupby(level=0).agg(", ".join)
Or with apply:
df["prediction"] = df["prediction"].apply(lambda x: "0"+", 0".join(map(str,x)))
Output:
>>> df
          customer_id                                         prediction
0  00000dbacae5abe5e2  0677530001, 0677515001, 0677511001, 0677506003...
1   0000423b00ade9141  0677511001, 0677506003, 0677501001, 0677490001...
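For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the two sample rows from the question, assuming the `prediction` cells hold NumPy integer arrays (which is what the question's code appears to produce), applies both approaches, and checks that they give the same strings. The helper name `add_zero_prefix` is made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question; cells are assumed to be NumPy arrays.
df = pd.DataFrame({
    "customer_id": ["00000dbacae5abe5e2", "0000423b00ade9141"],
    "prediction": [
        np.array([677530001, 677515001, 677511001, 677506003,
                  677501001, 677490001, 677478006, 677478003,
                  677478002, 677546006, 949551001, 903049003]),
        np.array([677511001, 677506003, 677501001, 677490001,
                  677478006, 677478003, 677478002, 677386001,
                  677385001, 677760003, 949551001, 826674001]),
    ],
})


def add_zero_prefix(values):
    """Join values into one string, prefixing each with "0" (hypothetical helper)."""
    return ", ".join("0" + str(v) for v in values)


# Approach 1: one row per prediction, add the prefix, regroup by original row.
exploded = "0" + df["prediction"].explode().astype(str)
via_explode = exploded.groupby(level=0).agg(", ".join)

# Approach 2: build the string row by row with apply.
via_apply = df["prediction"].apply(add_zero_prefix)

# Both approaches should produce identical strings.
assert via_explode.tolist() == via_apply.tolist()

df["prediction"] = via_apply
print(df)
```

Both versions turn the column into plain strings, so the individual predictions are no longer available as numbers afterwards. If the intent is a fixed 10-digit code rather than a literal extra zero, `str(v).zfill(10)` might be a closer fit, but that is a guess about the goal.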