How to remove duplicate elements from numpy arrays in a pandas column?
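I have the following dataset, where column B holds numpy arrays. I want to create "new Column" by removing the duplicate elements from each array in column B, as shown.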
A  B                                new Column
1  ["A","a","123","123","A"]        ["A","a","123"]
2  ["abc","a","1234","123","abc"]   ["abc","a","1234","123"]
3  ["abcd","abcd","abcd"]           ["abcd"]
4  ["hello","mello"]                ["hello","mello"]
5  ["hi","hi","why"]                ["hi","why"]
I am using the following code, but neither version gives the desired output. Please help.
    def u_value(a):
        return np.unique(a)
or
    def ddpe(a):
        a = list(dict.fromkeys(a))
        return a
The problem here is that the values are not lists but strings, so use ast.literal_eval to convert them to lists first:
    import ast

    def ddpe(a):
        return list(dict.fromkeys(ast.literal_eval(a)))

    df['new Column'] = df['B'].apply(ddpe)
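For reference, the approach above can be tried end to end with the sketch below. The sample DataFrame is an assumption reconstructed from the table in the question, with column B stored as strings; the helper name dedupe_keep_order is hypothetical. It also handles cells that are already lists or arrays, and shows why np.unique alone may not give the expected result: it returns the unique values sorted rather than in their original order.

```python
import ast

import numpy as np
import pandas as pd

# Assumed sample data: column B holds string representations of lists,
# which is what the answer above diagnoses.
df = pd.DataFrame({
    "A": [1, 2, 3, 4, 5],
    "B": [
        '["A","a","123","123","A"]',
        '["abc","a","1234","123","abc"]',
        '["abcd","abcd","abcd"]',
        '["hello","mello"]',
        '["hi","hi","why"]',
    ],
})


def dedupe_keep_order(value):
    """Return the unique elements of value in first-seen order."""
    # Parse only when the cell is a string; lists and arrays pass through.
    if isinstance(value, str):
        value = ast.literal_eval(value)
    # dict keys keep insertion order (Python 3.7+), so duplicates drop out
    # without any reordering.
    return list(dict.fromkeys(value))


df["new Column"] = df["B"].apply(dedupe_keep_order)
print(df["new Column"].tolist())

# np.unique also removes duplicates, but it sorts the result,
# so the original element order is lost.
sorted_unique = np.unique(["A", "a", "123", "123", "A"])
print(sorted_unique)
```

If the cells really are numpy arrays rather than strings, the same helper can be applied directly, since the ast.literal_eval step is skipped for non-string values.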