Remove both float64 values if found in either column (Pandas)
I am trying to remove all rows in which a non-unique value is found. Example below:
     N1  N2
1     2   4
2     4   5
3     6   6
4     8   7
5    10   8
6    12  10
7   NaN  12
8   NaN  14
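So in this case the values I want are 2, 5, 7 and 14. One column is longer than the other, so the NaNs must be ignored. Basically, I want to find the duplicated values and remove them from both N1 and N2. Here is what I tried: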
df[~df.N1.isin(['N2'])]
It gives some errors. Thanks for your help.
Kevin
Here is one way to do it:
from io import StringIO
import pandas as pd
s = '''N1 N2
2 4
4 5
6 6
8 7
10 8
12 10
NaN 12
NaN 14'''
ss = StringIO(s)
df = pd.read_csv(ss, sep=r'\s+')
df = df.dropna()
# Compare against the N2 column itself; isin(['N2']) only tests for the literal string 'N2'
df[~df.N1.isin(df.N2)]
Output:
Create the DataFrame from the values you posted:
import numpy as np
import pandas as pd
df = pd.DataFrame({'N1':[2, 4, 6, 8, 10, 12, np.nan, np.nan],
'N2':[4,5,6,7,8,10,12,14]})
Find the common values:
common = list(set(df['N1']) & set(df['N2']))
Exclude all rows in which N1 or N2 holds one of them:
df[(~df["N1"].isin(common)) | (~df["N2"].isin(common))]
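If the goal is to blank out the shared values cell by cell rather than to filter whole rows, one possible sketch is to mask the frame with `DataFrame.where`. The names `masked`, `n1_only` and `n2_only` are illustrative and not part of the original answer, and dropping the NaNs before building the sets is an assumption about how missing values should be handled.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"N1": [2, 4, 6, 8, 10, 12, np.nan, np.nan],
                   "N2": [4, 5, 6, 7, 8, 10, 12, 14]})

# Values present in both columns (NaN dropped first so it never counts as shared)
common = sorted(set(df["N1"].dropna()) & set(df["N2"].dropna()))

# Keep a cell only if its value is not shared; shared cells become NaN
masked = df.where(~df.isin(common))

# Per-column survivors, with the NaN placeholders removed
n1_only = masked["N1"].dropna().tolist()
n2_only = masked["N2"].dropna().tolist()
print(n1_only, n2_only)
```

Masking per cell keeps the frame's shape, so a row such as (2, 4) still contributes its 2 even though the 4 is shared.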
Update:
common = set(df['N1']) & set(df['N2'])
result = list(set(df['N2'])-common) + list(set(df['N1'])-common)
# NaN is the only value that is not equal to itself, so x == x filters the NaNs out
result = [x for x in result if x==x]
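The same set-based idea can be wrapped in a small helper, shown here as a rough sketch. The function name `values_unique_to_one_column` and its parameters are hypothetical; it uses the symmetric difference of the two sets and drops NaN up front instead of filtering afterwards.

```python
import numpy as np
import pandas as pd

def values_unique_to_one_column(df, left="N1", right="N2"):
    """Return, sorted, the values found in exactly one of the two columns (NaN ignored)."""
    left_values = set(df[left].dropna())
    right_values = set(df[right].dropna())
    # Symmetric difference: in one set or the other, but not in both
    return sorted(left_values ^ right_values)

df = pd.DataFrame({"N1": [2, 4, 6, 8, 10, 12, np.nan, np.nan],
                   "N2": [4, 5, 6, 7, 8, 10, 12, 14]})

result = values_unique_to_one_column(df)
print(result)
```

Because sets collapse repeats, a value that occurs twice within a single column but never in the other would still be returned by this sketch.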
A quick solution:
>>> df.stack().drop_duplicates(keep=False).unstack()
    N1    N2
1  2.0   NaN
2  NaN   5.0
4  NaN   7.0
8  NaN  14.0
As a list:
>>> df.stack().drop_duplicates(keep=False).values.tolist()
[2.0, 5.0, 7.0, 14.0]
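For comparison, here is a sketch that does not rely on `stack()` to discard the missing values: it pools both columns with `pd.concat`, drops NaN explicitly, and keeps the values that occur exactly once. The variable names `pooled`, `counts` and `unique_values` are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"N1": [2, 4, 6, 8, 10, 12, np.nan, np.nan],
                   "N2": [4, 5, 6, 7, 8, 10, 12, 14]})

# Pool both columns into a single Series and drop the NaNs explicitly
pooled = pd.concat([df["N1"], df["N2"]], ignore_index=True).dropna()

# Count how often each value occurs across the two columns
counts = pooled.value_counts()

# Keep only the values seen exactly once
unique_values = sorted(counts[counts == 1].index.tolist())
print(unique_values)
```

Like `drop_duplicates(keep=False)`, this also discards a value that is repeated within a single column.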