Python 3 Pandas Filter/Extract by multiple column values, including <> 0
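I am using a publicly available csv file from USASPENDING.gov. I can extract the Navy data, but I don't know the correct syntax for adding a second filter that excludes all records with Dollarsobligated = 0.
The code is: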
import pandas as pd
df = pd.read_csv("2016_DOD_Contracts_Full_20160915.csv")
df.columns = [c.replace(' ','_') for c in df.columns]
new_df = df[(df.mod_agency == '1700: DEPT OF THE NAVY') & (df.dollarsobligated <> 0)]
# Export result to CSV
new_df.to_csv('example15.csv')
I get an error message saying that <> is invalid syntax. I haven't found an example of 'does not equal 0' on the web.
I think you need to replace <> with != in the boolean indexing, because <> was removed in Python 3.
You can also use str.replace to rename the columns:
df.columns = df.columns.str.replace(' ','_')
new_df = df[(df.mod_agency == '1700: DEPT OF THE NAVY') & (df.Dollarsobligated != 0)]
Sample:
df = pd.DataFrame({'mod agency':['1700: DEPT OF THE NAVY',
                                 '1700: DEPT OF THE NAVY',
                                 '1800: DEPT OF THE NAVY'],
                   'Dollarsobligated':[1,0,0],
                   'C':[7,8,9]})
print (df)
   C  Dollarsobligated              mod agency
0  7                 1  1700: DEPT OF THE NAVY
1  8                 0  1700: DEPT OF THE NAVY
2  9                 0  1800: DEPT OF THE NAVY
df.columns = df.columns.str.replace(' ','_')
new_df = df[(df.mod_agency == '1700: DEPT OF THE NAVY') & (df.Dollarsobligated != 0)]
print (new_df)
   C  Dollarsobligated              mod_agency
0  7                 1  1700: DEPT OF THE NAVY
You have to use "!=" instead of "<>".
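As a minimal sketch of the same idea, the two conditions can be built as separate boolean masks and then combined with `&`. The toy DataFrame and the names `is_navy`, `is_nonzero`, and `same_df` are assumptions for illustration only, not part of the original data or code.

```python
import pandas as pd

# Toy data standing in for the real CSV (values are made up for illustration).
df = pd.DataFrame({
    "mod_agency": ["1700: DEPT OF THE NAVY",
                   "1700: DEPT OF THE NAVY",
                   "1800: DEPT OF THE NAVY"],
    "Dollarsobligated": [1, 0, 0],
})

# Each comparison produces a boolean Series (a mask) with one entry per row.
is_navy = df["mod_agency"] == "1700: DEPT OF THE NAVY"
is_nonzero = df["Dollarsobligated"] != 0  # `<>` is a SyntaxError in Python 3

# `&` combines the masks element-wise; only rows where both are True are kept.
new_df = df[is_navy & is_nonzero]

# Series.ne(0) is the method form of `!= 0` and selects the same rows.
same_df = df[is_navy & df["Dollarsobligated"].ne(0)]

print(new_df)
```

When the comparisons are written inline rather than stored in variables, each one needs its own parentheses, because `&` binds more tightly than `==` and `!=`. Column names are also case-sensitive, so the name used in the filter has to match the CSV header exactly.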