How to delete the current row in a pandas DataFrame during df.iterrows()

Question

While iterating with df.iterrows(), I want to delete the current row if a particular column of that row fails my if condition.

For example:

for index, row in df.iterrows():
    if row['A'] == 0:
        # remove/drop this row from the df
        del df[index]  # I tried this but it gives me an error

This is probably something very simple, but I still can't figure out how to do it. Any help is much appreciated!

Answer 1

I don't know whether this is pseudocode, but you can't delete a row like that. You can drop it instead:

In [425]:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': np.random.randn(5), 'b': np.random.randn(5)})
df
Out[425]:
          a         b
0 -1.348112  0.583603
1  0.174836  1.211774
2 -2.054173  0.148201
3 -0.589193 -0.369813
4 -1.156423 -0.967516
In [426]:

for index, row in df.iterrows():
    if row['a'] > 0:
        df.drop(index, inplace=True)
In [427]:

df
Out[427]:
          a         b
0 -1.348112  0.583603
2 -2.054173  0.148201
3 -0.589193 -0.369813
4 -1.156423 -0.967516
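
As a side note on why the attempt in the question raises an error: `del df[key]` looks `key` up among the column labels, whereas `drop` with its default axis treats the argument as a row label. The following is a minimal sketch with made-up data (the values and the extra column 'B' are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical toy data; column 'A' mirrors the question.
df = pd.DataFrame({'A': [0, 1, 0, 2], 'B': [10, 20, 30, 40]})

# `del df[key]` searches the column labels, so a row label such as 0 fails.
try:
    del df[0]
    del_error = None
except KeyError as exc:
    del_error = exc

# `drop` with the default axis=0 treats its argument as a row label
# and returns a new DataFrame without that row.
without_first_row = df.drop(0)

# With a column label, `del` removes that whole column in place.
del df['B']
```

Here `del_error` ends up holding the KeyError, `without_first_row` no longer contains the row labelled 0, and `df` is left with only column 'A'.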

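If you only want to filter out those rows, you can use boolean indexing:

df[df['a'] <= 0]

This achieves the same result. Note that it returns the kept rows as a new DataFrame, so assign it back to df if you want to keep the result.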
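
To make the equivalence concrete, here is a small sketch that runs both approaches on the same fixed data and compares the results. The values are made up so that the outcome is reproducible, and the names `looped` and `filtered` are only illustrative:

```python
import pandas as pd

# Hypothetical fixed values instead of random ones.
df = pd.DataFrame({'a': [-1.5, 0.2, -2.0, 0.7], 'b': [1, 2, 3, 4]})

# Loop version: drop each row whose 'a' is positive, working on a copy.
looped = df.copy()
for index, row in looped.iterrows():
    if row['a'] > 0:
        looped.drop(index, inplace=True)

# Boolean indexing keeps the complementary rows in one vectorized step.
filtered = df[df['a'] <= 0]

# Both frames should contain the same rows with the same labels.
print(filtered.equals(looped))
```

The boolean-indexing form avoids the Python-level loop entirely, which is usually the better choice when the condition can be written as a column expression.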

Answer 2

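I tried @EdChum's solution on my own pandas.DataFrame, but I couldn't get it to work because it raised an error: KeyError: '[78] not found in axis'. If you run into the same error, it can be fixed by looking up the label to drop through the DataFrame's index (df.index[index]) on every .iterrows() iteration.

The DataFrame used here was retrieved from investpy, which contains all the equities/stock data indexed on Investing.com, and the print function is the one implemented in pprint. Anyway, here is the code snippet that got it working: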

In [1]:

import investpy
from pprint import pprint

In [2]:

df = investpy.get_equities()

pprint(df.head())

Out [2]:

     country               name                           full_name  \
0  argentina            Tenaris                             Tenaris   
1  argentina       PETROBRAS ON     Petroleo Brasileiro - Petrobras   
2  argentina     GP Fin Galicia          Grupo Financiero Galicia B   
3  argentina  Ternium Argentina  Ternium Argentina Sociedad Anónima   
4  argentina      Pampa Energía                  Pampa Energía S.A.   

                      tag          isin     id currency  
0       tenaris?cid=13302  LU0156801721  13302      ARS  
1  petrobras-on?cid=13303  BRPETRACNOR9  13303      ARS  
2          gp-fin-galicia  ARP495251018  13304      ARS  
3                 siderar  ARSIDE010029  13305      ARS  
4           pampa-energia  ARP432631215  13306      ARS  

In [3]:

pprint(df[df['tag'] == 'koninklijke-philips-electronics'])

Out [3]:

      country                     name                   full_name  \
78  argentina  Koninklijke Philips DRC  Koninklijke Philips NV DRC   

                                tag          isin     id currency  
78  koninklijke-philips-electronics  ARDEUT110558  30044      ARS  

In [4]:

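for index, row in df.iterrows():
    if row['tag'] == 'koninklijke-philips-electronics':
        # Caveat: df.index[index] is a positional lookup. It matches this row only
        # while the index is the default 0..n-1 range and no row has been dropped yet.
        df.drop(df.index[index], inplace=True)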

In [5]:

pprint(df[df['tag'] == 'koninklijke-philips-electronics'])

Out [5]:

Empty DataFrame
Columns: [country, name, full_name, tag, isin, id, currency]
Index: []
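
A variant that does not depend on labels and positions lining up is to collect the matching labels first and drop them in one call after the loop. This is only a sketch under assumptions, not the approach used above: the small frame below is a hypothetical stand-in for the investpy table, with a deliberately non-default index and made-up tags.

```python
import pandas as pd

# Hypothetical stand-in for the equities table, with a non-default index
# so that row labels and row positions differ.
df = pd.DataFrame(
    {'tag': ['tenaris', 'siderar', 'tenaris', 'pampa-energia']},
    index=[10, 20, 30, 40],
)

# Using a label as a position breaks here: label 10 is not a valid
# position in a 4-row frame.
try:
    df.index[10]
    positional_ok = True
except IndexError:
    positional_ok = False

# Collect the labels of every matching row first ...
to_drop = [index for index, row in df.iterrows() if row['tag'] == 'tenaris']

# ... then drop them in a single call once the iteration has finished.
cleaned = df.drop(index=to_drop)
```

Because nothing is removed while the loop is running, every label passed to `drop` is still present in the index, and several matching rows are handled the same way as one.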

Hope this helps someone! And thanks anyway for your original answer, @EdChum!

python

pandas