Python Pandas: Applying changes to specific columns by column names

I have a Pandas DataFrame (df) in Python 2.7 that looks like this:

        name    flag  dummy_D  random  ID dummy_S  dummy_T
0       Mick  Purple        2     NaN   1      21       32
1       John     Red      NaN     NaN   2     w32        4
2  Christine     NaN        2     NaN   2     w33        3
3     Stevie     NaN        4     NaN   2     w34        2
4    Lindsey     NaN        5     NaN   2     w35      NaN

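I want to replace every NaN in the columns whose names contain 'dummy' with the previous value in that column (and only in those columns, leaving the rest of the DataFrame unchanged).

Here is what I did: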

dummycol = [col for col in df.columns if 'dummy' in col]

for d in dummycol:
    df[d] = df[d].fillna(method='pad')

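My question is:

Is there a better way in Pandas (in terms of both code and memory efficiency) to do this, rather than wasting memory on building a list and then looping over it? A one-liner would be great!

Thanks a lot!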

You can call str.startswith on the columns to get the cols of interest and then call fillna on all of those columns at once:

In [152]:
cols = df.columns[df.columns.str.startswith('dummy')]
df[cols] = df[cols].fillna(method='pad')
df

Out[152]:
        name    flag  dummy_D  random  ID dummy_S  dummy_T
0       Mick  Purple        2     NaN   1      21       32
1       John     Red        2     NaN   2     w32        4
2  Christine     NaN        2     NaN   2     w33        3
3     Stevie     NaN        4     NaN   2     w34        2
4    Lindsey     NaN        5     NaN   2     w35        2
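
For readers on a current pandas release, the same idea can be written with `ffill()`, since passing `method=` to `fillna` is deprecated in recent versions. The following is only a sketch: the DataFrame is rebuilt by hand from the table in the question, and storing `dummy_S` as strings is an assumption about the original data.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (dtypes are assumed).
df = pd.DataFrame({
    "name": ["Mick", "John", "Christine", "Stevie", "Lindsey"],
    "flag": ["Purple", "Red", np.nan, np.nan, np.nan],
    "dummy_D": [2, np.nan, 2, 4, 5],
    "random": [np.nan] * 5,
    "ID": [1, 2, 2, 2, 2],
    "dummy_S": ["21", "w32", "w33", "w34", "w35"],
    "dummy_T": [32, 4, 3, 2, np.nan],
})

# Select the columns whose name starts with "dummy" ...
cols = df.columns[df.columns.str.startswith("dummy")]

# ... and forward-fill only those columns in a single assignment.
df[cols] = df[cols].ffill()
```

Note that `str.startswith('dummy')` matches a prefix, whereas the question's `'dummy' in col` matches the substring anywhere in the name. For the column names shown here both select the same three columns.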

This avoids your list comprehension and loops over the columns only once:

for d in df.columns:
    if 'dummy' in d:
        df[d] = df[d].fillna(method='pad')

You can use a conditional list comprehension together with .loc:

_ = [df.loc[:, col].fillna(method='ffill', inplace=True) for col in df if col[:5] == 'dummy']

>>> df
        name    flag  dummy_D  random  ID dummy_S  dummy_T
0       Mick  Purple        2     NaN   1      21       32
1       John     Red        2     NaN   2     w32        4
2  Christine     NaN        2     NaN   2     w33        3
3     Stevie     NaN        4     NaN   2     w34        2
4    Lindsey     NaN        5     NaN   2     w35        2
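
The list comprehension above is run only for its side effects and relies on `inplace=True` propagating back to `df`, which may depend on the pandas version in use. As a hedged alternative, here is a non-mutating sketch that returns a new DataFrame instead. The helper name `ffill_matching` and its `marker` parameter are hypothetical, and the sample data is a shortened, hand-built version of the table in the question.

```python
import numpy as np
import pandas as pd


def ffill_matching(df, marker="dummy"):
    """Return a new DataFrame in which every column whose name contains
    `marker` is forward-filled; all other columns are left unchanged."""
    # DataFrame.filter(like=...) keeps columns whose name contains the substring.
    filled = df.filter(like=marker).ffill()
    # assign() overwrites same-named columns and keeps the column order.
    return df.assign(**filled)


# Shortened sample data (assumed, based on the question's table).
df = pd.DataFrame({
    "name": ["Mick", "John", "Christine", "Stevie", "Lindsey"],
    "flag": ["Purple", "Red", np.nan, np.nan, np.nan],
    "dummy_D": [2, np.nan, 2, 4, 5],
    "dummy_T": [32, 4, 3, 2, np.nan],
})

result = ffill_matching(df)
```

Because `assign` returns a copy, `df` itself still contains its original NaN values, so the result has to be bound to a name (or back to `df`). To match only a prefix rather than a substring, `df.filter(regex='^dummy')` could be used in place of `like=`.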