Set rows before a certain row in pandas without iterating

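I have a column in a pandas DataFrame, and I want to change any unbroken run of 'normal' values that comes directly before a 'large' into 'large'. I do not want to replace every instance of 'normal' in the column with 'large'. For example, I want to turn the left column below into the right column: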

input result
small small
small small
normal large
normal large
large large
small small
normal normal
small small
normal large
large large

This is easy to do by iterating:

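col = df.columns.get_loc('input')
for i in range(len(df)):
    if df.iloc[i, col] == 'large':
        index = i
        # Walk backwards over the unbroken run of 'normal' values
        while index > 0 and df.iloc[index - 1, col] == 'normal':
            df.iloc[index - 1, col] = 'large'
            index -= 1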

However, this is inefficient. Is there a concise, vectorized way to do this?

Here is one way to do it using bfill.

See the pandas documentation for fillna and bfill for further information.

import pandas as pd

input_frame = pd.Series([
    'small',
    'small',
    'normal',
    'normal',
    'large',
    'small',
    'normal',
    'small',
    'normal',
    'large',
]).to_frame()
# Change 'normal' to None in order to use bfill
input_frame = input_frame.replace({'normal': None})
input_frame['bfilled'] = input_frame[0].bfill()
# Change rows that were not bfilled to 'large' back to 'normal'
input_frame.loc[
    (input_frame['bfilled'] != 'large') & input_frame[0].isna(),
    'bfilled'
] = 'normal'
# Select the result, essentially drop the original column
result = input_frame['bfilled']
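
The same steps can also be wrapped in a small function that works on a Series and leaves the original data untouched. This is only a sketch: the function name promote_normal_before_large is hypothetical, and it assumes the column has no missing values of its own, since those would be back-filled through as well.

```python
import pandas as pd


def promote_normal_before_large(values: pd.Series) -> pd.Series:
    """Hypothetical helper: turn each run of 'normal' directly before 'large' into 'large'."""
    is_normal = values.eq('normal')
    # Hide 'normal' so bfill pulls the next non-'normal' value backwards
    filled = values.mask(is_normal).bfill()
    # Only 'normal' rows whose next non-'normal' value is 'large' are changed
    return values.mask(is_normal & filled.eq('large'), 'large')


example = pd.Series([
    'small', 'small', 'normal', 'normal', 'large',
    'small', 'normal', 'small', 'normal', 'large',
])
converted = promote_normal_before_large(example)
print(converted.tolist())
```

Because the mask is built from the original 'normal' positions, no second pass is needed to restore rows that were back-filled with something other than 'large'.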

Example output:

>>> input_frame = input_frame.replace({'normal': None})
>>> input_frame['bfilled'] = input_frame[0].bfill()
>>> input_frame
       0 bfilled
0  small   small
1  small   small
2   None   large
3   None   large
4  large   large
5  small   small
6   None   small  <--- This should be changed
7  small   small
8   None   large
9  large   large
>>> # Select the row(s) that need to be changed with
>>> input_frame.loc[(input_frame['bfilled'] != 'large') & input_frame[0].isna()]
      0 bfilled
6  None   small

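Try using bfill together with where (np is numpy):

df1.loc[df1['input'].eq("normal"), 'input'] = np.nan
filled = df1['input'].bfill()
# Keep the back-filled value only where it is an original value or 'large'
df1['result'] = filled.where(df1['input'].notna() | filled.eq('large'), 'normal')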

df1:

    input   result
0   small   small
1   small   small
2   NaN     large
3   NaN     large
4   large   large
5   small   small
6   NaN     normal
7   small   small
8   NaN     large
9   large   large
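
A different vectorized route, not taken from either answer above, is to label each unbroken run of equal values and look at the value that follows the run. The sketch below assumes a column without missing values; the variable names run_id and next_val are illustrative only.

```python
import pandas as pd

s = pd.Series([
    'small', 'small', 'normal', 'normal', 'large',
    'small', 'normal', 'small', 'normal', 'large',
])

# A new run starts wherever a value differs from the previous row
run_id = s.ne(s.shift()).cumsum()
# The last entry of shift(-1) inside each run is the first value after that run
next_val = s.shift(-1).groupby(run_id).transform('last')
# Promote a 'normal' run only when the value right after it is 'large'
result = s.mask(s.eq('normal') & next_val.eq('large'), 'large')
print(result.tolist())
```

This avoids turning 'normal' into NaN, at the cost of a groupby over the run labels.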