Set rows before a certain row in pandas without iterating
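I have a column in a pandas DataFrame, and I want to change any unbroken run of 'normal' that immediately precedes a 'large' to 'large'. I don't want to replace all instances of 'normal' in the column with 'large'. For example, I want to turn the left column below into the right column: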
input | result |
---|---|
small | small |
small | small |
normal | large |
normal | large |
large | large |
small | small |
normal | normal |
small | small |
normal | large |
large | large |
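Iterating is straightforward:
for i, v in enumerate(df['input'].tolist()):
    if v == 'large':
        index = i
        # Walk backwards over the unbroken run of 'normal' just before this 'large'
        while index > 0 and df['input'].iloc[index - 1] == 'normal':
            df.iloc[index - 1, df.columns.get_loc('input')] = 'large'
            index -= 1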
However, this is inefficient. Is there a concise, vectorized way to do this?
Here is one way to do it using bfill.
See the pandas docs for fillna for further information.
import pandas as pd
input_frame = pd.Series([
    'small',
    'small',
    'normal',
    'normal',
    'large',
    'small',
    'normal',
    'small',
    'normal',
    'large',
]).to_frame()
# Change 'normal' to None in order to use bfill
input_frame = input_frame.replace({'normal': None})
input_frame['bfilled'] = input_frame[0].bfill()
# Change rows that were not bfilled to 'large' back to 'normal'
input_frame.loc[
    (input_frame['bfilled'] != 'large') & input_frame[0].isna(),
    'bfilled'
] = 'normal'
# Select the result, essentially drop the original column
result = input_frame['bfilled']
Example output:
>>> input_frame = input_frame.replace({'normal': None})
>>> input_frame['bfilled'] = input_frame[0].bfill()
>>> input_frame
       0  bfilled
0  small    small
1  small    small
2   None    large
3   None    large
4  large    large
5  small    small
6   None    small  <--- This should be changed
7  small    small
8   None    large
9  large    large
>>> # Select the row(s) that need to be changed with
>>> input_frame.loc[(input_frame['bfilled'] != 'large') & input_frame[0].isna()]
      0  bfilled
6  None    small
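The steps above can also be folded into one helper that works on a Series directly, without the intermediate column. This is only a minimal sketch, not the answerer's code: the function name `promote_normal_runs` and its `normal` / `large` parameters are hypothetical, and it assumes the column holds plain strings with no missing values to begin with.

```python
import pandas as pd


def promote_normal_runs(s: pd.Series, normal="normal", large="large") -> pd.Series:
    """Replace each unbroken run of `normal` that directly precedes `large`.

    Hypothetical helper for illustration; assumes `s` has no missing values.
    """
    # Hide the 'normal' entries, then back-fill so that each hidden entry
    # sees the first non-'normal' value that follows it.
    next_value = s.mask(s.eq(normal)).bfill()
    # Only 'normal' entries whose next non-'normal' value is 'large' change.
    to_promote = s.eq(normal) & next_value.eq(large)
    return s.mask(to_promote, large)


s = pd.Series(["small", "small", "normal", "normal", "large",
               "small", "normal", "small", "normal", "large"])
result = promote_normal_runs(s)
print(result.tolist())
```

Because the mask is built from the original values, a run of 'normal' that ends the column or is followed by anything other than 'large' is left untouched.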
Try using bfill together with where:
import numpy as np
df1.loc[df1['input'].eq("normal"), 'input'] = np.nan
filled = df1['input'].bfill()  # each NaN takes the next non-'normal' value
df1['result'] = filled.where(df1['input'].notna() | filled.eq('large')).fillna('normal')
df1:
   input  result
0  small   small
1  small   small
2    NaN   large
3    NaN   large
4  large   large
5  small   small
6    NaN  normal
7  small   small
8    NaN   large
9  large   large
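One way to gain some confidence in a vectorized version is to compare it with a plain loop on random inputs. The sketch below is an assumption-laden illustration rather than part of either answer: `loop_reference` and `vectorized` are hypothetical names, the loop is a backward scan that should give the same result as the loop in the question, and the three category labels are taken from the example table.

```python
import random

import pandas as pd


def loop_reference(values):
    """Plain-Python reference: scan backwards, remembering whether the
    nearest non-'normal' value to the right is 'large'."""
    out = list(values)
    next_is_large = False
    for i in range(len(out) - 1, -1, -1):
        if out[i] == "normal":
            if next_is_large:
                out[i] = "large"
        else:
            next_is_large = out[i] == "large"
    return out


def vectorized(values):
    """bfill-based version: hide 'normal', back-fill, promote where 'large'."""
    s = pd.Series(values, dtype=object)
    nxt = s.mask(s.eq("normal")).bfill()
    return s.mask(s.eq("normal") & nxt.eq("large"), "large").tolist()


random.seed(0)
for _ in range(200):
    sample = random.choices(["small", "normal", "large"],
                            k=random.randint(1, 30))
    assert vectorized(sample) == loop_reference(sample)
print("vectorized result matches the loop on all samples")
```

Random sequences are likely to include cases the ten-row example does not cover, such as several 'normal' rows in a row followed by 'small', or a column that ends in 'normal'.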