Iterating over rows in pandas, shifting values to the right by one

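I need to iterate over a DataFrame so that the values from the first date are copied into each following row, shifted one column to the right each time and always starting from the same first value. Importantly, in the example below the shifted values must stay within the existing columns and not spill past day5.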

Current output:

            day1  day2  day3  day4  day5
date                                    
2018-03-16   1.0   2.0   3.0   4.0   5.0
2018-03-17   NaN   NaN   NaN   NaN   NaN
2018-03-18   NaN   NaN   NaN   NaN   NaN
2018-03-19   NaN   NaN   NaN   NaN   NaN
2018-03-20   NaN   NaN   NaN   NaN   NaN

Desired output:

            day1  day2  day3  day4  day5
date                                    
2018-03-16   1.0   2.0   3.0   4.0   5.0
2018-03-17   NaN   1.0   2.0   3.0   4.0
2018-03-18   NaN   NaN   1.0   2.0   3.0
2018-03-19   NaN   NaN   NaN   1.0   2.0
2018-03-20   NaN   NaN   NaN   NaN   1.0

Sample code that builds the frame to iterate over:

import pandas as pd

data = [1, 2, 3, 4, 5]
columns_name = ['day1', 'day2', 'day3', 'day4', 'day5']

# Single row holding the values for the first date
df = pd.DataFrame(data)
df = df.T
df.columns = columns_name

# One row per date
dates = pd.date_range('2018-03-16', '2018-03-20').tolist()
dates_df = pd.DataFrame(dates)
dates_df.columns = ['date']

# Concatenate side by side; the four dates without data are filled with NaN
dfs = [df, dates_df]
combined = pd.concat(dfs, axis=1)
combined = combined.set_index(['date'])

numpy.triu_indices

Index the first row with the upper-triangle indices and assign the result into the upper triangle.

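import numpy as np

# df here is the date-indexed frame from the question (named `combined` there)
v = df.values
i, j = np.triu_indices(v.shape[1])  # row and column indices of the upper triangle
v[i, j] = v[0][j - i]               # j - i is the offset into the first row
df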

            day1  day2  day3  day4  day5
date                                    
2018-03-16   1.0   2.0   3.0   4.0   5.0
2018-03-17   NaN   1.0   2.0   3.0   4.0
2018-03-18   NaN   NaN   1.0   2.0   3.0
2018-03-19   NaN   NaN   NaN   1.0   2.0
2018-03-20   NaN   NaN   NaN   NaN   1.0
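
To see why the assignment works, it can help to print the index arrays for a small case. The sketch below uses a 3x3 array, which is an illustrative size and not taken from the question; the names `first_row` and `out` are likewise only for this example.

```python
import numpy as np

n = 3  # illustrative size; the question has 5 columns
i, j = np.triu_indices(n)

# (i, j) enumerates the upper triangle row by row, diagonal included
print(list(zip(i.tolist(), j.tolist())))

# j - i is the distance from the diagonal, which is also the position
# in the first row that each cell should copy from
print((j - i).tolist())

first_row = np.array([1.0, 2.0, 3.0])
out = np.full((n, n), np.nan)
out[i, j] = first_row[j - i]
print(out)
```

Each cell on the diagonal has offset 0 and so receives the first value, the cells one step to the right of the diagonal receive the second value, and so on.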

If that does not work for you because df.values is a copy rather than a view, write the array back explicitly:

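import numpy as np

v = df.values.copy()  # explicit copy, so the array is writable in any case
i, j = np.triu_indices(v.shape[1])
v[i, j] = v[0][j - i]
df.loc[:] = v         # write the filled array back into the frame
df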

            day1  day2  day3  day4  day5
date                                    
2018-03-16   1.0   2.0   3.0   4.0   5.0
2018-03-17   NaN   1.0   2.0   3.0   4.0
2018-03-18   NaN   NaN   1.0   2.0   3.0
2018-03-19   NaN   NaN   NaN   1.0   2.0
2018-03-20   NaN   NaN   NaN   NaN   1.0
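
If you would rather not depend on whether df.values is a view or a copy, one option is to build the result in a fresh array and wrap it in a new DataFrame. This is only a sketch: the helper name `shift_first_row_down` and the variable `demo` are made up for illustration, and it assumes all columns are numeric.

```python
import numpy as np
import pandas as pd


def shift_first_row_down(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a new frame whose row k is the first row shifted right by k."""
    first = frame.iloc[0].to_numpy(dtype=float)
    rows, cols = frame.shape
    out = np.full((rows, cols), np.nan)
    # Upper-triangle positions of a rows x cols array (m sets the column count)
    i, j = np.triu_indices(rows, m=cols)
    out[i, j] = first[j - i]
    return pd.DataFrame(out, index=frame.index, columns=frame.columns)


# Rebuild the frame from the question: first row filled, the rest NaN
dates = pd.date_range('2018-03-16', '2018-03-20', name='date')
demo = pd.DataFrame(np.nan, index=dates,
                    columns=['day1', 'day2', 'day3', 'day4', 'day5'])
demo.iloc[0] = [1, 2, 3, 4, 5]

result = shift_first_row_down(demo)
print(result)
```

Because the original frame is never modified, this behaves the same regardless of how pandas hands out the underlying array.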

numpy.lib.stride_tricks.as_strided

Not recommended, but still fun.

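import numpy as np
from numpy.lib.stride_tricks import as_strided as strided

n = df.shape[1]
# Pad the first row with n - 1 leading NaNs
v = np.append(np.full(n - 1, np.nan), df.values[0])
s = v.strides[0]

# Start at the first real value; each row steps one element back into the padding.
# Only valid while df has at most n rows, otherwise as_strided reads out of bounds.
df.loc[:] = strided(v[n - 1:], df.shape, (-s, s))

df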

            day1  day2  day3  day4  day5
date                                    
2018-03-16   1.0   2.0   3.0   4.0   5.0
2018-03-17   NaN   1.0   2.0   3.0   4.0
2018-03-18   NaN   NaN   1.0   2.0   3.0
2018-03-19   NaN   NaN   NaN   1.0   2.0
2018-03-20   NaN   NaN   NaN   NaN   1.0
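
A bounds-checked way to get the same layout is `sliding_window_view`, which also lives in `numpy.lib.stride_tricks`. This is a different function from `as_strided`, shown here only as a sketch; it assumes NumPy 1.20 or newer, and the names `first_row`, `padded`, and `shifted` are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

first_row = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
n = first_row.size

# Pad with n - 1 NaNs on the left, as in the as_strided version
padded = np.concatenate([np.full(n - 1, np.nan), first_row])

# windows[k] is padded[k:k + n]; the last window is the unshifted row
windows = sliding_window_view(padded, n)

# Reverse so that row k is the first row shifted right by k,
# and copy because the window view is read-only
shifted = windows[::-1].copy()
print(shifted)
```

The resulting array can then be written back with `df.loc[:] = shifted`, provided the frame has the same 5x5 shape.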