How to merge two consecutive rows and form a new column?

Question

I have a DataFrame (collected from accounting software) that looks like this.


    Serial || Date || Particulars || Price
    --------------------------------------
      1    || 0308 || Andrew      || 100
      2    || NaN  || Gloves      || NaN
      3    || 0408 || Johnson     || 50
      4    || NaN  || Wicket      || NaN

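I want to merge each pair of consecutive rows and create a new column 'Product' that holds the 'Particulars' value from the second row of the pair. The expected output should look like this: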

    Serial || Date || Particulars || Price || Product
    -------------------------------------------------
      1    || 0308 || Andrew      || 100   || Gloves
      3    || 0408 || Johnson     || 50    || Wicket

How can I achieve this with pandas?

Answer 1

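These answers rely on the DataFrame always presenting pairs of rows that follow the same pattern as in the OP's example: the first row of each pair holds a person, and the second row holds a product, with the Date and Price columns being NaN.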

Use `shift`, then `dropna`

df.assign(Product=df.Particulars.shift(-1)).dropna()

   Serial   Date Particulars  Price Product
0       1  308.0      Andrew  100.0  Gloves
2       3  408.0     Johnson   50.0  Wicket
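For anyone who wants to run this end to end, here is a minimal sketch. The sample frame is rebuilt by hand from the question, so the exact dtypes (numeric Date and Price) are an assumption rather than what the accounting export really contains.

```python
import numpy as np
import pandas as pd

# Sample data typed in from the question (dtypes are an assumption).
df = pd.DataFrame({
    "Serial": [1, 2, 3, 4],
    "Date": [308, np.nan, 408, np.nan],
    "Particulars": ["Andrew", "Gloves", "Johnson", "Wicket"],
    "Price": [100, np.nan, 50, np.nan],
})

# Copy each row's successor into a new column, then drop the product rows,
# which are the only rows that still contain NaN.
result = df.assign(Product=df["Particulars"].shift(-1)).dropna()
print(result)
```

Note that `assign` returns a new frame, so `df` itself is left unchanged.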

`join`

Same idea, different method

df.join(df.Particulars.shift(-1).rename('Product')).dropna()
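A runnable sketch of the `join` variant, again on a hand-built copy of the question's data (an assumption about the real dtypes). `join` aligns on the index, and the shifted Series keeps the original index, so the renamed column lines up row by row.

```python
import numpy as np
import pandas as pd

# Sample data typed in from the question (dtypes are an assumption).
df = pd.DataFrame({
    "Serial": [1, 2, 3, 4],
    "Date": [308, np.nan, 408, np.nan],
    "Particulars": ["Andrew", "Gloves", "Johnson", "Wicket"],
    "Price": [100, np.nan, 50, np.nan],
})

# The Series name becomes the column name after the join,
# so rename it before joining.
shifted = df["Particulars"].shift(-1).rename("Product")
result = df.join(shifted).dropna()
print(result)
```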

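Details

As requested

`df.Particulars.shift(-1)` shifts every value of the Particulars column up by one row, so each row now holds the value from the row below it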

0     Gloves
1    Johnson
2     Wicket
3        NaN
Name: Particulars, dtype: object

When I assign that to the existing DataFrame with `df.assign(Product=df.Particulars.shift(-1))`, it adds a new column named 'Product' whose values are the shifted Particulars.

   Serial   Date Particulars  Price  Product
0       1  308.0      Andrew  100.0   Gloves
1       2    NaN      Gloves    NaN  Johnson
2       3  408.0     Johnson   50.0   Wicket
3       4    NaN      Wicket    NaN      NaN

All that remains is to drop the rows that contain NaN values, and we get the result shown above.
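One caveat worth sketching: a bare `dropna()` drops a row if any column is NaN, so a person row with a missing Date would vanish too. The frame below is a hypothetical variation of the question's data (the first Date is deliberately blanked out), and restricting the check to the Price column via `subset` is my suggestion, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical variation: the first person row has no Date.
df = pd.DataFrame({
    "Serial": [1, 2, 3, 4],
    "Date": [np.nan, np.nan, 408, np.nan],
    "Particulars": ["Andrew", "Gloves", "Johnson", "Wicket"],
    "Price": [100, np.nan, 50, np.nan],
})

shifted = df.assign(Product=df["Particulars"].shift(-1))

# A bare dropna() also discards Andrew's row because of the missing Date.
plain = shifted.dropna()

# Checking only Price keeps every priced row, whatever else is missing.
safer = shifted.dropna(subset=["Price"])
print(safer)
```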

Inspired by

I don't need to rely on `dropna` if I slice every other row

df.assign(Product=df.Particulars.shift(-1))[::2]

Or, more concisely

df[::2].assign(Product=[*df.Particulars[1::2]])
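A sketch of the two slicing variants on a hand-built copy of the question's data. I use `.iloc` here to make the positional slicing explicit, which is a small swap from the code above. It also shows why the odd rows are unpacked into a list: passing the Series directly would align on the index (labels 1 and 3 against 0 and 2) and yield only NaN.

```python
import numpy as np
import pandas as pd

# Sample data typed in from the question (dtypes are an assumption).
df = pd.DataFrame({
    "Serial": [1, 2, 3, 4],
    "Date": [308, np.nan, 408, np.nan],
    "Particulars": ["Andrew", "Gloves", "Johnson", "Wicket"],
    "Price": [100, np.nan, 50, np.nan],
})

# Variant 1: shift, then keep rows at positions 0, 2, 4, ...
by_slice = df.assign(Product=df["Particulars"].shift(-1)).iloc[::2]

# Variant 2: take the even rows and attach the odd rows' values as a plain list.
by_list = df.iloc[::2].assign(Product=[*df["Particulars"].iloc[1::2]])

# Without the list conversion, index alignment leaves the column empty.
misaligned = df.iloc[::2].assign(Product=df["Particulars"].iloc[1::2])
print(by_list)
```

Both variants assume strict alternation and an even number of rows; with an odd row count the list in variant 2 would be one element short.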

One approach

This was the first approach I thought of, and it's gross

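import numpy as np

i = np.flatnonzero(df.Price.notna())  # positions of the priced (person) rows
j = i + 1  # the product row directly follows each one

df.iloc[i].assign(Product=df.iloc[j].Particulars.values)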

   Serial   Date Particulars  Price Product
0       1  308.0      Andrew  100.0  Gloves
2       3  408.0     Johnson   50.0  Wicket
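A self-contained sketch of the positional approach, on a hand-built copy of the question's data. It only assumes that every priced row is immediately followed by its product row; if the last row had a price, `i + 1` would point past the end of the frame and `iloc` would raise an `IndexError`.

```python
import numpy as np
import pandas as pd

# Sample data typed in from the question (dtypes are an assumption).
df = pd.DataFrame({
    "Serial": [1, 2, 3, 4],
    "Date": [308, np.nan, 408, np.nan],
    "Particulars": ["Andrew", "Gloves", "Johnson", "Wicket"],
    "Price": [100, np.nan, 50, np.nan],
})

# Integer positions of the rows that have a price.
i = np.flatnonzero(df["Price"].notna().to_numpy())
# Each product row sits one position further down.
j = i + 1

# to_numpy() strips the index so the values are assigned by position.
result = df.iloc[i].assign(Product=df.iloc[j]["Particulars"].to_numpy())
print(result)
```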

Answer 2

Ugly but straightforward:

ans = df[~pd.isna(df.Date)].copy()
ans['product'] = df[pd.isna(df.Date)].Particulars.values

Output

        Date  Particulars  Price  product
Serial                                  
1       308.0      Andrew  100.0  Gloves
3       408.0     Johnson   50.0  Wicket
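A runnable sketch of this mask-based version. The output above has Serial as the index, so I assume the frame was indexed with `set_index("Serial")`; that step and the hand-typed sample data are my assumptions.

```python
import numpy as np
import pandas as pd

# Sample data typed in from the question, indexed by Serial (assumed).
df = pd.DataFrame({
    "Serial": [1, 2, 3, 4],
    "Date": [308, np.nan, 408, np.nan],
    "Particulars": ["Andrew", "Gloves", "Johnson", "Wicket"],
    "Price": [100, np.nan, 50, np.nan],
}).set_index("Serial")

has_date = df["Date"].notna()

# Person rows are the ones with a date; copy to avoid modifying a view.
ans = df[has_date].copy()

# Product rows are the rest; to_numpy() assigns by position, not by label.
ans["product"] = df.loc[~has_date, "Particulars"].to_numpy()
print(ans)
```

This needs the two groups to be the same size. If there are more dated rows than undated ones (or the reverse), the assignment raises a `ValueError` for mismatched lengths.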

Answer 3

Try `shift`, then keep every other row (dropping the second row of each pair):

df['Product'] = df['Particulars'].shift(-1)
df = df.loc[0:len(df):2]
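Since `.loc` slices by label, the snippet above relies on the default 0..n-1 index. A sketch of a positional alternative using `.iloc[::2]` follows; the custom index below is hypothetical, chosen only to show that positions rather than labels are used.

```python
import numpy as np
import pandas as pd

# Question's data with a hypothetical non-default index.
df = pd.DataFrame({
    "Serial": [1, 2, 3, 4],
    "Date": [308, np.nan, 408, np.nan],
    "Particulars": ["Andrew", "Gloves", "Johnson", "Wicket"],
    "Price": [100, np.nan, 50, np.nan],
}, index=[10, 11, 12, 13])

df["Product"] = df["Particulars"].shift(-1)

# Keep positions 0, 2, ... regardless of what the index labels are.
df = df.iloc[::2]
print(df)
```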
