Generate rows to make a sequence in a column of a dataframe

Question

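I am trying to generate new rows based on the values in a particular column. In the current data, you can see that the 'days_left' column does not contain every consecutive value.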

import pandas as pd

current = {'assignment': [1,1,1,1,2,2,2,2,2], 'days_left': [1, 2, 5, 9, 1, 3, 4, 8, 13]}
dfcurrent = pd.DataFrame(data=current)
dfcurrent

Instead, I would like to generate rows in this dataframe so that each 'assignment' has a consecutive list of 'days_left' values. Please see the desired output below:

desired = {'assignment': [1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,2],
           'days_left': [1,2,3,4,5,6,7,8,9,1,2,3,4,5,6,7,8,9,10,11,12,13]}
dfdesired = pd.DataFrame(data=desired)
dfdesired

Note: the original data is much larger and has other columns; I have simplified it for this question.

Could you help me with this?

Thank you very much!

Answer 1

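You can iterate over the rows of the current dataframe and build a new dataframe. For each gap in days_left between two consecutive rows, copy the current row into the new dataframe once per missing value and update its days_left value.

Try this code: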

import pandas as pd

current = {'assignment': [1,1,1,1,2,2,2,2,2], 'days_left': [1, 2, 5, 9, 1, 3, 4, 8, 13]}
dfc = pd.DataFrame(data=current)

rows = []  # collect rows in a list; DataFrame.append was removed in pandas 2.0

for r in range(1, len(dfc)):  # start at 2nd row
    prev, cur = dfc.iloc[r-1], dfc.iloc[r]
    if prev['assignment'] == cur['assignment']:  # only fill gaps inside one assignment
        for i in range(prev['days_left'], cur['days_left']):  # fill gap of missing numbers
            row = cur.copy()  # copy row
            row['days_left'] = i  # update column value
            rows.append(row)
    if r == len(dfc)-1 or dfc.iloc[r+1]['assignment'] != cur['assignment']:  # last entry in assignment
        rows.append(cur.copy())  # copy row

dfd = pd.DataFrame(rows).reset_index(drop=True)  # new dataframe, without duplicated index labels
dfd = dfd.astype(int)  # convert all data to integers
print(dfd.to_string(index=False))

Output

 assignment  days_left
          1          1
          1          2
          1          3
          1          4
          1          5
          1          6
          1          7
          1          8
          1          9
          2          1
          2          2
          2          3
          2          4
          2          5
          2          6
          2          7
          2          8
          2          9
          2         10
          2         11
          2         12
          2         13
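
Since the question mentions that the real data is much larger, a loop-free variant may be worth considering. The following is only a sketch and not part of the answer above: it assumes each assignment should run from its smallest to its largest 'days_left' value present, builds that full range per assignment, and left-merges the original rows onto it. The names `pieces`, `full` and `other_cols` are illustrative. Back-filling the extra columns mirrors the loop above, which copies the next existing row; whether that is the right fill rule for the real data is an assumption.

```python
import pandas as pd

current = {'assignment': [1, 1, 1, 1, 2, 2, 2, 2, 2],
           'days_left': [1, 2, 5, 9, 1, 3, 4, 8, 13]}
dfc = pd.DataFrame(data=current)

# Build one small frame per assignment holding every day between
# the smallest and largest days_left value present in that group.
pieces = [
    pd.DataFrame({
        'assignment': a,
        'days_left': list(range(g['days_left'].min(), g['days_left'].max() + 1)),
    })
    for a, g in dfc.groupby('assignment')
]
full = pd.concat(pieces, ignore_index=True)

# Left-merge the original rows onto the full grid; days that were
# missing become new rows with NaN in any additional columns.
dfd = full.merge(dfc, on=['assignment', 'days_left'], how='left')

# Assumed fill rule: take values from the next existing row of the
# same assignment. With only the two key columns this is a no-op.
other_cols = [c for c in dfd.columns if c not in ('assignment', 'days_left')]
if other_cols:
    dfd[other_cols] = dfd.groupby('assignment')[other_cols].bfill()

print(dfd.to_string(index=False))
```

On the sample data this prints the same 22 rows as the output above. Columns that receive NaN during the merge are upcast to float, so an integer column may need converting back after the fill.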
