在多索引列数据框中的列末尾添加一个值
Adding a value at the end of a column in a multindex column dataframe
我有一个简单的问题,可能有一个简单的解决方案,但我找不到任何地方。我有以下多索引列数据框:
mux = pd.MultiIndex.from_product(['A','B','C'], ['Datetime', 'Str', 'Ret']])
dfr = pd.DataFrame(columns=mux)
| A | B | C |
|Datetime|Str|Ret|Datetime|Str|Ret|Datetime|Str|Ret|
我需要在特定子列的末尾逐一添加值。例如,在 A 列子列 Datetime 的末尾添加一个值并保留该行的其余部分不变,然后将另一个值添加到 B 列子列 Str 并再次保留同一行中的其余值不变等等。所以我的问题是:是否可以针对此类数据框中的各个位置?如何?并且是否可以在不知道结束位置的情况下始终在前一个值之后追加一个完整的行而不是一个单独的值?非常感谢您的回答。
IIUC,可以使用.loc
:
idx = len(dfr) # get the index of the next row after the last one
dfr.loc[idx, ('A', 'Datetime')] = pd.to_datetime('2021-09-24')
dfr.loc[idx, ('B', 'Str')] = 'Hello'
dfr.loc[idx, ('C', 'Ret')] = 4.3
输出:
>>> dfr
A B C
Datetime Str Ret Datetime Str Ret Datetime Str Ret
0 2021-09-24 00:00:00 NaN NaN NaN Hello NaN NaN NaN 4.3
更新
I mean for example when I have different number of values in different columns (for example 6 values in column A-Str but only 4 in column B-Datetime) but I don´t really know. In that case what I need is to add the next value in that column after the last one so I need to know the index of the last non Nan value of that particular column so I can use it in your answer because if I use len(dfr) while trying to add value to the column that only has 4 values it will end up in the 7th row instead of the 5th row, this is because one of the columns may have more values than the others.
您可以使用 last_valid_index
轻松完成。创建一个方便的函数 append_to_col
以将值附加到数据框中:
def append_to_col(col, val):
idx = dfr[col].last_valid_index()
dfr.loc[idx+1 if idx is not None else 0, col] = val
# Fill your dataframe
append_to_col(('A', 'Datetime'), '2021-09-24')
append_to_col(('A', 'Datetime'), '2021-09-25')
append_to_col(('B', 'Str'), 'Hello')
append_to_col(('C', 'Ret'), 4.3)
append_to_col(('C', 'Ret'), 8.2)
append_to_col(('A', 'Datetime'), '2021-09-26')
输出:
>>> dfr
A B C
Datetime Str Ret Datetime Str Ret Datetime Str Ret
0 2021-09-24 NaN NaN NaN Hello NaN NaN NaN 4.3
1 2021-09-25 NaN NaN NaN NaN NaN NaN NaN 8.2
2 2021-09-26 NaN NaN NaN NaN NaN NaN NaN NaN
我有一个简单的问题,可能有一个简单的解决方案,但我找不到任何地方。我有以下多索引列数据框:
mux = pd.MultiIndex.from_product(['A','B','C'], ['Datetime', 'Str', 'Ret']])
dfr = pd.DataFrame(columns=mux)
| A | B | C |
|Datetime|Str|Ret|Datetime|Str|Ret|Datetime|Str|Ret|
我需要在特定子列的末尾逐一添加值。例如,在 A 列子列 Datetime 的末尾添加一个值并保留该行的其余部分不变,然后将另一个值添加到 B 列子列 Str 并再次保留同一行中的其余值不变等等。所以我的问题是:是否可以针对此类数据框中的各个位置?如何?并且是否可以在不知道结束位置的情况下始终在前一个值之后追加一个完整的行而不是一个单独的值?非常感谢您的回答。
IIUC,可以使用.loc
:
idx = len(dfr) # get the index of the next row after the last one
dfr.loc[idx, ('A', 'Datetime')] = pd.to_datetime('2021-09-24')
dfr.loc[idx, ('B', 'Str')] = 'Hello'
dfr.loc[idx, ('C', 'Ret')] = 4.3
输出:
>>> dfr
A B C
Datetime Str Ret Datetime Str Ret Datetime Str Ret
0 2021-09-24 00:00:00 NaN NaN NaN Hello NaN NaN NaN 4.3
更新
I mean for example when I have different number of values in different columns (for example 6 values in column A-Str but only 4 in column B-Datetime) but I don´t really know. In that case what I need is to add the next value in that column after the last one so I need to know the index of the last non Nan value of that particular column so I can use it in your answer because if I use len(dfr) while trying to add value to the column that only has 4 values it will end up in the 7th row instead of the 5th row, this is because one of the columns may have more values than the others.
您可以使用 last_valid_index
轻松完成。创建一个方便的函数 append_to_col
以将值附加到数据框中:
def append_to_col(col, val):
idx = dfr[col].last_valid_index()
dfr.loc[idx+1 if idx is not None else 0, col] = val
# Fill your dataframe
append_to_col(('A', 'Datetime'), '2021-09-24')
append_to_col(('A', 'Datetime'), '2021-09-25')
append_to_col(('B', 'Str'), 'Hello')
append_to_col(('C', 'Ret'), 4.3)
append_to_col(('C', 'Ret'), 8.2)
append_to_col(('A', 'Datetime'), '2021-09-26')
输出:
>>> dfr
A B C
Datetime Str Ret Datetime Str Ret Datetime Str Ret
0 2021-09-24 NaN NaN NaN Hello NaN NaN NaN 4.3
1 2021-09-25 NaN NaN NaN NaN NaN NaN NaN 8.2
2 2021-09-26 NaN NaN NaN NaN NaN NaN NaN NaN