Assign selected values to Pandas Series using dictionary

Question

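I have a DataFrame and would like to overwrite one of its rows with new values stored in several separate dictionaries.

Here is a setup similar to what I mean: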

In [1]: import pandas as pd

In [2]: data = {'A': range(3), 'B': range(3, 0, -1), 'C': [4, 0, 2]}

In [3]: df = pd.DataFrame(data)

In [4]: df
Out[4]: 
   A  B  C
0  0  3  4
1  1  2  0
2  2  1  2    # Let's say I want to put the new values in this row.

In [5]: d1 = {'A': 1, 'C': 1}

In [6]: d2 = {'B': 2}

Desired result:

In [11]: df
Out[11]: 
   A  B  C
0  0  3  4
1  1  2  0
2  1  2  1

Basically, I need a way to take the values from multiple dictionaries and insert them into a Pandas Series.

What I have tried:

In [15]: df.loc[2, :] = d1

In [16]: df.loc[2, :] = d2

(does not work)
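One way around this, sketched under the assumption that the assignments above fail because each dictionary names only some of the columns, is to restrict each assignment to the columns that dictionary actually contains. The loop and the `cols` variable are illustrative and not part of the original post.

```python
import pandas as pd

df = pd.DataFrame({'A': range(3), 'B': range(3, 0, -1), 'C': [4, 0, 2]})
d1 = {'A': 1, 'C': 1}
d2 = {'B': 2}

# Assign each dictionary separately, touching only the columns it names.
for d in (d1, d2):
    cols = list(d)                          # column labels present in this dict
    df.loc[2, cols] = [d[c] for c in cols]  # values in the same order as cols

print(df)
```

This avoids building a merged dictionary at all, at the cost of one assignment per dictionary.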

In [24]: def merge_dicts(list_of_dicts):
    ...:     """Merges the dictionaries into one."""
    ...:     new_dict = list_of_dicts[0].copy()
    ...:     for e in list_of_dicts[1:]:
    ...:         new_dict.update(e)
    ...:     return new_dict
    ...:

In [25]: merge_dicts([d1, d2])
Out[25]: {'A': 1, 'C': 1, 'B': 2}

In [26]: df.loc[2, :] = merge_dicts([d1, d2])

(works, but there must be a simpler way)

Note that I am using Python 3.4 or lower, so I cannot do the following (dict unpacking requires Python 3.5+):

In [10]: df.loc[2,:] = {**d1, **d2}

Update:

Another subpar solution:

In [9]: pd.Series(d1).combine_first(pd.Series(d2)).combine_first(df.loc[2, :])
Out[9]: 
A    1.0
B    2.0
C    1.0
dtype: float64

Answer 1

I think you can use update in a loop:

result = {}
for d in [d1, d2]:
    result.update(d)

df.loc[2,:] = result

Or a generator expression converted to a dict:

df.loc[2,:] = dict(pair for d in [d1, d2] for pair in d.items())

Or a dict comprehension:

df.loc[2,:] = {k: v for d in [d1, d2] for k, v in d.items()}

print(df)
   A  B  C
0  0  3  4
1  1  2  0
2  1  2  1
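For reference, a self-contained sketch of the dict-comprehension variant above. Wrapping the merged dictionary in a `pd.Series` is an addition here, made on the assumption that label alignment is wanted so that key order does not matter; the name `merged` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': range(3), 'B': range(3, 0, -1), 'C': [4, 0, 2]})
d1 = {'A': 1, 'C': 1}
d2 = {'B': 2}

# Merge the dictionaries; on a duplicate key the later dictionary wins.
merged = {k: v for d in (d1, d2) for k, v in d.items()}

# Assigning a Series aligns its index with the column labels,
# so the order of the keys in `merged` is irrelevant.
df.loc[2, :] = pd.Series(merged)

print(df)
```

Dict comprehensions are available from Python 2.7 and 3.0 onward, so this should also run on Python 3.4.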

Answer 2

Here is another solution:

from functools import reduce  # reduce is not a builtin on Python 3
df.loc[2,:] = reduce(pd.Series.combine_first, [pd.Series(d) for d in (d1, d2)])

This works if the keys of d1 and d2 are mutually exclusive.

I timed it, though, and it is not as fast as @jezrael's solution.
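To illustrate why the mutual-exclusivity caveat matters, here is a small sketch comparing the two approaches when keys overlap. The dictionary `d3` is hypothetical and chosen only because it shares the key 'A' with d1.

```python
from functools import reduce

import pandas as pd

d1 = {'A': 1, 'C': 1}
d3 = {'A': 9, 'B': 2}   # hypothetical dict that shares key 'A' with d1

# combine_first only fills labels missing from the left-hand Series,
# so on a shared key the earlier dictionary wins.
combined = reduce(pd.Series.combine_first, [pd.Series(d) for d in (d1, d3)])

# dict.update does the opposite: the later dictionary wins.
updated = {}
for d in (d1, d3):
    updated.update(d)

print(combined['A'], updated['A'])
```

With overlapping keys the two answers therefore give different rows, so the order of the dictionaries and the choice of method both matter.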
