pandas

Question

I have a dataframe that looks like this:

         IDs
  Name
  John   1,4,8
  Eric   2,9,17
  Paul   41,72,78,100

I need to get every two-element combination from IDs and assign each one to a new row, so the output df should look like this:

        IDs
Name   
John    1,4
John    1,8
John    4,8
Eric    2,9
Eric    2,17
Eric    9,17
Paul    41,72
Paul    41,78
Paul    41,100
Paul    72,78
Paul    72,100
Paul    78,100

I tried a few approaches, but none of them came even close to what I need.

Answer 1

Let's use combinations from itertools together with pd.Series, stack, and reset_index:

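from itertools import combinations
import pandas as pd

# Build every 2-element combination per row, spread them across columns,
# then stack the columns into rows and drop the helper index level.
df.IDs.apply(lambda x: pd.Series(list(combinations(x.split(','), 2))))\
      .stack()\
      .reset_index(level=1, drop=True)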

Output:

Name
John       (1, 4)
John       (1, 8)
John       (4, 8)
Eric       (2, 9)
Eric      (2, 17)
Eric      (9, 17)
Paul     (41, 72)
Paul     (41, 78)
Paul    (41, 100)
Paul     (72, 78)
Paul    (72, 100)
Paul    (78, 100)
dtype: object
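
The result above holds tuples, whereas the question asks for comma-separated strings in an IDs column. One possible way to get that shape is sketched below. It is not part of the original answer: it assumes pandas 0.25 or later (for Series.explode), and the frame is rebuilt by hand from the question's sample data, with `pairs` and `result` as illustrative names.

```python
from itertools import combinations

import pandas as pd

# Assumed reconstruction of the question's frame:
# "Name" is the index, "IDs" holds comma-separated strings.
df = pd.DataFrame(
    {"IDs": ["1,4,8", "2,9,17", "41,72,78,100"]},
    index=pd.Index(["John", "Eric", "Paul"], name="Name"),
)

# For each row, split the string into IDs, build every 2-element
# combination, and join each pair back into an "a,b" string.
pairs = (
    df["IDs"]
    .str.split(",")
    .apply(lambda ids: [",".join(pair) for pair in combinations(ids, 2)])
    .explode()  # one row per pair; the Name index label is repeated
)

# Convert the Series back to a one-column DataFrame named "IDs".
result = pairs.to_frame()
print(result)
```

Because explode repeats the index label for each list element, the Name index lines up with the pairs without a separate reset_index step.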

pandas - split string and take each couple

python

combinatorics

itertools