How to display cross-tabs of all interactions?
I have a dataset that (in simplified form) looks like this:
import pandas as pd
df = pd.DataFrame({"target":[20,30,40], "x1":[1,0,1], "x2":[0,1,1], "x3":[0,0,1]})
I want to find the mean of target for all possible bivariate (x_i, x_j) interactions. So the output should look like this:
How would I do this in Pandas?
You can use pivot_table, and to add the combinations that do not exist in the data, reindex the columns by a MultiIndex created with from_product:
df = df.pivot_table(index='x1',columns=['x2','x3'], values='target')
mux = pd.MultiIndex.from_product(df.columns.levels, names=df.columns.names)
df = df.reindex(columns=mux)
print (df)
x2     0         1
x3     0   1     0     1
x1
0    NaN NaN  30.0   NaN
1   20.0 NaN   NaN  40.0
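To see why the reindex step matters, here is a small sketch that lists the column combinations pivot_table produces on its own and the ones that are absent. The names pivoted, full and missing are only illustrative, and the full grid is built by hand on the assumption that x2 and x3 each take just the values 0 and 1.

```python
import pandas as pd

df = pd.DataFrame({"target": [20, 30, 40], "x1": [1, 0, 1], "x2": [0, 1, 1], "x3": [0, 0, 1]})

# pivot_table keeps only the (x2, x3) combinations that occur in the data
pivoted = df.pivot_table(index="x1", columns=["x2", "x3"], values="target")
print(list(pivoted.columns))

# Full grid of combinations, assuming both columns are binary (0/1)
full = pd.MultiIndex.from_product([[0, 1], [0, 1]], names=["x2", "x3"])

# Combinations that never occur and would be absent without the reindex
missing = full.difference(pivoted.columns)
print(list(missing))
```

With the sample data, the only absent combination is x2 = 0 with x3 = 1, which is the all-NaN column that the reindex adds.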
If you want to replace the NaNs with 0:
df = df.pivot_table(index='x1',columns=['x2','x3'], values='target', fill_value=0)
mux = pd.MultiIndex.from_product(df.columns.levels, names=df.columns.names)
df = df.reindex(columns=mux, fill_value=0)
print (df)
x2   0      1
x3   0  1   0   1
x1
0    0  0  30   0
1   20  0   0  40
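The tables above put x1 on the rows and x2, x3 on the columns. If what is wanted is one small table per pair (x_i, x_j), the same idea can be applied pair by pair. This is only a sketch: the features list, the pair_means dictionary and the hard-coded 0/1 levels are assumptions for illustration, not part of the original answer.

```python
from itertools import combinations

import pandas as pd

df = pd.DataFrame({"target": [20, 30, 40], "x1": [1, 0, 1], "x2": [0, 1, 1], "x3": [0, 0, 1]})

features = ["x1", "x2", "x3"]  # assumed list of binary feature columns

pair_means = {}
for a, b in combinations(features, 2):
    # Mean of target for every observed combination of the two features
    table = df.pivot_table(index=a, columns=b, values="target", aggfunc="mean")
    # Make sure both levels (0 and 1) appear, even if a combination is unobserved
    table = table.reindex(index=pd.Index([0, 1], name=a),
                          columns=pd.Index([0, 1], name=b))
    pair_means[(a, b)] = table

for pair, table in pair_means.items():
    print(pair)
    print(table, end="\n\n")
```

Each entry of pair_means is a 2x2 DataFrame; unobserved combinations stay NaN, and calling .fillna(0) on a table would replace them with 0 if that is preferred.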