ValueError: Length of values does not match length of index when applying function to dataframe
I have a custom function that I need to apply to a DataFrame, but applying it raises the error above. The DataFrame looks like this:
The function is:
def f(x):
    G = nx.from_pandas_edgelist(x, 'restart_A', 'restart_B')
    l = x.apply(lambda n: ','.join(nx.node_connected_component(G, n['restart_A'])), axis=1)
    return l

df_2['subgroup_name'] = df_2.groupby('Group').apply(f).to_numpy()
What am I doing wrong? I also did a reset_index, which is not shown here.
df_2 = pd.DataFrame(
    {
        "Date": ['2020-07-01', '2020-07-01', '2020-07-01'],
        "restart_A": ['User-1013861701', 'User-1013861701', 'User-1013861701'],
        "restart_B": ['User-202955957', 'User-1744844911', 'User-5711961755'],
        "Group": ['G0', 'G0', 'G0']
    }
)
Can you try this?
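import pandas as pd
import networkx as nx

df_2 = pd.DataFrame(
    {
        "Date": ['2020-07-01', '2020-07-01', '2020-07-01'],
        "restart_A": ['User-1013861701', 'User-1013861701', 'User-1013861701'],
        "restart_B": ['User-202955957', 'User-1744844911', 'User-5711961755'],
        "Group": ['G0', 'G0', 'G0']
    }
)

def f(x):
    # Build one graph per group from its restart_A / restart_B edges.
    G = nx.from_pandas_edgelist(x, 'restart_A', 'restart_B')
    # For each row, join the members of the component that contains restart_A.
    l = x.apply(lambda n: ','.join(nx.node_connected_component(G, n['restart_A'])), axis=1)
    # Return the whole group with the new column instead of a bare Series.
    return x.assign(subgroup_name=l.to_numpy())

# group_keys=False keeps the original row index (0, 1, 2) in the result.
df_2 = df_2.groupby('Group', group_keys=False).apply(f)
print(df_2)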
Output:
   Date        restart_A        restart_B        Group  subgroup_name
0  2020-07-01  User-1013861701  User-202955957   G0     User-202955957,User-1013861701,User-5711961755...
1  2020-07-01  User-1013861701  User-1744844911  G0     User-202955957,User-1013861701,User-5711961755...
2  2020-07-01  User-1013861701  User-5711961755  G0     User-202955957,User-1013861701,User-5711961755...
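A possible reason for the original error, as far as I can tell: when every group returns a Series with an identical index (always true with a single group), groupby.apply seems to stack the results into a DataFrame with one row per group, so to_numpy() has one entry per group rather than one per row. This may vary between pandas versions. Below is a minimal sketch of an alternative that does not depend on that shape at all. The helper name component_labels and the extra G1 row are made up for illustration, and the members are sorted because the order of a set is arbitrary.

```python
import networkx as nx
import pandas as pd

df_2 = pd.DataFrame(
    {
        "Date": ["2020-07-01"] * 4,
        "restart_A": ["User-1013861701"] * 3 + ["User-1"],
        "restart_B": ["User-202955957", "User-1744844911", "User-5711961755", "User-2"],
        # The G1 row is hypothetical, added only to show more than one group.
        "Group": ["G0", "G0", "G0", "G1"],
    }
)


def component_labels(group):
    """Label each row with the sorted members of restart_A's connected component."""
    G = nx.from_pandas_edgelist(group, "restart_A", "restart_B")
    label = {}
    # Compute each component's label once, then share it among its nodes.
    for component in nx.connected_components(G):
        name = ",".join(sorted(component))
        for node in component:
            label[node] = name
    # The returned Series keeps the group's original row index.
    return group["restart_A"].map(label)


# Concatenate the per-group Series; column assignment then aligns on the
# row index, so the order and shape of the pieces do not matter.
labels = pd.concat([component_labels(g) for _, g in df_2.groupby("Group")])
df_2["subgroup_name"] = labels
print(df_2["subgroup_name"].tolist())
```

Because the assignment aligns on the index rather than on position, this should also work when the groups are interleaved in the frame, as long as the row index is unique.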