Loops using NumPy Vectorization
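I am trying to replicate the results of using loops and NumPy vectorization from the article found here (https://towardsdatascience.com/how-to-make-your-pandas-loop-71-803-times-faster-805030df4f06). The article doesn't include the data or the results of running the code, but I was able to find the data online. I would like to replicate the results for my own work, but the output is not correct.
I have included a small part of the original dataframe and the corresponding code from the article: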
import pandas as pd

data = {
    'HomeTeam': ['Burnley', 'Crystal Palace', 'Everton', 'Hull', 'Man City', 'Middlesbrough', 'Southampton',
                 'Arsenal', 'Bournemouth', 'Chelsea', 'Man United', 'Burnley', 'Leicester', 'Stoke'],
    'AwayTeam': ['Swansea', 'West Brom', 'Tottenham', 'Leicester', 'Sunderland', 'Stoke', 'Watford', 'Liverpool', 'Man United',
                 'West Ham', 'Southampton', 'Liverpool', 'Arsenal', 'Man City'],
    'FTR': ['A', 'A', 'D', 'H', 'H', 'D', 'D', 'A', 'A', 'H', 'H', 'H', 'D', 'A'],
}
leaguedf = pd.DataFrame(data)

def soc_iter(TEAM, home, away, ftr):
    leaguedf['Draws'] = 'No_Game'
    leaguedf.loc[((home == TEAM) & (ftr == 'D')) | ((away == TEAM) & (ftr == 'D')), 'Draws'] = 'Draw'
    leaguedf.loc[((home == TEAM) & (ftr != 'D')) | ((away == TEAM) & (ftr != 'D')), 'Draws'] = 'No_Draw'

leaguedf['Draws'] = soc_iter('Arsenal', leaguedf['HomeTeam'].values, leaguedf['AwayTeam'].values, leaguedf['FTR'].values)
leaguedf
When I run the code, the 'Draws' column contains only None, rather than 'Draw' or 'No_Draw'.
What is wrong with the code?
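Your function doesn't return anything, so it implicitly returns None, and assigning that result to leaguedf['Draws'] overwrites the whole column with None. You don't need to assign anything: the code inside the function already creates the 'Draws' column: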
import pandas as pd

data = {
    'HomeTeam': ['Burnley', 'Crystal Palace', 'Everton', 'Hull', 'Man City', 'Middlesbrough', 'Southampton',
                 'Arsenal', 'Bournemouth', 'Chelsea', 'Man United', 'Burnley', 'Leicester', 'Stoke'],
    'AwayTeam': ['Swansea', 'West Brom', 'Tottenham', 'Leicester', 'Sunderland', 'Stoke', 'Watford', 'Liverpool', 'Man United',
                 'West Ham', 'Southampton', 'Liverpool', 'Arsenal', 'Man City'],
    'FTR': ['A', 'A', 'D', 'H', 'H', 'D', 'D', 'A', 'A', 'H', 'H', 'H', 'D', 'A'],
}
leaguedf = pd.DataFrame(data)

def soc_iter(TEAM, home, away, ftr):
    # Default label: TEAM did not play in this match.
    leaguedf['Draws'] = 'No_Game'
    # TEAM played (home or away) and the full-time result was a draw.
    leaguedf.loc[((home == TEAM) & (ftr == 'D')) | ((away == TEAM) & (ftr == 'D')), 'Draws'] = 'Draw'
    # TEAM played (home or away) and the match was not a draw.
    leaguedf.loc[((home == TEAM) & (ftr != 'D')) | ((away == TEAM) & (ftr != 'D')), 'Draws'] = 'No_Draw'

# Call the function for its side effect only; do not assign its (None) return value.
soc_iter('Arsenal', leaguedf['HomeTeam'].values, leaguedf['AwayTeam'].values, leaguedf['FTR'].values)
leaguedf
          HomeTeam     AwayTeam  FTR    Draws
0          Burnley      Swansea    A  No_Game
1   Crystal Palace    West Brom    A  No_Game
2          Everton    Tottenham    D  No_Game
3             Hull    Leicester    H  No_Game
4         Man City   Sunderland    H  No_Game
5    Middlesbrough        Stoke    D  No_Game
6      Southampton      Watford    D  No_Game
7          Arsenal    Liverpool    A  No_Draw
8      Bournemouth   Man United    A  No_Game
9          Chelsea     West Ham    H  No_Game
10      Man United  Southampton    H  No_Game
11         Burnley    Liverpool    H  No_Game
12       Leicester      Arsenal    D     Draw
13           Stoke     Man City    A  No_Game
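If you would rather keep the assignment form from the question (leaguedf['Draws'] = ...), another option is to have the function return the labels instead of modifying leaguedf in place. The following is only a minimal sketch of that idea using np.select; the function name draw_labels is hypothetical and does not come from the article.

```python
import numpy as np
import pandas as pd

data = {
    'HomeTeam': ['Burnley', 'Crystal Palace', 'Everton', 'Hull', 'Man City', 'Middlesbrough', 'Southampton',
                 'Arsenal', 'Bournemouth', 'Chelsea', 'Man United', 'Burnley', 'Leicester', 'Stoke'],
    'AwayTeam': ['Swansea', 'West Brom', 'Tottenham', 'Leicester', 'Sunderland', 'Stoke', 'Watford', 'Liverpool', 'Man United',
                 'West Ham', 'Southampton', 'Liverpool', 'Arsenal', 'Man City'],
    'FTR': ['A', 'A', 'D', 'H', 'H', 'D', 'D', 'A', 'A', 'H', 'H', 'H', 'D', 'A'],
}
leaguedf = pd.DataFrame(data)

def draw_labels(team, home, away, ftr):
    """Return one label per match; hypothetical helper, not from the article."""
    played = (home == team) | (away == team)  # team appears as home or away side
    drawn = ftr == 'D'                        # full-time result is a draw
    # The first matching condition wins; rows matching neither get the default.
    return np.select([played & drawn, played & ~drawn],
                     ['Draw', 'No_Draw'],
                     default='No_Game')

# Because the function returns an array, assigning its result is now meaningful.
leaguedf['Draws'] = draw_labels('Arsenal',
                                leaguedf['HomeTeam'].to_numpy(),
                                leaguedf['AwayTeam'].to_numpy(),
                                leaguedf['FTR'].to_numpy())
```

Since this version does not touch the global leaguedf inside the function, it can be reused on other dataframes with the same three columns.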