在 pandas 数据框上使用 apply(),将其他数据框列作为输入
Using apply() on pandas dataframe, with other dataframe columns as inputs
我有一个足球比赛结果的数据框,我正试图在数据框的末尾创建一个新列,显示哪支球队获胜。我正在尝试使用 df.apply
来执行此操作。这是我目前所拥有的:
def match_winner(winner,home_team,away_team,home_goals,away_goals):
if home_goals>away_goals:
winner = home_team
elif home_goals<away_goals:
winner = away_team
else:
winner = "None"
match['Winning Team'] =""
match['Winning Team'].apply(match_winner,args=[match['Home Team'],match['Away Team'],match['home_team_goal'],match['away_team_goal']])
这里是 match
数据框的结构:
<bound method DataFrame.info of id country_id league_id season stage date \
145 146 1 1 2008/2009 24 2009-02-27 00:00:00
153 154 1 1 2008/2009 25 2009-03-08 00:00:00
155 156 1 1 2008/2009 25 2009-03-07 00:00:00
162 163 1 1 2008/2009 26 2009-03-13 00:00:00
168 169 1 1 2008/2009 26 2009-03-14 00:00:00
... ... ... ... ... ... ...
25972 25973 24558 24558 2015/2016 8 2015-09-13 00:00:00
25974 25975 24558 24558 2015/2016 9 2015-09-22 00:00:00
25975 25976 24558 24558 2015/2016 9 2015-09-23 00:00:00
25976 25977 24558 24558 2015/2016 9 2015-09-23 00:00:00
25978 25979 24558 24558 2015/2016 9 2015-09-23 00:00:00
match_api_id home_team_api_id away_team_api_id home_team_goal \
145 493017 8203 9987 2
153 493025 9984 8342 1
155 493027 8635 10000 2
162 493034 8203 8635 2
168 493040 10000 9999 0
... ... ... ... ...
25972 1992089 10243 10191 3
25974 1992091 10190 10191 1
25975 1992092 9824 10199 1
25976 1992093 9956 10179 2
25978 1992095 10192 9931 4
away_team_goal goal shoton shotoff foulcommit card cross corner \
145 1 None None None None None None None
153 3 None None None None None None None
155 0 None None None None None None None
162 1 None None None None None None None
168 0 None None None None None None None
... ... ... ... ... ... ... ... ...
25972 3 None None None None None None None
25974 0 None None None None None None None
25975 2 None None None None None None None
25976 0 None None None None None None None
25978 3 None None None None None None None
possession BSA Home Team Away Team \
145 None 2.25 KV Mechelen KRC Genk
153 None 2.38 KSV Cercle Brugge Club Brugge KV
155 None 7.00 RSC Anderlecht SV Zulte-Waregem
162 None 1.75 KV Mechelen RSC Anderlecht
168 None 4.33 SV Zulte-Waregem KSV Roeselare
... ... ... ... ...
25972 None NaN FC Zürich FC Thun
25974 None NaN FC St. Gallen FC Thun
25975 None NaN FC Vaduz FC Luzern
25976 None NaN Grasshopper Club Zürich FC Sion
25978 None NaN BSC Young Boys FC Basel
League Country home_player_1 home_player_2 \
145 Belgium Jupiler League Belgium False False
153 Belgium Jupiler League Belgium False False
155 Belgium Jupiler League Belgium False False
162 Belgium Jupiler League Belgium False False
168 Belgium Jupiler League Belgium False False
... ... ... ... ...
25972 Switzerland Super League Switzerland False False
25974 Switzerland Super League Switzerland False False
25975 Switzerland Super League Switzerland False False
25976 Switzerland Super League Switzerland False False
25978 Switzerland Super League Switzerland False False
home_player_3 home_player_4 home_player_5 home_player_6 \
145 False False False False
153 False False False False
155 False False False False
162 False False False False
168 False False False False
... ... ... ... ...
25972 False False False False
25974 False False False False
25975 False False False False
25976 False False False False
25978 False False False False
home_player_7 home_player_8 home_player_9 home_player_10 \
145 False False False False
153 False False False False
155 False False False False
162 False False False False
168 False False False False
... ... ... ... ...
25972 False False False False
25974 False False False False
25975 False False False False
25976 False False False False
25978 False False False False
home_player_11 away_player_1 away_player_2 away_player_3 \
145 False False False False
153 False False False False
155 False False False False
162 False False False False
168 False False False False
... ... ... ... ...
25972 False False False False
25974 False False False False
25975 False False False False
25976 False False False False
25978 False False False False
away_player_4 away_player_5 away_player_6 away_player_7 \
145 False False False False
153 False False False False
155 False False False False
162 False False False False
168 False False False False
... ... ... ... ...
25972 False False False False
25974 False False False False
25975 False False False False
25976 False False False False
25978 False False False False
away_player_8 away_player_9 away_player_10 away_player_11 \
145 False False False False
153 False False False False
155 False False False False
162 False False False False
168 False False False False
... ... ... ... ...
25972 False False False False
25974 False False False False
25975 False False False False
25976 False False False False
25978 False False False False
Winning Team
145
153
155
162
168
... ...
25972
25974
25975
25976
25978
[21374 rows x 47 columns]>
我需要发送多列作为输入论据,以便找出谁赢了,但我不确定 .apply()
是否允许您将系列作为论据传递?是否可以使用 .apply()
来做到这一点。如果是这样,解决方案会有所帮助,但如果有人知道更好的替代方案,那也会很有用。
避免 DataFrame.apply
(通常 运行 作为 hidden loop) and instead consider nested conditional logic with numpy.where
列:
match['Winning Team'] = np.where(match['home_team_goal'] > match['away_team_goal'],
match['Home Team'],
np.where(match['home_team_goal'] < match['away_team_goal'],
match['Away Team'],
np.nan
)
)
使用 numpy.select
代替 apply
,这在这里不是个好主意(引擎盖下的循环):
match = pd.DataFrame({
'Home Team':list('abcdef'),
'Away Team':list('ghijkl'),
'home_team_goal':[14,5,4,5,5,4],
'away_team_goal':[7,8,2,4,8,4],
})
m1 = match.home_team_goal>match.away_team_goal
m2 = match.home_team_goal<match.away_team_goal
match['winner'] = np.select([m1, m2], [match['Home Team'], match['Away Team']], default=None)
print (match)
Home Team Away Team home_team_goal away_team_goal winner
0 a g 14 7 a
1 b h 5 8 h
2 c i 4 2 c
3 d j 5 4 d
4 e k 5 8 k
5 f l 4 4 None
这是另一种方式:
与团队一起创建数据框:
results_di = {
"Home Team": [
"KV Mechelen",
"KSV Cercle Brugge",
"RSC Anderlecht",
"KV Mechelen",
"SV Zulte-Waregem",
"FC Zürich",
"FC St. Gallen",
"FC Vaduz",
"Grasshopper Club Zürich",
"BSC Young Boys",
],
"Away Team": [
"KRC Genk",
"Club Brugge KV",
"SV Zulte-Waregem",
"RSC Anderlecht",
"KSV Roeselare",
"FC Thun",
"FC Thun",
"FC Luzern",
"FC Sion",
"FC Basel",
],
"home_team_goal": [2, 1, 2, 2, 0, 3, 1, 1, 2, 4],
"away_team_goal": [1, 3, 0, 1, 0, 3, 0, 2, 0, 3],
}
df = pd.DataFrame(results_di)
df['winner'] = 'none'
home_win = df['home_team_goal'] - df["away_team_goal"] > 0
away_win = df["away_team_goal"] - df['home_team_goal'] > 0
df.loc[home_win, "winner"] = df.loc[home_win, 'Home Team']
df.loc[away_win, "winner"] = df.loc[away_win, 'Away Team']
打印(df)
Home Team Away Team home_team_goal away_team_goal \
0 KV Mechelen KRC Genk 2 1
1 KSV Cercle Brugge Club Brugge KV 1 3
2 RSC Anderlecht SV Zulte-Waregem 2 0
3 KV Mechelen RSC Anderlecht 2 1
4 SV Zulte-Waregem KSV Roeselare 0 0
5 FC Zürich FC Thun 3 3
6 FC St. Gallen FC Thun 1 0
7 FC Vaduz FC Luzern 1 2
8 Grasshopper Club Zürich FC Sion 2 0
9 BSC Young Boys FC Basel 4 3
winner
0 KV Mechelen
1 Club Brugge KV
2 RSC Anderlecht
3 KV Mechelen
4 none
5 none
6 FC St. Gallen
7 FC Luzern
8 Grasshopper Club Zürich
9 BSC Young Boys
我有一个足球比赛结果的数据框,我正试图在数据框的末尾创建一个新列,显示哪支球队获胜。我正在尝试使用 df.apply
来执行此操作。这是我目前所拥有的:
def match_winner(winner,home_team,away_team,home_goals,away_goals):
if home_goals>away_goals:
winner = home_team
elif home_goals<away_goals:
winner = away_team
else:
winner = "None"
match['Winning Team'] =""
match['Winning Team'].apply(match_winner,args=[match['Home Team'],match['Away Team'],match['home_team_goal'],match['away_team_goal']])
这里是 match
数据框的结构:
<bound method DataFrame.info of id country_id league_id season stage date \
145 146 1 1 2008/2009 24 2009-02-27 00:00:00
153 154 1 1 2008/2009 25 2009-03-08 00:00:00
155 156 1 1 2008/2009 25 2009-03-07 00:00:00
162 163 1 1 2008/2009 26 2009-03-13 00:00:00
168 169 1 1 2008/2009 26 2009-03-14 00:00:00
... ... ... ... ... ... ...
25972 25973 24558 24558 2015/2016 8 2015-09-13 00:00:00
25974 25975 24558 24558 2015/2016 9 2015-09-22 00:00:00
25975 25976 24558 24558 2015/2016 9 2015-09-23 00:00:00
25976 25977 24558 24558 2015/2016 9 2015-09-23 00:00:00
25978 25979 24558 24558 2015/2016 9 2015-09-23 00:00:00
match_api_id home_team_api_id away_team_api_id home_team_goal \
145 493017 8203 9987 2
153 493025 9984 8342 1
155 493027 8635 10000 2
162 493034 8203 8635 2
168 493040 10000 9999 0
... ... ... ... ...
25972 1992089 10243 10191 3
25974 1992091 10190 10191 1
25975 1992092 9824 10199 1
25976 1992093 9956 10179 2
25978 1992095 10192 9931 4
away_team_goal goal shoton shotoff foulcommit card cross corner \
145 1 None None None None None None None
153 3 None None None None None None None
155 0 None None None None None None None
162 1 None None None None None None None
168 0 None None None None None None None
... ... ... ... ... ... ... ... ...
25972 3 None None None None None None None
25974 0 None None None None None None None
25975 2 None None None None None None None
25976 0 None None None None None None None
25978 3 None None None None None None None
possession BSA Home Team Away Team \
145 None 2.25 KV Mechelen KRC Genk
153 None 2.38 KSV Cercle Brugge Club Brugge KV
155 None 7.00 RSC Anderlecht SV Zulte-Waregem
162 None 1.75 KV Mechelen RSC Anderlecht
168 None 4.33 SV Zulte-Waregem KSV Roeselare
... ... ... ... ...
25972 None NaN FC Zürich FC Thun
25974 None NaN FC St. Gallen FC Thun
25975 None NaN FC Vaduz FC Luzern
25976 None NaN Grasshopper Club Zürich FC Sion
25978 None NaN BSC Young Boys FC Basel
League Country home_player_1 home_player_2 \
145 Belgium Jupiler League Belgium False False
153 Belgium Jupiler League Belgium False False
155 Belgium Jupiler League Belgium False False
162 Belgium Jupiler League Belgium False False
168 Belgium Jupiler League Belgium False False
... ... ... ... ...
25972 Switzerland Super League Switzerland False False
25974 Switzerland Super League Switzerland False False
25975 Switzerland Super League Switzerland False False
25976 Switzerland Super League Switzerland False False
25978 Switzerland Super League Switzerland False False
home_player_3 home_player_4 home_player_5 home_player_6 \
145 False False False False
153 False False False False
155 False False False False
162 False False False False
168 False False False False
... ... ... ... ...
25972 False False False False
25974 False False False False
25975 False False False False
25976 False False False False
25978 False False False False
home_player_7 home_player_8 home_player_9 home_player_10 \
145 False False False False
153 False False False False
155 False False False False
162 False False False False
168 False False False False
... ... ... ... ...
25972 False False False False
25974 False False False False
25975 False False False False
25976 False False False False
25978 False False False False
home_player_11 away_player_1 away_player_2 away_player_3 \
145 False False False False
153 False False False False
155 False False False False
162 False False False False
168 False False False False
... ... ... ... ...
25972 False False False False
25974 False False False False
25975 False False False False
25976 False False False False
25978 False False False False
away_player_4 away_player_5 away_player_6 away_player_7 \
145 False False False False
153 False False False False
155 False False False False
162 False False False False
168 False False False False
... ... ... ... ...
25972 False False False False
25974 False False False False
25975 False False False False
25976 False False False False
25978 False False False False
away_player_8 away_player_9 away_player_10 away_player_11 \
145 False False False False
153 False False False False
155 False False False False
162 False False False False
168 False False False False
... ... ... ... ...
25972 False False False False
25974 False False False False
25975 False False False False
25976 False False False False
25978 False False False False
Winning Team
145
153
155
162
168
... ...
25972
25974
25975
25976
25978
[21374 rows x 47 columns]>
我需要发送多列作为输入论据,以便找出谁赢了,但我不确定 .apply()
是否允许您将系列作为论据传递?是否可以使用 .apply()
来做到这一点。如果是这样,解决方案会有所帮助,但如果有人知道更好的替代方案,那也会很有用。
避免 DataFrame.apply
(通常 运行 作为 hidden loop) and instead consider nested conditional logic with numpy.where
列:
match['Winning Team'] = np.where(match['home_team_goal'] > match['away_team_goal'],
match['Home Team'],
np.where(match['home_team_goal'] < match['away_team_goal'],
match['Away Team'],
np.nan
)
)
使用 numpy.select
代替 apply
,这在这里不是个好主意(引擎盖下的循环):
match = pd.DataFrame({
'Home Team':list('abcdef'),
'Away Team':list('ghijkl'),
'home_team_goal':[14,5,4,5,5,4],
'away_team_goal':[7,8,2,4,8,4],
})
m1 = match.home_team_goal>match.away_team_goal
m2 = match.home_team_goal<match.away_team_goal
match['winner'] = np.select([m1, m2], [match['Home Team'], match['Away Team']], default=None)
print (match)
Home Team Away Team home_team_goal away_team_goal winner
0 a g 14 7 a
1 b h 5 8 h
2 c i 4 2 c
3 d j 5 4 d
4 e k 5 8 k
5 f l 4 4 None
这是另一种方式:
与团队一起创建数据框:
results_di = {
"Home Team": [
"KV Mechelen",
"KSV Cercle Brugge",
"RSC Anderlecht",
"KV Mechelen",
"SV Zulte-Waregem",
"FC Zürich",
"FC St. Gallen",
"FC Vaduz",
"Grasshopper Club Zürich",
"BSC Young Boys",
],
"Away Team": [
"KRC Genk",
"Club Brugge KV",
"SV Zulte-Waregem",
"RSC Anderlecht",
"KSV Roeselare",
"FC Thun",
"FC Thun",
"FC Luzern",
"FC Sion",
"FC Basel",
],
"home_team_goal": [2, 1, 2, 2, 0, 3, 1, 1, 2, 4],
"away_team_goal": [1, 3, 0, 1, 0, 3, 0, 2, 0, 3],
}
df = pd.DataFrame(results_di)
df['winner'] = 'none'
home_win = df['home_team_goal'] - df["away_team_goal"] > 0
away_win = df["away_team_goal"] - df['home_team_goal'] > 0
df.loc[home_win, "winner"] = df.loc[home_win, 'Home Team']
df.loc[away_win, "winner"] = df.loc[away_win, 'Away Team']
打印(df)
Home Team Away Team home_team_goal away_team_goal \
0 KV Mechelen KRC Genk 2 1
1 KSV Cercle Brugge Club Brugge KV 1 3
2 RSC Anderlecht SV Zulte-Waregem 2 0
3 KV Mechelen RSC Anderlecht 2 1
4 SV Zulte-Waregem KSV Roeselare 0 0
5 FC Zürich FC Thun 3 3
6 FC St. Gallen FC Thun 1 0
7 FC Vaduz FC Luzern 1 2
8 Grasshopper Club Zürich FC Sion 2 0
9 BSC Young Boys FC Basel 4 3
winner
0 KV Mechelen
1 Club Brugge KV
2 RSC Anderlecht
3 KV Mechelen
4 none
5 none
6 FC St. Gallen
7 FC Luzern
8 Grasshopper Club Zürich
9 BSC Young Boys