在 pandas 中解决 SettingWithCopyWarning
solve SettingWithCopyWarning in pandas
这是我遇到的问题:
/var/folders/v0/57ps6v293zx6jb2g78v0__1h0000gn/T/ipykernel_58173/392784622.py:4: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
这是我的代码
AMZN['VWDR'] = AMZN['Volume'] * AMZN['DailyReturn']/ AMZN['Volume'].cumsum()
我也尝试了以下方法,但没有解决警告:
AMZN.loc[AMZN.index,'VWDR'] = AMZN.loc[AMZN.index, 'Volume'] * AMZN.loc[AMZN.index, 'DailyReturn']/ AMZN.loc[AMZN.index,'Volume'].cumsum()
下面是获取我的table的代码:
import pandas as pd
import yfinance as yf
# now just read the html to get all the S&P500 tickers
dataload=pd.read_html('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
df = dataload[0]
# now get the first column(tickers) from the above data
# convert it into a list
ticker_list = df['Symbol'].values.tolist()
# convert the list into a string, separated by space, and replace . with -
all_tickers = " ".join(ticker_list).replace('.', '-') # this is to ensure that we could find BRK.B and BF.B
# get all the tickers from yfinance
tickers = yf.Tickers(all_tickers)
# set a start and end date to get two-years info
# group by the ticker
hist = tickers.history(start='2020-05-01', end='2022-05-01', group_by='ticker')
# ‘Stack’ the table to get it into row form
Data_stack = pd.DataFrame(hist.stack(level=0).reset_index().rename(columns = {'level_1':'Ticker'}))
# Add a column to the original table containing the daily return per ticker
Data_stack['DailyReturn'] = Data_stack.sort_values(['Ticker', 'Date']).groupby('Ticker')['Close'].pct_change()
Data_stack = Data_stack.set_index('Date') # now set the Date as the index
# get the AMZN data by sort the original table on Ticker
AMZN = Data_stack[Data_stack.Ticker=='AMZN']
为简单起见,您可以从 yfinance
下载 AMZN 代码 table
创建时复制AMZN
:
AMZN = Data_stack[Data_stack.Ticker=='AMZN'].copy()
# ^^^^^^^
那么您的其余代码将不会有警告。
您正在使用的是链式分配,要解决此问题,您需要复制并省略 select 列的 loc。
AMZN = Data_stack[Data_stack.Ticker=='AMZN'].copy()
AMZN['VWDR'] = AMZN['Volume'] * AMZN['DailyReturn']/ AMZN['Volume'].cumsum()
这是我遇到的问题:
/var/folders/v0/57ps6v293zx6jb2g78v0__1h0000gn/T/ipykernel_58173/392784622.py:4: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
这是我的代码
AMZN['VWDR'] = AMZN['Volume'] * AMZN['DailyReturn']/ AMZN['Volume'].cumsum()
我也尝试了以下方法,但没有解决警告:
AMZN.loc[AMZN.index,'VWDR'] = AMZN.loc[AMZN.index, 'Volume'] * AMZN.loc[AMZN.index, 'DailyReturn']/ AMZN.loc[AMZN.index,'Volume'].cumsum()
下面是获取我的table的代码:
import pandas as pd
import yfinance as yf
# now just read the html to get all the S&P500 tickers
dataload=pd.read_html('https://en.wikipedia.org/wiki/List_of_S%26P_500_companies')
df = dataload[0]
# now get the first column(tickers) from the above data
# convert it into a list
ticker_list = df['Symbol'].values.tolist()
# convert the list into a string, separated by space, and replace . with -
all_tickers = " ".join(ticker_list).replace('.', '-') # this is to ensure that we could find BRK.B and BF.B
# get all the tickers from yfinance
tickers = yf.Tickers(all_tickers)
# set a start and end date to get two-years info
# group by the ticker
hist = tickers.history(start='2020-05-01', end='2022-05-01', group_by='ticker')
# ‘Stack’ the table to get it into row form
Data_stack = pd.DataFrame(hist.stack(level=0).reset_index().rename(columns = {'level_1':'Ticker'}))
# Add a column to the original table containing the daily return per ticker
Data_stack['DailyReturn'] = Data_stack.sort_values(['Ticker', 'Date']).groupby('Ticker')['Close'].pct_change()
Data_stack = Data_stack.set_index('Date') # now set the Date as the index
# get the AMZN data by sort the original table on Ticker
AMZN = Data_stack[Data_stack.Ticker=='AMZN']
为简单起见,您可以从 yfinance
下载 AMZN 代码 table创建时复制AMZN
:
AMZN = Data_stack[Data_stack.Ticker=='AMZN'].copy()
# ^^^^^^^
那么您的其余代码将不会有警告。
您正在使用的是链式分配,要解决此问题,您需要复制并省略 select 列的 loc。
AMZN = Data_stack[Data_stack.Ticker=='AMZN'].copy()
AMZN['VWDR'] = AMZN['Volume'] * AMZN['DailyReturn']/ AMZN['Volume'].cumsum()