Is it possible to apply a function to one column when reading a data file?
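When building a data frame from a file, is there a Pythonic way to apply a Series operation (built-in or custom) directly, as part of the same expression?
I would like to change the following: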
# import data frame containing a custom timestamp column (ex: _2019_11_19_15_10_35_)
df1 = pd.read_csv('mydatafile.csv').assign(newcol='newval')
df1['Timestamp'] = pd.to_datetime(df1['Timestamp'], format='_%Y_%m_%d_%H_%M_%S_')
into something like:
df1 = pd.read_csv('mydatafile.csv').assign(newcol='newval').to_datetime(df1['Timestamp'], format='_%Y_%m_%d_%H_%M_%S_')
I also tried:
df1 = pd.read_csv('mydatafile.csv').assign(newcol='newval').apply(lambda x: pd.to_datetime(df1['Timestamp'], format='_%Y_%m_%d_%H_%M_%S_') if x.name=='Timestamp' else x)
Well, you can assign a new Timestamp column, which replaces the previous one:
df1 = pd.read_csv('mydatafile.csv').assign(
    newcol='newval',
    Timestamp=lambda df: pd.to_datetime(df['Timestamp'], format='_%Y_%m_%d_%H_%M_%S_'))
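For anyone who wants to try this without the original file, here is a minimal self-contained sketch. The sample rows, the `value` column, and the `FMT` constant are made up for illustration; only the timestamp format and the `newcol` assignment come from the question. It also shows one possible alternative, the `converters` argument of `read_csv`, which applies a function to each raw string of a column while the file is parsed. The dtype of a converted column may differ from the `assign` version, so it is worth checking `df2.dtypes` before relying on it.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for 'mydatafile.csv'
csv_text = (
    "Timestamp,value\n"
    "_2019_11_19_15_10_35_,1\n"
    "_2019_11_19_15_10_36_,2\n"
)
FMT = "_%Y_%m_%d_%H_%M_%S_"

# Option 1: parse inside the method chain. assign() accepts a callable,
# which receives the data frame produced by read_csv.
df1 = pd.read_csv(io.StringIO(csv_text)).assign(
    newcol="newval",
    Timestamp=lambda df: pd.to_datetime(df["Timestamp"], format=FMT),
)

# Option 2: convert each raw string of the column during parsing.
df2 = pd.read_csv(
    io.StringIO(csv_text),
    converters={"Timestamp": lambda s: pd.to_datetime(s, format=FMT)},
).assign(newcol="newval")

print(df1.dtypes)
print(df2.dtypes)
```

Option 1 converts the whole column in one vectorized call, so it is probably the better default for large files; option 2 runs the function once per value.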