Apply the trend from seasonal_decompose to every column of a dask DataFrame (Python)
As the title says, I can't get this code to run:
def simple_map(x):
    y = seasonal_decompose(x, model='additive', extrapolate_trend='freq', period=7, two_sided=False)
    return y.trend
b.map_partitions(simple_map,meta=b).compute()
Here b is a dask DataFrame with a datetime index and several float series as columns, and seasonal_decompose is the one from statsmodels.
This is what I get:
Index(...) must be called with a collection of some kind, 'seasonal' was passed
If I do this instead:
b.apply(simple_map,axis=0)
where b is a pandas DataFrame, I get what I want.
What am I doing wrong?
Reproducible example:
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

d = {'Val1': [3, 2, 7, 5], 'Val2': [2, 4, 8, 6]}
b = pd.DataFrame(data=d)
# The dates are written day-first, so say so explicitly when parsing.
b = b.set_index(pd.to_datetime(['25/12/1991', '26/12/1991', '27/12/1991', '28/12/1991'], dayfirst=True))

def simple_map(x):
    y = seasonal_decompose(x, model='additive', extrapolate_trend='freq', period=2, two_sided=False)
    return y.trend

# With axis=0, apply calls simple_map once per column.
b.apply(simple_map, axis=0)
            Val1  Val2
1991-12-25  0.70   0.9
1991-12-26  2.10   2.7
1991-12-27  3.50   4.5
1991-12-28  5.25   6.5
This is what I want to do with dask, but I can't get it to work.
In fact:
import dask.dataframe as dd
c=dd.from_pandas(b, npartitions=1)
c.map_partitions(simple_map,meta=c).compute()
produces the error shown above.
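Thanks for the example!
From the docstring of apply:
Objects passed to the function are Series objects whose index is either the DataFrame's index (axis=0)
map_partitions, however, passes each partition to the function as a whole pandas DataFrame, not one column at a time. I suggest rewriting the function slightly: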
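# make_meta was not imported in the original snippet; this import path is an assumption.
from dask.dataframe.utils import make_meta

def simple_map_2(x):
    # x is a whole pandas DataFrame (one partition), so decompose each column separately.
    xVal1 = seasonal_decompose(x.Val1, model='additive', extrapolate_trend='freq', period=2, two_sided=False)
    xVal2 = seasonal_decompose(x.Val2, model='additive', extrapolate_trend='freq', period=2, two_sided=False)
    return pd.DataFrame({'Val1': xVal1.trend, 'Val2': xVal2.trend})

c.map_partitions(simple_map_2, meta=make_meta(c)).compute()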
            Val1  Val2
1991-12-25  0.70   0.9
1991-12-26  2.10   2.7
1991-12-27  3.50   4.5
1991-12-28  5.25   6.5