Slice the index of a Dask dataframe
Is there a simple way to slice a Dask dataframe index, like this in Pandas?
index_element = df.index[-1]
What are you after?
Doing .index[i] on a Dask dataframe gives:
import dask.dataframe as dd
df = dd.demo.make_timeseries(
    start="2000-01-01",
    end="2000-01-03",
    dtypes={"id": int, "z": int},
    freq="1h",
    partition_freq="24h",
)
df.index[-1]
---------------------------------------------------------------------------
NotImplementedError Traceback (most recent call last)
<ipython-input-7-d70d3c1197c1> in <module>
----> 1 df.index[-1]
~/miniconda/envs/main/lib/python3.8/site-packages/dask/dataframe/core.py in __getitem__(self, key)
3172 graph = HighLevelGraph.from_collections(name, dsk, dependencies=[self, key])
3173 return Series(graph, name, self._meta, self.divisions)
-> 3174 raise NotImplementedError(
3175 "Series getitem in only supported for other series objects "
3176 "with matching partition structure"
NotImplementedError: Series getitem in only supported for other series objects with matching partition structure
If you are after the index of the last row, you can do:
df.tail(1).index
which gives
DatetimeIndex(['2000-01-02 23:00:00'], dtype='datetime64[ns]', name='timestamp', freq='H')