How to avoid the index being changed to Timestamp when creating a generator from a pandas DataFrame

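I have a pandas DataFrame whose index is datetime64[ns]. When I turn it into a generator, my datetimes are changed to Timestamps. How can I avoid this?

Here is the code. (I also tried itertuples instead of iterrows, but got the same result.)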

>>> test.index
DatetimeIndex(['1990-01-02', '1990-01-03', '1990-01-04', '1990-01-05',
               '1990-01-08', '1990-01-09', '1990-01-10', '1990-01-11',
               '1990-01-12', '1990-01-15', 
               ...
               '2015-11-05', '2015-11-06', '2015-11-09', '2015-11-10',
               '2015-11-11', '2015-11-12', '2015-11-13', '2015-11-16',
               '2015-11-17', '2015-11-18'],
              dtype='datetime64[ns]', name=u'datetime', length=6524, freq=None, tz=None)
>>> test.iterrows()
<generator object iterrows at 0x104314640>
>>> test.iterrows().next()
(Timestamp('1990-01-02 00:00:00'), open               35.249999
high               37.500000
low                35.000000
close              37.250001
volume       45799600.000000
adj_close           1.157262
Name: 1990-01-02 00:00:00, dtype: float64)

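This happens because your index has a datetime dtype: pandas returns each element of a DatetimeIndex as a Timestamp. To convert the index to plain date objects, you can do the following: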

df.index = df.index.date
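
To make the cause concrete, here is a minimal sketch (written for Python 3, with a made-up two-row frame and illustrative variable names that are not from the question). It checks that the label yielded by iterrows is a pandas Timestamp, that Timestamp is itself a subclass of datetime.datetime, and that a single label can be converted with `.date()` or `.to_pydatetime()`:

```python
import datetime as dt

import pandas as pd

# Hypothetical two-row frame standing in for the asker's data.
df = pd.DataFrame(
    {"close": [37.25, 37.5]},
    index=pd.to_datetime(["1990-01-02", "1990-01-03"]),
)

# iterrows() yields (index label, row) pairs; take the first one.
label, row = next(df.iterrows())

is_timestamp = isinstance(label, pd.Timestamp)   # label is a pandas Timestamp
is_datetime = isinstance(label, dt.datetime)     # Timestamp subclasses datetime.datetime

as_date = label.date()               # plain datetime.date
as_datetime = label.to_pydatetime()  # plain datetime.datetime
```

Because Timestamp is a datetime subclass, code that only needs datetime behaviour may work unchanged; the conversion matters mainly when the exact type is important.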

Example:


In [68]:
dates  = ['1990-01-02', '1990-01-03', '1990-01-04', '1990-01-05',
               '1990-01-08', '1990-01-09', '1990-01-10', '1990-01-11',
               '1990-01-12', '1990-01-15',
               '2015-11-05', '2015-11-06', '2015-11-09', '2015-11-10',
               '2015-11-11', '2015-11-12', '2015-11-13', '2015-11-16',
               '2015-11-17', '2015-11-18']

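In [90]:
import datetime as dt
import numpy as np
import pandas as pd

# Build a one-column frame indexed by the first five dates, parsed as datetimes
df = pd.DataFrame(np.arange(5).reshape(-1, 1), columns=['data'], index=[dt.datetime.strptime(date, '%Y-%m-%d') for date in dates[0:5]])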
df
Out[90]:
          data
1990-01-02  0
1990-01-03  1
1990-01-04  2
1990-01-05  3
1990-01-08  4

In [91]:
df.index
Out[91]:
DatetimeIndex(['1990-01-02', '1990-01-03', '1990-01-04', '1990-01-05',
               '1990-01-08'],
              dtype='datetime64[ns]', freq=None)

In [92]:
df.index = df.index.date
df.index
Out[92]:
Index([1990-01-02, 1990-01-03, 1990-01-04, 1990-01-05, 1990-01-08], dtype='object')

In [93]:
next(df.iterrows())
Out[93]:
(datetime.date(1990, 1, 2), data    0
 Name: 1990-01-02, dtype: int32)
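
As Out[92] shows, assigning `df.index.date` turns the DatetimeIndex into a generic Index with dtype object. If you would rather leave the index untouched, one possible alternative is to convert each label while iterating. The sketch below assumes Python 3; the helper name `iterrows_with_dates` is hypothetical and not part of pandas.

```python
import datetime as dt

import pandas as pd


def iterrows_with_dates(frame):
    """Yield (date, row) pairs without modifying frame.index.

    Hypothetical helper: assumes frame has a DatetimeIndex.
    """
    for label, row in frame.iterrows():
        # label is a pandas Timestamp; .date() returns a datetime.date
        yield label.date(), row


df = pd.DataFrame(
    {"data": range(5)},
    index=pd.to_datetime(
        ["1990-01-02", "1990-01-03", "1990-01-04", "1990-01-05", "1990-01-08"]
    ),
)

gen = iterrows_with_dates(df)
first_label, first_row = next(gen)
```

The original frame keeps its DatetimeIndex, so date-based slicing and resampling remain available, while the generator hands out plain date objects.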