Why is the 'Date' column getting replaced by the last working day?
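I'm working with a DataFrame that contains a date column, and I need to find the last working date of each month. The code I'm using works, but I don't understand why it works.
The DataFrame 'apple' originally has 6 columns, but I'm mainly concerned with the 'Date' column, which covers every month from 1980 to 2014.
Sample data: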
Date Open High Low Close Volume Adj Close
0 2014-07-08 96.27 96.80 93.92 95.35 65130000 95.35
1 2014-07-07 94.14 95.99 94.10 95.97 56305400 95.97
2 2014-07-03 93.67 94.10 93.20 94.03 22891800 94.03
3 2014-07-02 93.87 94.06 93.09 93.48 28420900 93.48
4 2014-07-01 93.52 94.07 93.13 93.52 38170200 93.52
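import pandas as pd
from pandas.tseries.offsets import MonthEnd
# Roll every date forward to the end of its month
apple['Last_Day'] = pd.to_datetime(apple['Date'], format="%Y-%m") + MonthEnd(0)
# Keep only the first row seen for each distinct month end
banana = apple.loc[~apple.Last_Day.duplicated()]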
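I expected the newly created 'Last_Day' column to hold the last day of each month. What surprised me is that the 'Date' column now shows the last working day of each month. I never assigned anything to 'Date', so I don't understand how all of its values were replaced by the last working day.
Output: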
Date Open High Low Close Volume Adj Close Last_Day
0 2014-07-08 96.27 96.80 93.92 95.35 65130000 95.35 2014-07-31
5 2014-06-30 92.10 93.73 92.09 92.93 49482300 92.93 2014-06-30
26 2014-05-30 637.98 644.17 628.90 633.00 141005200 90.43 2014-05-31
47 2014-04-30 592.64 599.43 589.80 590.09 114160200 83.83 2014-04-30
68 2014-03-31 539.23 540.81 535.93 536.74 42167300 76.25 2014-03-31
89 2014-02-28 529.08 532.75 522.12 526.24 92992200 74.76 2014-02-28
108 2014-01-31 495.18 501.53 493.55 500.60 116199300 70.69 2014-01-31
No, my question is why the Date column is getting replaced by the last working date. I do want the last working day, but I don't understand how the Date column was replaced by it.
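Nothing is replaced. The Date value shown for each month depends on which rows of the Date column remain after the duplicates are removed. Your data is sorted from newest to oldest and duplicated() keeps the first occurrence, so the row that survives for each month is the one with the latest date present in the data.
That is why the remaining Date values mostly coincide with Last_Day. The exceptions are months whose calendar month end is not in the data: July 2014, where the latest date available is 2014-07-08, and May 2014, where it is 2014-05-30.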
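A minimal sketch of that point, assuming a made-up four-row frame named `df` (hypothetical, not the asker's data) sorted newest first like the original. It shows that filtering on `Last_Day` only selects rows and leaves the `Date` values untouched:

```python
import pandas as pd
from pandas.tseries.offsets import MonthEnd

# Hypothetical miniature of the question's data, newest row first.
df = pd.DataFrame({"Date": ["2014-07-08", "2014-07-07", "2014-06-30", "2014-06-27"]})
original_dates = df["Date"].copy()

# Roll each date forward to its month end (dates already on a month end stay put).
df["Last_Day"] = pd.to_datetime(df["Date"]) + MonthEnd(0)

# duplicated() flags every repeat after the first occurrence, so the inverted
# mask keeps only the first row seen for each month end.
first_per_month = df.loc[~df["Last_Day"].duplicated()]

print(first_per_month)
# The Date column itself was never modified.
print(df["Date"].equals(original_dates))
```

Because the rows are in descending order, the first row kept for each month is also the latest date that month has in the data.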
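To make this easier to see, here is the same idea with modified data and explicit sorting, taking either the first or the last value of each month: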
print(apple)
Date Open High Low Close Volume Adj Close
0 2014-07-08 96.27 96.80 93.92 95.35 65130000 95.35
1 2014-06-07 94.14 95.99 94.10 95.97 56305400 95.97
2 2014-06-03 93.67 94.10 93.20 94.03 22891800 94.03
3 2014-05-31 93.87 94.06 93.09 93.48 28420900 93.48
4 2014-07-31 93.52 94.07 93.13 93.52 38170200 93.52
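import pandas as pd
from pandas.tseries.offsets import MonthEnd
# Parse the dates, then sort ascending so the earliest date of each month comes first
apple['Date'] = pd.to_datetime(apple['Date'])
apple = apple.sort_values('Date')
print(apple)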
Date Open High Low Close Volume Adj Close
3 2014-05-31 93.87 94.06 93.09 93.48 28420900 93.48
2 2014-06-03 93.67 94.10 93.20 94.03 22891800 94.03
1 2014-06-07 94.14 95.99 94.10 95.97 56305400 95.97
0 2014-07-08 96.27 96.80 93.92 95.35 65130000 95.35
4 2014-07-31 93.52 94.07 93.13 93.52 38170200 93.52
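apple['Last_Day'] = apple['Date'] + MonthEnd(0)
# duplicated() flags repeats after the first occurrence, so this keeps the earliest row per month
banana = apple.loc[~apple.Last_Day.duplicated()]
print(banana)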
Date Open High Low Close Volume Adj Close Last_Day
3 2014-05-31 93.87 94.06 93.09 93.48 28420900 93.48 2014-05-31
2 2014-06-03 93.67 94.10 93.20 94.03 22891800 94.03 2014-06-30
0 2014-07-08 96.27 96.80 93.92 95.35 65130000 95.35 2014-07-31
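As an aside, the last value of each month can also be taken from the ascending order without re-sorting, by passing `keep="last"` to `duplicated()`. This is a sketch under the assumption that only the dates matter; the frame `df` below is a hypothetical stand-in that reuses the dates from the example above:

```python
import pandas as pd
from pandas.tseries.offsets import MonthEnd

# Same dates as the example above, already sorted ascending.
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2014-05-31", "2014-06-03", "2014-06-07", "2014-07-08", "2014-07-31"]
    )
})
df["Last_Day"] = df["Date"] + MonthEnd(0)

# keep="last" flags every occurrence except the final one, so the inverted
# mask keeps the last row seen for each month end.
last_rows = df.loc[~df["Last_Day"].duplicated(keep="last")]
print(last_rows)
```

The result still depends on the row order, so the frame has to be sorted by date first.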
from pandas.tseries.offsets import MonthEnd
apple['Date'] = pd.to_datetime(apple['Date'])
# Sort descending so the latest date of each month comes first
apple1 = apple.sort_values('Date', ascending=False)
print(apple1)
Date Open High Low Close Volume Adj Close
4 2014-07-31 93.52 94.07 93.13 93.52 38170200 93.52
0 2014-07-08 96.27 96.80 93.92 95.35 65130000 95.35
1 2014-06-07 94.14 95.99 94.10 95.97 56305400 95.97
2 2014-06-03 93.67 94.10 93.20 94.03 22891800 94.03
3 2014-05-31 93.87 94.06 93.09 93.48 28420900 93.48
apple1['Last_Day'] = apple1['Date'] + MonthEnd(0)
# With descending order, the first row kept per month is the latest date in the data
banana1 = apple1.loc[~apple1.Last_Day.duplicated()]
print(banana1)
Date Open High Low Close Volume Adj Close Last_Day
4 2014-07-31 93.52 94.07 93.13 93.52 38170200 93.52 2014-07-31
1 2014-06-07 94.14 95.99 94.10 95.97 56305400 95.97 2014-06-30
3 2014-05-31 93.87 94.06 93.09 93.48 28420900 93.48 2014-05-31
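If relying on the sort order feels fragile, one possible alternative (not part of the original answer) is to group by calendar month and pick the row with the largest date. The sketch below assumes a hypothetical frame `df` with only `Date` and `Close` columns, in the unsorted order of the example data:

```python
import pandas as pd

# Hypothetical frame in the same unsorted order as the example data.
df = pd.DataFrame({
    "Date": pd.to_datetime(
        ["2014-07-08", "2014-06-07", "2014-06-03", "2014-05-31", "2014-07-31"]
    ),
    "Close": [95.35, 95.97, 94.03, 93.48, 93.52],
})

# Calendar month of each row, used only as the grouping key.
month = df["Date"].dt.to_period("M").rename("Month")

# Index label of the latest date within each month, independent of row order.
last_idx = df.groupby(month)["Date"].idxmax()
last_rows = df.loc[last_idx]
print(last_rows)
```

This returns the latest date present in the data for each month, which is the last trading day only if the data contains trading days alone.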