Pandas split date and time in different columns
I have a date column like this:
0 Feb-23-21 10:35AM
1 10:18AM
2 10:13AM
3 10:10AM
4 09:15AM
5 09:02AM
6 08:13AM
7 08:07AM
8 05:34AM
9 12:52AM
10 Feb-22-21 07:00PM
11 07:00PM
12 06:22PM
13 05:56PM
14 05:18PM
15 05:07PM
16 05:00PM
17 04:31PM
18 04:11PM
19 04:05PM
The expected output is the date and time split into separate columns, like this:
0 Feb-23-21 10:35AM
1 Feb-23-21 10:18AM
2 Feb-23-21 10:13AM
3 Feb-23-21 10:10AM
4 Feb-23-21 09:15AM
5 Feb-23-21 09:02AM
6 Feb-23-21 08:13AM
7 Feb-23-21 08:07AM
8 Feb-23-21 05:34AM
9 Feb-23-21 12:52AM
10 Feb-22-21 07:00PM
11 Feb-22-21 07:00PM
12 Feb-22-21 06:22PM
13 Feb-22-21 05:56PM
14 Feb-22-21 05:18PM
15 Feb-22-21 05:07PM
16 Feb-22-21 05:00PM
17 Feb-22-21 04:31PM
18 Feb-22-21 04:11PM
19 Feb-22-21 04:05PM
Ideally, I want to show the date and time in different columns. I am actually scraping the news from here, and the code I wrote is:
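import pandas as pd

# Parse the news table out of the scraped page (response is the Scrapy response)
news = pd.read_html(str(response.body), attrs={'class': 'fullview-news-outer'})[0]
# Collect the article links in page order
links = []
for a in response.css('a[class="tab-link-news"]::attr(href)').getall():
    links.append(a)
news.columns = ['Date', 'News Headline']
news['Article Link'] = links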
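With the given date/time format, you can
- split each string on the space between date and time
- put the second-to-last element into a "date" column and forward-fill the blanks
- put the last element into a "time" column
Ex: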
import pandas as pd

df = pd.DataFrame({'input': ["Feb-23-21 10:35AM", "10:18AM", "10:13AM", "Feb-22-21 07:00PM", "07:00PM", "06:22PM"]})
# rows without a date have only one element, so .str[-2] yields NaN there; ffill() copies the last seen date down
df['date'] = df['input'].str.split(' ').str[-2].ffill()
df['time'] = df['input'].str.split(' ').str[-1]
# df
#                input       date     time
# 0  Feb-23-21 10:35AM  Feb-23-21  10:35AM
# 1            10:18AM  Feb-23-21  10:18AM
# 2            10:13AM  Feb-23-21  10:13AM
# 3  Feb-22-21 07:00PM  Feb-22-21  07:00PM
# 4            07:00PM  Feb-22-21  07:00PM
# 5            06:22PM  Feb-22-21  06:22PM
Now you can also convert from string to datetime, e.g.
df['datetime'] = pd.to_datetime(df['date']+' '+df['time'])
# df['datetime']
# 0 2021-02-23 10:35:00
# 1 2021-02-23 10:18:00
# 2 2021-02-23 10:13:00
# 3 2021-02-22 19:00:00
# 4 2021-02-22 19:00:00
# 5 2021-02-22 18:22:00
# Name: datetime, dtype: datetime64[ns]
This gives you more options for processing the data further.
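To tie this back to the question, here is a minimal sketch of the same steps applied to a frame shaped like the scraped `news` table. The helper name `split_date_time`, the sample rows, and the explicit format string `%b-%d-%y %I:%M%p` are assumptions made for illustration; the format is inferred from the sample values shown above and would need adjusting if the site formats dates differently.

```python
import pandas as pd

# Hypothetical stand-in for the scraped `news` frame; headlines are placeholders.
news = pd.DataFrame({
    "Date": ["Feb-23-21 10:35AM", "10:18AM", "Feb-22-21 07:00PM", "06:22PM"],
    "News Headline": ["a", "b", "c", "d"],
})

def split_date_time(frame, column="Date"):
    """Return a copy with 'Date', 'Time' and a parsed 'Datetime' column."""
    out = frame.copy()
    # Split on whitespace; rows without a date produce a one-element list.
    parts = out[column].str.strip().str.split()
    out["Time"] = parts.str[-1]
    # .str[-2] is NaN where the date is missing, so forward-fill from the row above.
    out["Date"] = parts.str[-2].ffill()
    # Assumed format: abbreviated month, day, two-digit year, 12-hour clock with AM/PM.
    out["Datetime"] = pd.to_datetime(
        out["Date"] + " " + out["Time"], format="%b-%d-%y %I:%M%p"
    )
    return out

result = split_date_time(news)
print(result[["Date", "Time", "Datetime"]])
```

Passing an explicit `format` avoids relying on pandas to guess the layout, and the forward fill only works if the first row of the table carries a date.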