Hive: how to convert yyyy-mm-ddThh:mm:SS:sssZ into hour units

I have the following timestamps:

2020-03-09T07:34:06:825Z
2020-03-09T07:54:12:220Z
2020-03-09T03:54:11:041Z
2020-03-09T09:22:10:220Z
2020-03-09T11:13:36:217Z
2020-03-09T11:23:26:040Z
2020-03-09T11:43:35:721Z

I would like to truncate them to the hour, for example:

2020-03-09T07:00:00
2020-03-09T07:00:00
2020-03-09T03:00:00
2020-03-09T09:00:00
2020-03-09T11:00:00
2020-03-09T11:00:00
2020-03-09T11:00:00

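Is this possible? Any help would be greatly appreciated. Stack Overflow has always been a lifesaver. The result can be a datetime or a string. Thanks, everyone!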

Use the unix_timestamp and from_unixtime functions to parse the timestamp and format it as required:

select from_unixtime(unix_timestamp(string("2020-03-09T07:34:06:825Z"),"yyyy-MM-dd'T'HH:mm:ss:SSS'Z'"),"yyyy-MM-dd'T'HH:00:00") as new_ts;

+-------------------+
|new_ts             |
+-------------------+
|2020-03-09T07:00:00|
+-------------------+

Explanation:

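unix_timestamp(
string("2020-03-09T07:34:06:825Z"), --sample data
"yyyy-MM-dd'T'HH:mm:ss:SSS'Z'") --match the data format; HH is the 24-hour field (hh is 12-hour and mishandles afternoon hours)

from_unixtime('unix_timestamp...etc',"yyyy-MM-dd'T'HH:00:00") --format as required, with minutes and seconds fixed to 00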
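The same expression can be applied to a column instead of a literal. The following is only a sketch: the table name events and the column name ts_str are hypothetical placeholders, not names from the question. The pattern uses HH because, in the SimpleDateFormat-style patterns these functions accept, hh is the 12-hour field. Both functions work in the session time zone, and the 'Z' is matched as a literal character, so the hour is generally carried through as written rather than shifted from UTC.

```sql
-- Hypothetical table `events` with a string column `ts_str`
-- holding values such as '2020-03-09T07:34:06:825Z'.
select ts_str,
       from_unixtime(
         unix_timestamp(ts_str, "yyyy-MM-dd'T'HH:mm:ss:SSS'Z'"), -- parse the string to epoch seconds
         "yyyy-MM-dd'T'HH:00:00"                                 -- format back, zeroing minutes and seconds
       ) as hour_ts
from events;
```

In Hive, a value that does not match the pattern is expected to come back as NULL rather than raise an error, so it may be worth checking for NULLs in hour_ts on real data.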

Using regexp_replace:

with your_data as (
select stack(7, --first argument of stack() is the number of rows to generate
'2020-03-09T07:34:06:825Z',
'2020-03-09T07:54:12:220Z',
'2020-03-09T03:54:11:041Z',
'2020-03-09T09:22:10:220Z',
'2020-03-09T11:13:36:217Z',
'2020-03-09T11:23:26:040Z',
'2020-03-09T11:43:35:721Z'
) as str
)

--backslashes are doubled because Hive treats \ as an escape character in string literals
select regexp_replace(str,'(\\d{4}-\\d{2}-\\d{2})T(\\d{2}).*','$1T$2:00:00')
   from your_data;

Result:

2020-03-09T07:00:00
2020-03-09T07:00:00
2020-03-09T03:00:00
2020-03-09T09:00:00
2020-03-09T11:00:00
2020-03-09T11:00:00
2020-03-09T11:00:00

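Explanation:

The regular expression defines two capture groups:

$1 is the date part, (\d{4}-\d{2}-\d{2})

$2 is the hour part after the T, (\d{2}). Everything else, matched by the trailing .*, is discarded.

The replacement string '$1T$2:00:00' puts the two groups back together and appends :00:00.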
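Because every sample value has the same fixed width, plain string functions are another option that avoids the regex entirely. This is a minimal sketch and assumes that all inputs follow exactly the layout shown in the question, so that the first 13 characters are always the date, the T, and the two-digit hour. The literal stands in for the real column.

```sql
-- Keep 'yyyy-MM-ddTHH' (the first 13 characters) and append ':00:00'.
-- Assumes a fixed-width input; no validation is performed.
select concat(substr('2020-03-09T07:34:06:825Z', 1, 13), ':00:00') as new_ts;
-- returns 2020-03-09T07:00:00
```

Unlike the unix_timestamp approach, this does not check that the input is a valid timestamp, so malformed strings pass through truncated rather than becoming NULL.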