Hive: how to convert yyyy-mm-ddThh:mm:SS:sssZ into hour units
I have the following timestamps:
2020-03-09T07:34:06:825Z
2020-03-09T07:54:12:220Z
2020-03-09T03:54:11:041Z
2020-03-09T09:22:10:220Z
2020-03-09T11:13:36:217Z
2020-03-09T11:23:26:040Z
2020-03-09T11:43:35:721Z
I would like to truncate them to the hour, for example:
2020-03-09T07:00:00
2020-03-09T07:00:00
2020-03-09T03:00:00
2020-03-09T09:00:00
2020-03-09T11:00:00
2020-03-09T11:00:00
2020-03-09T11:00:00
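Is this possible? Any help would be greatly appreciated. Stack Overflow has always been a lifesaver. The result can be either a datetime or a string.
Thanks, everyone!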
Use the unix_timestamp and from_unixtime functions to parse the timestamp and reformat it as needed:
select from_unixtime(unix_timestamp(string("2020-03-09T07:34:06:825Z"),"yyyy-MM-dd'T'HH:mm:ss:SSS'Z'"),"yyyy-MM-dd'T'HH:00:00") as new_ts;
+-------------------+
|new_ts |
+-------------------+
|2020-03-09T07:00:00|
+-------------------+
Explanation:
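unix_timestamp(
string("2020-03-09T07:34:06:825Z"), --sample data
"yyyy-MM-dd'T'HH:mm:ss:SSS'Z'") --pattern matching the input format; HH is the 24-hour clock (hh is 12-hour and breaks for 13:00 onwards)
from_unixtime(<unix_timestamp result>, "yyyy-MM-dd'T'HH:00:00") --format the output with minutes and seconds fixed at 00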
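The same expression can be applied to a column instead of a literal. The following is a minimal sketch, assuming a hypothetical table `events` with a string column `event_ts` in the format shown in the question; neither name comes from the original post.

```sql
-- Hypothetical table `events` with a string column `event_ts`
-- holding values such as '2020-03-09T07:34:06:825Z'.
SELECT
  event_ts,
  from_unixtime(
    -- parse the string; 'T' and 'Z' are quoted so they are treated as literals
    unix_timestamp(event_ts, "yyyy-MM-dd'T'HH:mm:ss:SSS'Z'"),
    -- print it back with minutes and seconds hard-coded to 00
    "yyyy-MM-dd'T'HH:00:00"
  ) AS event_hour
FROM events;
```

Both functions work in the session time zone, so the round trip should normally return the same wall-clock hour that was in the input. It is still worth spot-checking rows near daylight-saving transitions and rows that do not match the pattern.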
Using regexp_replace:
with your_data as (
select stack(7, --number of rows to generate
'2020-03-09T07:34:06:825Z',
'2020-03-09T07:54:12:220Z',
'2020-03-09T03:54:11:041Z',
'2020-03-09T09:22:10:220Z',
'2020-03-09T11:13:36:217Z',
'2020-03-09T11:23:26:040Z',
'2020-03-09T11:43:35:721Z'
) as str
)
--backslashes are doubled because Hive unescapes string literals
select regexp_replace(str,'(\\d{4}-\\d{2}-\\d{2})T(\\d{2}).*','$1T$2:00:00')
from your_data;
Result:
2020-03-09T07:00:00
2020-03-09T07:00:00
2020-03-09T03:00:00
2020-03-09T09:00:00
2020-03-09T11:00:00
2020-03-09T11:00:00
2020-03-09T11:00:00
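Explanation:
The regular expression defines two groups:
$1 is the date part, (\d{4}-\d{2}-\d{2})
$2 is the hour part after the T, (\d{2})
The trailing .* matches everything else up to the end of the string, so it is dropped.
The replacement string '$1T$2:00:00' puts the date and hour back and sets the minutes and seconds to 00.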
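A common reason to truncate to the hour is to aggregate per hour. The sketch below shows one way this might look with the regexp_replace approach, again assuming a hypothetical table `events` with a string column `event_ts` (both names are illustrative, not from the original post).

```sql
-- Count rows per hour bucket.
-- `events` and `event_ts` are hypothetical names used for illustration.
SELECT
  regexp_replace(event_ts, '(\\d{4}-\\d{2}-\\d{2})T(\\d{2}).*', '$1T$2:00:00') AS event_hour,
  count(*) AS n_events
FROM events
-- the expression is repeated here rather than referenced by its alias
GROUP BY regexp_replace(event_ts, '(\\d{4}-\\d{2}-\\d{2})T(\\d{2}).*', '$1T$2:00:00');
```

Because the input layout is fixed-width, `concat(substr(event_ts, 1, 13), ':00:00')` should produce the same string without a regular expression, since the first 13 characters cover the date, the `T`, and the two-digit hour.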