R mdy_hms unpredictable results?
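I'm using the mdy_hms function to process some data and ran into an interesting problem. I upload data from many sources, but all of it should be CSV and follow the same guidelines, so it should all be in the same format.
I have 2 variables.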
> good_time
[1] "12/28/2019 16:22"
> test_time
[1] "3/4/2020 16:46"
> str(good_time)
chr "12/28/2019 16:22"
> str(test_time)
chr "3/4/2020 16:46"
So they look identical to me in terms of format, but good_time parses fine with mdy_hms while test_time does not. Can anyone explain why?
> mdy_hms(good_time)
[1] "2020-12-28 19:16:22 UTC"
> mdy_hms(test_time)
[1] NA
Warning message:
All formats failed to parse. No formats found.
Oddly, if I use mdy_hm(test_time) it works fine.
> mdy_hm(test_time)
[1] "2020-03-04 16:46:00 UTC"
lubridate expects leading zeros in single-digit months (and days).
From ?lubridate::mdy_hms:
truncated: integer, indicating how many formats can be missing. See
details.
...
The most common type of irregularity in date-time data is the
truncation due to rounding or unavailability of the time stamp. If
the 'truncated' parameter is non-zero, the 'ymd_hms()' functions
also check for truncated formats. For example, 'ymd_hms()' with
'truncated = 3' will also parse incomplete dates like 2012-06-01
12:23, 2012-06-01 12 and '2012-06-01'. NOTE: The 'ymd()' family of
functions is based on 'base::strptime()' which currently fails to
parse %y-%m formats.
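To make the quoted passage concrete, here is a small sketch of `truncated` applied to the kinds of strings the help text mentions. It assumes the lubridate package is installed; the input vector `x` is illustrative and does not come from the question's data.

```r
library(lubridate)

# Illustrative inputs (not from the question):
# full timestamp, no seconds, hour only, date only.
x <- c("2012-06-01 12:23:45", "2012-06-01 12:23", "2012-06-01 12", "2012-06-01")

# truncated = 3 allows up to three trailing fields (seconds, minutes, hours)
# to be absent; missing fields are filled with zero.
parsed <- ymd_hms(x, truncated = 3)
print(parsed)
```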
Just add truncated=1:
lubridate::mdy_hms("3/4/2020 16:46", truncated=1)
# [1] "2020-03-04 16:46:00 UTC"
(This was also discussed in tidyverse/lubridate#669.)
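One thing worth checking in the question's output: mdy_hms(good_time) printed 2020-12-28 19:16:22, which is not the timestamp in the string (12/28/2019 16:22). It looks as though the year was read as "20" and the remaining digits as hours, minutes and seconds, so a result that is not NA is not necessarily a correct one. Since neither string has a seconds field, a more defensive option is to spell the format out. The sketch below uses base R's as.POSIXct instead of lubridate, which is a swapped-in technique and not what the answer above proposes; the UTC time zone is an assumption.

```r
# Both inputs are month/day/year hour:minute, with no seconds field.
times <- c("12/28/2019 16:22", "3/4/2020 16:46")

# Explicit format: %Y is a four-digit year, and there is no %S
# because the data carries no seconds. tz = "UTC" is an assumption.
parsed <- as.POSIXct(times, format = "%m/%d/%Y %H:%M", tz = "UTC")

print(format(parsed, "%Y-%m-%d %H:%M:%S"))
# [1] "2019-12-28 16:22:00" "2020-03-04 16:46:00"
```

With an explicit format, a string that does not match yields NA instead of being silently reinterpreted, which makes inconsistent source files easier to spot.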