Possible bug in as.POSIXct
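I am working with time data and converting it to the POSIXct class (the values are read in as character strings). The conversion works for all of my data except one specific string. What I do is essentially: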
Time1 <- '1900-04-01' # First Year then Month then Day
Time1_convert <- as.POSIXct( Time1, format='%Y-%m-%d')
I vectorized this, and all of my data converted fine, except for the date 1920-05-01:
Time1 <- '1920-05-01'
Time1_convert <- as.POSIXct( Time1, format='%Y-%m-%d' )
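This returns NA, and I have no idea why. If I add tz = 'GMT' to the as.POSIXct call, every value converts fine. What I don't understand is why this happens, and why it happens for this one date when I have tried more than 1500 different time values.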
I am adding an image of the output:
I added more code:
for( m in c(01,02,03,04,05,06,07,08,09,10,11,12)){
print(as.POSIXct(paste0('1920-',m,'-01'),format='%Y-%m-%d'))
}
The output is:
[1] "1920-01-01 CMT"
[1] "1920-02-01 CMT"
[1] "1920-03-01 CMT"
[1] "1920-04-01 CMT"
[1] NA
[1] "1920-06-01 -04"
[1] "1920-07-01 -04"
[1] "1920-08-01 -04"
[1] "1920-09-01 -04"
[1] "1920-10-01 -04"
[1] "1920-11-01 -04"
[1] "1920-12-01 -04"
Output of sessionInfo():
R version 3.3.3 (2017-03-06)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 9 (stretch)
locale:
[1] LC_CTYPE=es_AR.UTF-8 LC_NUMERIC=C
[3] LC_TIME=es_AR.UTF-8 LC_COLLATE=es_AR.UTF-8
[5] LC_MONETARY=es_AR.UTF-8 LC_MESSAGES=es_AR.UTF-8
[7] LC_PAPER=es_AR.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=es_AR.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base
loaded via a namespace (and not attached):
[1] tools_3.3.3
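Your locale appears to be set to Argentina. As it happens, Argentina reset its time zone from UTC-4:16:48 to UTC-4 on that very day. I believe this means there was no midnight in Argentina on May 1, 1920: the clocks jumped from 00:00:00 straight to 00:16:48. When you convert that string to POSIXct, it is interpreted as midnight of that day in your local time zone, which happens to be a time that did not exist in Argentina. (This also explains why other people who tried the same code could not reproduce the problem.)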
http://www.statoids.com/tar.html
Locations in Argentina observed Local Mean Time until 1894-10-31 00:00
(as measured after the transition). At that moment, the entire country
synchronized on Córdoba's Local Mean Time, which was UTC-4:16:48. The
next transition occurred at 1920-05-01 00:00, when clocks were set
ahead sixteen minutes and forty-eight seconds to be an even UTC-4.
Argentina remained unified on UTC-4 until its first daylight saving
time was inaugurated in 1931.
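To check this without depending on the machine's locale, the time zone can be passed explicitly. The following is only a sketch: the zone name "America/Argentina/Buenos_Aires" is an assumption (the question does not show the asker's actual TZ setting), and whether the nonexistent midnight comes back as NA or as a shifted time can depend on the R version, the platform, and the installed time zone database.

```r
# Assumed zone name; the asker's real TZ setting is not shown in the question.
tz_ar <- "America/Argentina/Buenos_Aires"

dates <- c("1920-04-30", "1920-05-01", "1920-05-02")

# Midnight in the Argentine zone: 1920-05-01 00:00 falls inside the skipped interval.
local_midnight <- as.POSIXct(dates, format = "%Y-%m-%d", tz = tz_ar)

# Midnight in UTC: no transition on that day, so every date has a midnight.
utc_midnight <- as.POSIXct(dates, format = "%Y-%m-%d", tz = "UTC")

print(local_midnight)
print(is.na(local_midnight))  # shows which dates, if any, failed to convert
print(utc_midnight)
```

Comparing the two printed vectors shows whether the failure is tied to the zone rather than to the string itself.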
If you need POSIXct objects, you could consider:
a) Specifying a different time zone, one in which midnight exists on that day.
as.POSIXct("1920-05-01", tz = "UTC")
# Or perhaps other nearby time zones didn't have that specific problem?
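b) Storing the time in components, one for the date and one for the time of day, e.g. time = hour(Time1) + minute(Time1)/60. It is a bit clunky, but it lets you carry out the date/time calculations you need.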
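Options (a) and (b) might look roughly like this in base R. This is a sketch under assumptions: the `stamps` vector is hypothetical sample data (the question only shows dates without a time of day), and base `as.POSIXlt` fields are used here in place of the `hour()` and `minute()` helpers mentioned above.

```r
# Hypothetical input with a time of day; not taken from the question.
# "1920-05-01 00:10" lies inside the interval that Argentina skipped.
stamps <- c("1920-04-30 23:30", "1920-05-01 00:10", "1920-05-01 12:45")

# Option (a): parse in a zone that has no transition on that day, here UTC.
parsed_utc <- as.POSIXct(stamps, format = "%Y-%m-%d %H:%M", tz = "UTC")

# Option (b): keep the calendar date and the time of day as separate columns.
parts <- as.POSIXlt(parsed_utc, tz = "UTC")
components <- data.frame(
  date = as.Date(parsed_utc, tz = "UTC"),
  time = parts$hour + parts$min / 60  # decimal hours since midnight
)

print(components)
```

Note that labelling local clock readings as UTC is a convenience: the values no longer carry the real Argentine offset, so this suits calculations on clock times rather than conversions between zones.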