Convert time column in character format to manipulable time format in R

My question is about standardizing column b. I need the data in a format that makes it easier to build plots.

a<- c("Jackson Brice / The Shocker","Flash Thompson", "Mr. Harrington","Mac Gargan","Betty Brant", "Ann Marie Hoag","Steve Rogers / Captain America", "Pepper Potts", "Karen") 
b<- c("2:30", "2:15", "2", "1:15", "1:15", "1", ":55",":45", "v")

ab <- cbind.data.frame(a,b)

                               a    b
1    Jackson Brice / The Shocker 2:30
2                 Flash Thompson 2:15
3                 Mr. Harrington    2
4                     Mac Gargan 1:15
5                    Betty Brant 1:15
6                 Ann Marie Hoag    1
7 Steve Rogers / Captain America  :55
8                   Pepper Potts  :45
9                          Karen    v

Desired output:

                               a        b
1    Jackson Brice / The Shocker 00:02:30
2                 Flash Thompson 00:02:15
3                 Mr. Harrington 00:02:00
4                     Mac Gargan 00:01:15
5                    Betty Brant 00:01:15
6                 Ann Marie Hoag 00:01:00
7 Steve Rogers / Captain America 00:00:55
8                   Pepper Potts 00:00:45
9                          Karen 00:00:00

If possible, the values in column b should end up in a manipulable time format.

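A solution is possible using tidyr::separate and tidyr::unite. The approach is to first replace any value that contains alphabetic characters with 00:00:00, then split each value into 3 columns. dplyr::mutate_at is then used to pad all 3 columns to the two-digit 00 format. Finally, the three columns are united again.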

library(tidyverse)

ab %>% mutate_if(is.factor, as.character) %>%  # Convert any factor column to character
  mutate(b = ifelse(grepl("[[:alpha:]]", b), "00:00:00", b)) %>%
  mutate(b = ifelse(grepl(":", b), b, paste(b,"00",sep=":")) ) %>%
  separate(b, into = c("b1", "b2", "b3"), sep = ":", fill="left", extra = "drop") %>%
  # funs() is deprecated in current dplyr, so a formula lambda is used instead
  mutate_at(vars(starts_with("b")), 
      ~ sprintf("%02d", as.numeric(ifelse(is.na(.x) | .x == "", 0, .x)))) %>%
  unite("b", starts_with("b"), sep=":")

#                                a        b
# 1    Jackson Brice / The Shocker 00:02:30
# 2                 Flash Thompson 00:02:15
# 3                 Mr. Harrington 00:02:00
# 4                     Mac Gargan 00:01:15
# 5                    Betty Brant 00:01:15
# 6                 Ann Marie Hoag 00:01:00
# 7 Steve Rogers / Captain America 00:00:55
# 8                   Pepper Potts 00:00:45
# 9                          Karen 00:00:00
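
The result above is still a character column. As a minimal base R sketch of getting from strings in that 00:02:30 form to something you can do arithmetic on and plot, the values can be turned into seconds and wrapped in a difftime. The sample vector and the names parts, secs and screen_time are only illustrative, and I am assuming every string has exactly three colon-separated fields.

```r
# A few normalised strings in the HH:MM:SS form shown above (illustrative sample)
b <- c("00:02:30", "00:02:15", "00:02:00", "00:00:55", "00:00:00")

# Split each string into its three fields; one row per value
parts <- do.call(rbind, strsplit(b, ":", fixed = TRUE))

# Combine hours, minutes and seconds into a plain number of seconds
secs <- as.numeric(parts[, 1]) * 3600 +
        as.numeric(parts[, 2]) * 60 +
        as.numeric(parts[, 3])

# Wrap as a difftime so the unit travels with the values
screen_time <- as.difftime(secs, units = "secs")

# Numeric minutes, convenient for a plotting axis
mins <- as.numeric(screen_time, units = "mins")
```

A plain numeric column such as mins is usually the easiest thing to hand to a plotting function, while screen_time keeps the unit attached if you want to add or compare times first.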

Data:

a<- c("Jackson Brice / The Shocker","Flash Thompson", "Mr. Harrington","Mac Gargan","Betty Brant",
 "Ann Marie Hoag","Steve Rogers / Captain America", "Pepper Potts", "Karen") 
b<- c("2:30", "2:15", "2", "1:15", "1:15", "1", ":55",":45", "v")

ab <- cbind.data.frame(a,b)

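I had to make some assumptions about what you are trying to do, such as the units and what you want to happen with the character values, but hopefully this function gives you something useful.

The biggest challenge with times is that parsing them from text requires fairly explicit rules. As a result, I had to put a few if statements into the function to make it work, but wherever possible, try to keep your time formats as consistent as you can.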

library(lubridate)
library(data.table)  # provides rbindlist(), used below

formatTime <- function(x) {

    # Check for a : separator in the text
    if(grepl(":",x, fixed = TRUE)) {

        y <- unlist(strsplit(x,":", fixed = TRUE))

        # If there is no value before the : then add "00" before the :
        if(y[1]=="") {
            z <- ms(paste("00", y[2], sep = ":"), quiet=TRUE)
        } else {
            z <- ms(paste(y,collapse = ":"), quiet=TRUE)
        }
    } else { 

        # If there is no : then treat the value as minutes and append ":00"
        z <- ms(paste(x, "00", sep = ":"), quiet=TRUE)
    }

    # If it did not parse with ms, i.e. it was a character, then assign zero time "0:00"
    if(is.na(z)) z <- ms("0:00")

    # Converted to duration due to issues returning period with lapply.  
    # Make dataframe to return units and name with lapply.
    return(data.frame(time = as.duration(z)))
}

# Convert factor variable to character
ab$b <- as.character(ab$b)

ab <- cbind(ab,rbindlist(lapply(ab$b,formatTime)))

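I started out trying to use a period, but it would not return correctly from the apply statement, so I converted to a duration instead. This may display differently from your example, but it should work well for plotting.
If I have missed something you need, let me know and I will update the answer.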
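
For comparison, the same parsing rules can be sketched as a vectorised base R function that needs neither lapply nor extra packages. This is only a sketch under the same assumptions as above: the text before the colon is minutes, the text after it is seconds, a value without a colon is whole minutes, and anything that is not a number counts as zero. The name to_seconds is hypothetical, not an existing function.

```r
# Hypothetical helper: parse "M:SS"-style strings into numeric seconds
to_seconds <- function(x) {
  x <- as.character(x)

  # Values without a colon are treated as whole minutes: "2" becomes "2:"
  x <- ifelse(grepl(":", x, fixed = TRUE), x, paste0(x, ":"))

  mins <- sub(":.*$", "", x)     # text before the first colon
  secs <- sub("^[^:]*:", "", x)  # text after the first colon

  # Empty or non-numeric pieces count as zero (assumption, matches "v" -> 00:00:00)
  to_num <- function(s) {
    n <- suppressWarnings(as.numeric(s))
    ifelse(is.na(n), 0, n)
  }

  to_num(mins) * 60 + to_num(secs)
}

b <- c("2:30", "2:15", "2", "1:15", "1:15", "1", ":55", ":45", "v")
secs <- to_seconds(b)
```

Because the function takes and returns whole vectors, it can be assigned straight to a new column, for example ab$secs <- to_seconds(ab$b), and the numeric result can be used directly on a plot axis.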