Read data stored as character

I have a column in a data frame whose values look like this:

      City                Temperature
      Edmonton, Alberta   4.1,13.6,15.2,15.7,14.2,15.2,16,14.2,17,13.1
      Edmonton, Alberta   15,18.2,14.8,16.5,14.6,16.9,14.3,17.5,13,15.8
      Edmonton, Alberta   15.8,17.9,16.9,15.1,13.2,13.1,16.8,12.4,14.7,15.6
      Edmonton, Alberta   14.3,17.3,14.6,17.3,14.8,14,15.4,14.1,16,15.4

My objective is to read the data in the Temperature column and create two additional columns that store the minimum and maximum temperature, as shown below.

      City                Temperature                                         Min      Max
      Edmonton, Alberta   4.1,13.6,15.2,15.7,14.2,15.2,16,14.2,17,13.1        4.1      17
      Edmonton, Alberta   15,18.2,14.8,16.5,14.6,16.9,14.3,17.5,13,15.8       13       18.2
      Edmonton, Alberta   15.8,17.9,16.9,15.1,13.2,13.1,16.8,12.4,14.7,15.6   12.4     17.9
      Edmonton, Alberta   14.3,17.3,14.6,17.3,14.8,14,15.4,14.1,16,15.4       14       17.3

I tried a simple min(df$Temperature[1]), but it did not work. I am not sure how to handle this data, so any comments or suggestions are much appreciated.

We need to split the 'Temperature' column by ',', convert each piece to numeric, get the range, rbind the results, and create the two columns:

df1[c("Min", "Max")] <- do.call(rbind, lapply(strsplit(as.character(df1$Temperature), ','), 
                        function(x) range(as.numeric(x))))

The as.character call is only needed if the 'Temperature' column is of class factor.
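To make the steps above easier to try, here is a minimal self-contained sketch. The data frame is rebuilt by hand from the sample in the question, so its exact layout is an assumption. The use of `vapply` and `fixed = TRUE` is a small variation on the `lapply` / `do.call(rbind, ...)` code above, not part of the original answer.

```r
# Sample data rebuilt from the question (assumed layout)
df1 <- data.frame(
  City = rep("Edmonton, Alberta", 4),
  Temperature = c(
    "4.1,13.6,15.2,15.7,14.2,15.2,16,14.2,17,13.1",
    "15,18.2,14.8,16.5,14.6,16.9,14.3,17.5,13,15.8",
    "15.8,17.9,16.9,15.1,13.2,13.1,16.8,12.4,14.7,15.6",
    "14.3,17.3,14.6,17.3,14.8,14,15.4,14.1,16,15.4"
  ),
  stringsAsFactors = FALSE
)

# Split each string on commas; fixed = TRUE treats "," as a literal
parts <- strsplit(as.character(df1$Temperature), ",", fixed = TRUE)

# range() returns c(min, max); vapply checks that every result has length 2
rng <- vapply(parts, function(x) range(as.numeric(x)), numeric(2))

# rng is a 2 x n matrix, so transpose it before filling the two new columns
df1[c("Min", "Max")] <- t(rng)

print(df1[c("Min", "Max")])
```

With `vapply`, a row that does not parse into a length-2 numeric result raises an error instead of silently producing a ragged structure, which can help when the real data is messier than the sample.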

The scan function can read from a text field and parse the values separated by the character given in the "sep" argument:

> dat$min_temp <- sapply( as.character(dat$Temperature), 
                    function(x) min( as.numeric( scan( text=x, sep=",", what=""))))
Read 10 items
Read 10 items
Read 10 items
Read 10 items
> dat
              City                                       Temperature
1 Edmonton,Alberta      4.1,13.6,15.2,15.7,14.2,15.2,16,14.2,17,13.1
2 Edmonton,Alberta     15,18.2,14.8,16.5,14.6,16.9,14.3,17.5,13,15.8
3 Edmonton,Alberta 15.8,17.9,16.9,15.1,13.2,13.1,16.8,12.4,14.7,15.6
4 Edmonton,Alberta     14.3,17.3,14.6,17.3,14.8,14,15.4,14.1,16,15.4
  min_temp
1      4.1
2     13.0
3     12.4
4     14.0
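A possible variation on the scan approach, sketched under a few assumptions: `dat` is rebuilt by hand from the sample data, `parse_temps` is a hypothetical helper name, and the `max_temp` column is added for illustration. Passing `what = numeric()` lets scan return numbers directly, and `quiet = TRUE` suppresses the "Read 10 items" messages.

```r
# Sample data rebuilt from the question (assumed layout)
dat <- data.frame(
  City = rep("Edmonton, Alberta", 4),
  Temperature = c(
    "4.1,13.6,15.2,15.7,14.2,15.2,16,14.2,17,13.1",
    "15,18.2,14.8,16.5,14.6,16.9,14.3,17.5,13,15.8",
    "15.8,17.9,16.9,15.1,13.2,13.1,16.8,12.4,14.7,15.6",
    "14.3,17.3,14.6,17.3,14.8,14,15.4,14.1,16,15.4"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: parse one comma-separated string into a numeric vector
parse_temps <- function(x) {
  scan(text = x, sep = ",", what = numeric(), quiet = TRUE)
}

# Parse every row once, then reuse the parsed values for both summaries
temps <- lapply(as.character(dat$Temperature), parse_temps)
dat$min_temp <- vapply(temps, min, numeric(1))
dat$max_temp <- vapply(temps, max, numeric(1))

print(dat[c("min_temp", "max_temp")])
```

Parsing once into a list avoids calling scan twice per row when both the minimum and the maximum are needed.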