How to convert the columns of a particular Excel file to Date format?

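I have an Excel file with 77 columns (43 of them entirely NA) of different lengths, 12 of which are dates. Ideally, I would like to import it into an R dataset in which the date columns are in Date format and the other columns are numeric. There is a lot of material on Stack Overflow and I tried all the options, but none of them worked.

My first option was to import directly from Excel: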

library(readxl)

# This reads everything as numeric, but the date columns always come out as serial numbers such as "36164"
dataset <- read_xlsx("Data.xlsx", col_types = "numeric")

# I also tried something like this, where "n" stands for the number of numeric columns I have, but it did not work
dataset <- read_xlsx("Data.xlsx", col_types = c("date", rep("numeric", n)))

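I can import the data with the date columns read incorrectly. After some cleaning (removing the NA columns), I get a tbl whose columns have different lengths. I tried the following code to convert the incorrect date columns to Date format: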

dataset <- janitor::remove_empty(dataset, which = "cols") #remove NA columns
dataset <- dataset[-c(1),] #remove the first row of all columns

# Now using this command I could transform each incorrect date column into a date format:

  date <- as.Date(as.numeric(dataset$column1), origin = "1899-12-30")

# I would like to do it for all the date columns in one shot but when I try to do it in this way

  as.Date(as.numeric(dataset[,c(1,3,5,7,14,16,18,20,21,23,25,32)]), origin = "1899-12-30")

# I get an error, probably because the columns have different length
# the error is: Error in as.Date(as.numeric(var_dataset[, c(1, 3, 5, 7, 14, 16, 18, 20,  : 
  'list' object cannot be coerced to type 'double'
# unlisting the object doesn't solve the problem

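I know this lacks the data needed to reproduce my problem, but in the first case I do not know how to provide my rather large Excel file, and in the second case I do not know how to create a tbl with many columns of different lengths without wasting a lot of time. Sorry.

Do you have any solution, either importing directly from Excel or working on the data frame?

Thank you very much

Here is the structure of my dataset: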

Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   5500 obs. of  77 variables:
 $ Name...1                                                     : chr  "Code" "36164" "36165" "36166" ...
 $ VSTOXX VOLATILITY INDEX - PRICE INDEX                        : chr  "VSTOXXI(PI)" "18.2" "29.69" "25.17" ...
 $ ...3                                                         : logi  NA NA NA NA NA NA ...
 $ ...4                                                         : logi  NA NA NA NA NA NA ...
 $ ...5                                                         : logi  NA NA NA NA NA NA ...
 $ ...6                                                         : logi  NA NA NA NA NA NA ...
 $ Name...7                                                     : chr  "Code" "36799" "36830" "36860" ...
 $ EM COMPOSITE INDICATOR OF SOVEREIGN STRESS: GDP WEIGHTS NADJ : chr  "EMEBSCGWR" "7.8255999999999992E-2" "8.9886999999999995E-2" "8.0714999999999995E-2" ...
 $ ...9                                                         : logi  NA NA NA NA NA NA ...
 $ Name...10                                                    : chr  "Code" "36168" "36175" "36182" ...
 $ CISS BOND MKT: GOV & NFC VOLATILITY - ECONOMIC SERIES        : chr  "EMCIBMG" "4.4651999999999997E-2" "6.6535999999999998E-2" "4.9789E-2" ...
 $ ...12                                                        : logi  NA NA NA NA NA NA ...
 $ Name...13                                                    : chr  "Code" "36168" "36175" "36182" ...
 $ CISS MONEY MKT: 3M RATE+ VOLATILITY - ECONOMIC SERIES        : chr  "EMECM3E" "5.7435999999999994E-2" "7.463199999999999E-2" "7.2263999999999995E-2" ...
 $ CISS FX MKT: EUR VOLATILITY - ECONOMIC SERIES                : chr  "EMECFEM" "7.2139999999999996E-2" "8.6049E-2" "4.5948999999999997E-2" ...
 $ CISS FIN INTERM: BANK+ VOLATILITY - ECONOMIC SERIES          : chr  "EMCIFIN" "4.5384999999999995E-2" "0.11820399999999999" "0.11516499999999999" ...
 $ CISS NF EQUITY: VOLATILITY - ECONOMIC SERIES                 : chr  "EMCIEMN" "7.7453999999999995E-2" "0.12733" "0.11918899999999999" ...
 $ CISS: CROSS SUBINDEXCORRELATION - ECONOMIC SERIES            : chr  "EMCICRO" "-0.21210999999999999" "-0.29791000000000001" "-0.2369" ...
 $ SYSTEMIC STRESS COMPINDICATOR - ECONOMIC SERIES              : chr  "EMCISSI" "8.4954000000000002E-2" "0.174844" "0.16546" ...
 $ ...20                                                        : logi  NA NA NA NA NA NA ...
 $ ...21                                                        : logi  NA NA NA NA NA NA ...
 $ ...22                                                        : logi  NA NA NA NA NA NA ...
 $ ...23                                                        : logi  NA NA NA NA NA NA ...
 $ ...24                                                        : logi  NA NA NA NA NA NA ...
 $ ...25                                                        : logi  NA NA NA NA NA NA ...
 $ Name...26                                                    : chr  "Code" "33253" "33284" "33312" ...
 $ Z8 IPI: MFG., VOLUME INDEX OF PRODUCTION, 2015=100 (WDA) VOLA: chr  "Z8ES493KG" "81" "79.7" "79.400000000000006" ...
 $ ...28                                                        : logi  NA NA NA NA NA NA ...
 $ ...29                                                        : logi  NA NA NA NA NA NA ...
 $ ...30                                                        : logi  NA NA NA NA NA NA ...
 $ ...31                                                        : logi  NA NA NA NA NA NA ...
 $ ...32                                                        : logi  NA NA NA NA NA NA ...
 $ ...33                                                        : logi  NA NA NA NA NA NA ...
 $ ...34                                                        : logi  NA NA NA NA NA NA ...
 $ Name...35                                                    : chr  "Code" "35779" "35810" "35841" ...
 $ EH HICP: ALL-ITEMS NADJ                                      : chr  "EHES795WR" "1.7" "1.6" "1.6" ...
 $ ...37                                                        : logi  NA NA NA NA NA NA ...
 $ ...38                                                        : logi  NA NA NA NA NA NA ...
 $ Name...39                                                    : chr  "Code" "35110" "35139" "35170" ...
 $ EH HICP: ALL-ITEMS (%MOM) NADJ                               : chr  "EHESPQ93R" "0.4" "0.4" "0.3" ...
 $ ...41                                                        : logi  NA NA NA NA NA NA ...
 $ ...42                                                        : logi  NA NA NA NA NA NA ...
 $ ...43                                                        : logi  NA NA NA NA NA NA ...
 $ Name...44                                                    : chr  "Code" "35445" "35476" "35504" ...
 $ EH HICP: ALL-ITEMS HICP (%YOY) NADJ                          : chr  "EHESAKZER" "2.2000000000000002" "2" "1.7" ...
 $ ...46                                                        : logi  NA NA NA NA NA NA ...
 $ ...47                                                        : logi  NA NA NA NA NA NA ...
 $ ...48                                                        : logi  NA NA NA NA NA NA ...
 $ ...49                                                        : logi  NA NA NA NA NA NA ...
 $ Name...50                                                    : chr  "Code" "36206" "36234" "36265" ...
 $ EM EUROSYSTEM: BASE MONEY CURN                               : chr  "EMEBSMYBA" "426.64374199999997" "430.51499999999999" "432.34064499999999" ...
 $ ...52                                                        : logi  NA NA NA NA NA NA ...
 $ ...53                                                        : logi  NA NA NA NA NA NA ...
 $ ...54                                                        : logi  NA NA NA NA NA NA ...
 $ ...55                                                        : logi  NA NA NA NA NA NA ...
 $ Name...56                                                    : chr  "Code" "35703" "35734" "35762" ...
 $ EM EUROSYSTEM: TOTAL ASSETS/LIABILITIES (EP) CURN            : chr  "EMECBSALA" "710257.53500000003" "711193.47100000002" "714957.58900000004" ...
 $ ...58                                                        : logi  NA NA NA NA NA NA ...
 $ ...59                                                        : logi  NA NA NA NA NA NA ...
 $ ...60                                                        : logi  NA NA NA NA NA NA ...
 $ ...61                                                        : logi  NA NA NA NA NA NA ...
 $ ...62                                                        : logi  NA NA NA NA NA NA ...
 $ ...63                                                        : logi  NA NA NA NA NA NA ...
 $ Name...64                                                    : chr  "Code" "41548" "41579" "41609" ...
 $ TR EU FWD INFL-LKD SWAP 10YF20Y - MIDDLE RATE                : chr  "TREFSTT" NA NA NA ...
 $ TR EU FWD INFL-LKD SWAP 10YF10Y - MIDDLE RATE                : chr  "TREFS1T" NA NA NA ...
 $ TR EU FWD INFL-LKD SWAP 2YF2Y - MIDDLE RATE                  : chr  "TREFS22" "1.5158" "1.4669000000000001" "1.4715" ...
 $ TR EU FWD INFL-LKD SWAP 1YF1Y - MIDDLE RATE                  : chr  "TREFS11" "1.4509000000000001" "1.2338" "1.1225000000000001" ...
 $ TR EU FWD INFL-LKD SWAP 2YF3Y - MIDDLE RATE                  : chr  "TREFS23" "1.5906000000000002" "1.5453000000000001" "1.5283000000000002" ...
 $ TR EU FWD INFL-LKD SWAP 5YF10Y - MIDDLE RATE                 : chr  "TREFS5T" "2.3516000000000004" "2.3323" "2.3070000000000004" ...
 $ ...71                                                        : logi  NA NA NA NA NA NA ...
 $ ...72                                                        : logi  NA NA NA NA NA NA ...
 $ ...73                                                        : logi  NA NA NA NA NA NA ...
 $ ...74                                                        : logi  NA NA NA NA NA NA ...
 $ ...75                                                        : logi  NA NA NA NA NA NA ...
 $ Name...76                                                    : chr  "Code" "41255" "41286" "41317" ...
 $ TR EU FWD INFL-LKD SWAP 5YF5Y - MIDDLE RATE                  : chr  "TREFS55" "2.2027000000000001" "2.2637" "2.383" ...

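Edit: Although this will most likely solve the error you are getting, I agree with Edward's suggestion that using readxl::read_excel should preserve the dates.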

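The problem with

  as.Date(as.numeric(dataset[,c(1,3,5,7,14,16,18,20,21,23,25,32)]), origin = "1899-12-30")

is that you are applying as.numeric to a tibble, which is internally a list. Instead, apply the conversion to each column: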

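# Convert each selected column in turn: text -> numeric -> Date.
# The lambda replaces dplyr::funs(), which is deprecated and would apply
# as.numeric and as.Date separately instead of chaining them.
dataset <- dplyr::mutate_at(
    dataset,
    c(1,3,5,7,14,16,18,20,21,23,25,32),
    ~ as.Date(as.numeric(.x), origin = "1899-12-30")
)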

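You say the columns have different lengths, but that is not possible in R's table-like structures (tibble, data.frame, data.table). Lesson: always pay attention to the data types you are working with, e.g. with str(dataset). as.numeric does not work on whole tables; it has to be applied to specific columns, e.g. using mutate.

You have to specify col_types correctly in the read_excel (or read_xlsx) command. For example: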
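The point that a table is internally a list can be checked on a small made-up data frame. This is only a sketch in base R: the column names `date_a` and `value`, the helper `excel_to_date`, and the two rows of values are assumptions for illustration, not the real dataset.

```r
# Toy data frame standing in for the imported sheet (illustrative values only).
toy <- data.frame(
  date_a = c("36164", "36165"),
  value  = c("18.2", "29.69"),
  stringsAsFactors = FALSE
)

# A data frame (like a tibble) is a list of columns.
is.list(toy)

# Coercing several columns at once fails; capture the error instead of stopping.
whole_table <- tryCatch(
  as.numeric(toy[, c("date_a", "value")]),
  error = function(e) e
)

# Hypothetical helper: Excel serial number (stored as text) -> Date.
excel_to_date <- function(x) as.Date(as.numeric(x), origin = "1899-12-30")

# Converting column by column works.
date_cols <- "date_a"  # assumed set of date columns
toy[date_cols] <- lapply(toy[date_cols], excel_to_date)
toy$value <- as.numeric(toy$value)

str(toy)
```

With the real data, `date_cols` could hold the positions of the twelve date columns instead of a name, since `lapply` visits each selected column separately.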

dataset <- read_xlsx("Data.xlsx",
     col_types=c("numeric","date","numeric","date","numeric", "date", ...))
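Typing out 77 entries by hand is error-prone, so the col_types vector could also be built from column positions. The sketch below assumes a six-column sheet with made-up positions (`date_cols`, `empty_cols`); the real positions have to be read off the file. "skip" is one of the col_types values readxl accepts and drops a column on import.

```r
# Hypothetical layout: 6 columns, dates in 1 and 4, empty columns in 3 and 6.
n_cols     <- 6
date_cols  <- c(1, 4)
empty_cols <- c(3, 6)

# Start with everything numeric, then overwrite the special positions.
col_types <- rep("numeric", n_cols)
col_types[date_cols]  <- "date"
col_types[empty_cols] <- "skip"

print(col_types)

# The vector would then be passed to readxl (not run here, needs the file):
# dataset <- readxl::read_xlsx("Data.xlsx", col_types = col_types)
```

The vector must have exactly one entry per column in the sheet, including the columns marked "skip".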

Edit: In the end, after a lot of back-and-forth, the problem is that your data starts on row 3, not row 2. So skip the first row (skip = 1) and try again.

dataset <- read_xlsx("Data.xlsx", skip=1)