Iterate through multiple data frames (CSV), add a new column, and assign a value derived from the original data frame names in R
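I have a set of files with .xyz and .csv extensions. I am trying to iterate over the .xyz files (e.g. apalachicola_mile_76.xyz) and the .csv files with a function, split_func, that splits each file name, extracts the unique value in it (76 in the example), and copies that value into a new column. However, the loop reports: "There were 30 warnings (use warnings() to see them)".
The function only works when I process one data frame at a time.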
# Load needed packages
library(tibble)
library(plyr)
library(readr)
filez <- list.files('.', full.names = TRUE, pattern = '\\.xyz$')
# Create a function to assign values
split_func <- function(mylist, df) {
  split_first <- unlist(strsplit(mylist, '_mile_'))[2]  # split the file name, keep the part after '_mile_'
  split_sec <- unlist(strsplit(split_first, '\\.'))[1]  # drop the extension
  conv_num <- as.numeric(split_sec)                     # turn the selected value into a number
  add_column(df, RM = conv_num)                         # create the new column and fill it with the number
}
# Iterate the function over each csv file
# First list the exported data
filez_df <- list.files('.', full.names = TRUE, pattern = '\\.csv$')
# Apply split_func to all files
for (i in 1:length(filez_df)) {  # iterate over the files
  df_holder <- vector(mode = 'list', length = length(filez_df))  # create an empty list
  df_holder[i] <- split_func(filez[i], read.csv(filez_df[i]))    # apply the function
}
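A likely source of the warnings, assuming they read "number of items to replace is not a multiple of replacement length": `df_holder[i] <- ...` with single brackets tries to fit a multi-column data frame into one list slot, and the list itself is re-created on every pass through the loop. The sketch below shows the same loop with the list allocated once and `[[i]]` used for assignment. The names `xyz_names` and `csv_data` are hypothetical in-memory stand-ins for the real files, so the example runs on its own, and `split_func` is rewritten with base R only.

```r
# Hypothetical stand-ins for the .xyz file names and the data read from the .csv files
xyz_names <- c("apalachicola_mile_76.xyz", "apalachicola_mile_77.xyz")
csv_data <- list(
  data.frame(x = 1:2, y = 3:4),
  data.frame(x = 5:6, y = 7:8)
)

# Same idea as split_func, written with base R only
split_func <- function(file_name, df) {
  split_first <- unlist(strsplit(file_name, "_mile_"))[2]  # "76.xyz"
  split_sec <- unlist(strsplit(split_first, "\\."))[1]     # "76"
  df$RM <- as.numeric(split_sec)
  df
}

# Allocate the list once, before the loop
df_holder <- vector(mode = "list", length = length(csv_data))
for (i in seq_along(csv_data)) {
  # [[i]] stores the whole data frame in slot i
  df_holder[[i]] <- split_func(xyz_names[i], csv_data[[i]])
}

str(df_holder)
```

With real files, `csv_data[[i]]` would be replaced by `read.csv(filez_df[i])` and `xyz_names` by `filez`.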
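# Get paths to all .csv and .xyz files in the working directory.
# list.files() returns names sorted alphabetically, so csvs[i] and xyzs[i]
# are paired only if both sets of files sort in the same order.
csvs <- list.files(pattern = "\\.csv$")
xyzs <- list.files(pattern = "\\.xyz$")
# Empty list to hold the result of each iteration
all_files <- list()
for (i in seq_along(csvs)) {
  temp <- read.csv(csvs[i])
  # Capture the digits between the last underscore and the .xyz extension
  mile_num <- sub(pattern = ".*_(\\d+)\\.xyz$", replacement = "\\1", x = xyzs[i])
  temp$mile <- as.numeric(mile_num)
  all_files[[i]] <- temp
}
# Combine the list of data frames into a single data frame
do.call(rbind, all_files)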
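If each .csv file carries the mile number in its own name, the second file listing and the index-based pairing can be avoided. The following is only a sketch under that assumption: the file names `apalachicola_mile_76.csv` and `apalachicola_mile_77.csv`, the helper `read_with_mile`, and the sample data are all hypothetical, and the files are written to a temporary directory so the example is self-contained.

```r
# Write two small sample files into a temporary directory (hypothetical data)
demo_dir <- file.path(tempdir(), "mile_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(x = 1:2), file.path(demo_dir, "apalachicola_mile_76.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:4), file.path(demo_dir, "apalachicola_mile_77.csv"), row.names = FALSE)

csvs <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Read one file and tag its rows with the mile number taken from its own name
read_with_mile <- function(path) {
  df <- read.csv(path)
  df$mile <- as.numeric(sub(".*_mile_(\\d+)\\.csv$", "\\1", basename(path)))
  df
}

# Apply the helper to every file and stack the results
combined <- do.call(rbind, lapply(csvs, read_with_mile))
print(combined)
```

Because each value comes from the name of the file being read, the result does not depend on two separate listings sorting in the same order.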