How do I compare two datasets element by element in R?

Question

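I need to grade the multiple-choice (A, B, C, D) test results of 50 different students.

I have a one-dimensional dataset holding the answer key, "answers", which I read in with answers <- read.table("A1_Ans_only.txt", header = FALSE, sep = ",")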

View(answers)

My dataset "results" contains all of the answers from all 50 students. I read it in with results <- read.csv("Form A1_only.csv", header = FALSE)

View(results)

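When I try something like results==answers or evaluate(results, answers), where evaluate is a function I wrote and defined as 'evaluate <- function(x,y){x==y}', I get various errors, such as "not equal-length data frames" or, when I subset each one down to one dimension, a complaint that the factor level vectors are not the same.

Can anyone help me evaluate each element of the results data frame to determine which questions each student answered correctly?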

This is a small sample of results: 


structure(list(V1 = c(1L, 3L, 5L), V2 = c(NA, NA, NA), V3 = structure(c(2L, 
1L, 4L), .Label = c("A", "B", "C", "D"), class = "factor"), V4 = structure(c(1L, 
1L, 1L), .Label = c("A", "B", "C", "D"), class = "factor"), V5 = structure(c(2L, 
2L, 3L), .Label = c("A", "B", "C", "D"), class = "factor"), V6 = structure(c(1L, 
1L, 1L), .Label = c("A", "B", "C"), class = "factor"), V7 = structure(c(1L, 
1L, 1L), .Label = c("A", "C", "D"), class = "factor"), V8 = structure(c(2L, 
1L, 2L), .Label = c("A", "B", "D"), class = "factor"), V9 = structure(c(1L, 
1L, 1L), .Label = c("A", "C", "D"), class = "factor"), V10 = structure(c(2L, 
2L, 1L), .Label = c("A", "B", "C"), class = "factor")), .Names = c("V1", 
"V2", "V3", "V4", "V5", "V6", "V7", "V8", "V9", "V10"), row.names = c(NA, 
3L), class = "data.frame")


This is the sample from answers: 

structure(list(V1 = structure(1L, .Label = "AAAAKEY", class = "factor"), 
V2 = NA, V3 = structure(1L, .Label = "C", class = "factor"), 
V4 = structure(1L, .Label = "A", class = "factor"), V5 = structure(1L, .Label = "C", class = "factor"), 
V6 = structure(1L, .Label = "A", class = "factor"), V7 = structure(1L, .Label = "A", class = "factor"), 
V8 = structure(1L, .Label = "B", class = "factor"), V9 = structure(1L, .Label = "A", class = "factor"), 
V10 = structure(1L, .Label = "B", class = "factor")), .Names = c("V1", 
"V2", "V3", "V4", "V5", "V6", "V7", "V8", "V9", "V10"), class = "data.frame", row.names = c(NA, 
-1L))

Answer 1

We can do the comparison after replicating 'answers' so that the lengths are equal:

results==answers[col(results)]
#     V1 V2    V3   V4    V5   V6   V7    V8   V9   V10
#1 FALSE NA FALSE TRUE FALSE TRUE TRUE  TRUE TRUE  TRUE
#2 FALSE NA FALSE TRUE FALSE TRUE TRUE FALSE TRUE  TRUE
#3 FALSE NA FALSE TRUE  TRUE TRUE TRUE  TRUE TRUE FALSE
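
To see why the replication step works, it may help to look at what col() returns. The following is only a sketch on made-up toy data (toy_results and toy_answers are illustrative names, not objects from the question), assuming the key is a one-row data frame with the same columns as the responses:

```r
# Toy data, assumed for illustration only: 3 students, 3 questions.
toy_results <- data.frame(V1 = c("B", "A", "D"),
                          V2 = c("A", "A", "A"),
                          V3 = c("B", "B", "C"),
                          stringsAsFactors = TRUE)
toy_answers <- data.frame(V1 = "C", V2 = "A", V3 = "C",
                          stringsAsFactors = TRUE)

# col() gives, for every cell, the index of the column that cell sits in.
idx <- col(toy_results)
idx

# Indexing the one-row key with that matrix repeats each answer once per
# student, in column-major order, as a plain character vector.
key_long <- toy_answers[idx]
key_long

# The comparison is then done cell by cell and yields a logical matrix.
correct <- toy_results == key_long
correct
```

Because the key is turned into a character vector before the comparison, the differing factor levels in the individual columns no longer get in the way.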

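The NA in the V2 column of 'answers' results in NA output, because any equality comparison with NA returns NA. If we need it to be FALSE instead, either change the NA values to FALSE afterwards, or combine the comparison with !is.na(answers)[col(results)] using &.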
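
Both ways of turning the NA cells into FALSE can be sketched as follows. This again uses made-up toy data (toy_results, toy_answers and the intermediate names are hypothetical), with an empty second column standing in for V2; the per-student total at the end is an optional extra that is not part of the original answer:

```r
# Toy data, assumed for illustration: the second column is empty (NA) in both
# the responses and the key, like V2 in the question.
toy_results <- data.frame(V1 = c("B", "A", "D"),
                          V2 = c(NA, NA, NA),
                          V3 = c("B", "B", "C"),
                          stringsAsFactors = TRUE)
toy_answers <- data.frame(V1 = "A", V2 = NA, V3 = "C",
                          stringsAsFactors = TRUE)

idx <- col(toy_results)
raw <- toy_results == toy_answers[idx]   # NA wherever the key is NA

# Option 1: AND with a "key is present" mask; NA & FALSE evaluates to FALSE.
masked <- raw & !is.na(toy_answers)[idx]

# Option 2: overwrite the NA cells after the comparison.
patched <- raw
patched[is.na(patched)] <- FALSE

# Number of correct answers per student (one value per row).
score <- rowSums(masked)
score
```

Note that Option 1 only clears NAs that come from the key; a missing response under a non-missing key would still show up as NA, whereas Option 2 sets every NA cell to FALSE.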
