How to count the number of common character values in R
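I'm not sure how to ask this question, so I made the title as close as I could. If you can find a better phrase, please fix it.
I have a matrix where column 1 is the brand name, say "A", "B", and "C" (there are 1212 in total), and column 2 is a kind of code. Each code is 4 digits long and appears only once per brand, but a given code does not have to be present for every brand.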
> data3
[,1] [,2]
[1,] "A" "A012"
[2,] "A" "A001"
[3,] "A" "A123"
[4,] "A" "A005"
[5,] "A" "A004"
[6,] "A" "A100"
[7,] "A" "A023"
[8,] "A" "A055"
[9,] "A" "A044"
[10,] "A" "A101"
[11,] "B" "A012"
[12,] "B" "A123"
[13,] "B" "A005"
[14,] "B" "A055"
[15,] "B" "A044"
[16,] "B" "A101"
[17,] "C" "A032"
[18,] "C" "A001"
[19,] "C" "A323"
[20,] "C" "A003"
[21,] "C" "A011"
[22,] "C" "A111"
[23,] "C" "A013"
[24,] "C" "A015"
[25,] "C" "A014"
[26,] "C" "A009"
[27,] "C" "A011"
[28,] "C" "A073"
[29,] "C" "A063"
[30,] "C" "A030"
[31,] "C" "A028"
[32,] "C" "A007"
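How many codes do A and B have in common? The same question applies to "A" and "C". This is a simple example, so I can count by hand, but the real data gets messy and I need to figure out how to count them in R.
My final goal is to calculate a number such as sim(A,C).
I initially thought of converting column 2 to numbers by removing the "A", for example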
[1,] "A" 12
[2,] "A" 1
[3,] "A" 123
[4,] "A" 5
[5,] "A" 4
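and then using | and other logical operators, but I could not convert the characters to numbers.
For this, you can combine the %in% operator with sum(). But first, you need to transform your data a bit.
Convert the matrix to a data frame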
# Convert to data frame.
# Colnames will be set to "V1" (your column 1) and "V2" (your column 2) automatically
d <- as.data.frame(data3)
# Filter data and extract variable
col2A <- d[d$V1 == "A", 2]
col2B <- d[d$V1 == "B", 2]
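The question mentions getting stuck on converting the codes to numbers. That conversion is not required for the %in% approach, but it can be done. The following is a minimal sketch, assuming every code is the letter "A" followed by digits; the names codes and code_num are made up for illustration. One likely reason the conversion appeared to fail is that a character matrix cannot hold numbers next to strings, so the result has to live in its own vector or in a data frame column.

```r
# First five codes of brand "A" from the example in the question.
codes <- c("A012", "A001", "A123", "A005", "A004")

# Remove the leading "A", then convert the remaining digits to integers.
# Leading zeros are dropped by the conversion ("012" becomes 12).
code_num <- as.integer(sub("^A", "", codes))

print(code_num)
```

Writing code_num back into a character matrix would coerce it to character again, which is why a separate vector is used here.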
Get the number of common values
# check element-wise if value from col2A is in col2B
col2A %in% col2B
# Output: [1]  TRUE FALSE  TRUE  TRUE FALSE (...)
# get number of common values
sum(col2A %in% col2B)
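With 1212 brands, repeating the filter for every pair by hand gets tedious. The following is a rough sketch of one way to get all pairwise counts at once, using a brand-by-code presence table and tcrossprod(). It is an alternative to the %in% approach, not part of it, and the names incidence and common are made up for illustration. It rebuilds the example matrix from the question so that it runs on its own.

```r
# Rebuild the example matrix (brand in column 1, code in column 2).
data3 <- cbind(
  rep(c("A", "B", "C"), times = c(10, 6, 16)),
  c("A012", "A001", "A123", "A005", "A004", "A100", "A023", "A055", "A044", "A101",
    "A012", "A123", "A005", "A055", "A044", "A101",
    "A032", "A001", "A323", "A003", "A011", "A111", "A013", "A015",
    "A014", "A009", "A011", "A073", "A063", "A030", "A028", "A007")
)

# Brand-by-code presence matrix: 1 if the brand has the code, 0 otherwise.
# Using "> 0" means a code repeated within one brand (A011 for brand C)
# is counted only once.
incidence <- 1 * (unclass(table(data3[, 1], data3[, 2])) > 0)

# Multiply the presence matrix by its transpose: entry [i, j] is the number
# of codes shared by brands i and j; the diagonal holds the number of
# distinct codes per brand.
common <- tcrossprod(incidence)

print(common)
# common["A", "B"] is 6 and common["A", "C"] is 1 for this example.
```

Note that sum(x %in% y) counts every matching element of x, so a code repeated within a brand would be counted more than once. For a single pair, length(intersect(x, y)) avoids that because intersect() drops duplicates.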