Trying to find a way to use adist() for words instead of characters in R
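I would like the adist function to work on words the same way it works on characters. That is, I want deletion/substitution/insertion to apply to whole words rather than to single characters. For example, I want the edit distance between "Alert 12 went off at 3am" and "Alert 17 was heard at 3am" to be 3, because getting from one string to the other takes three word substitutions. Thanks.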
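I think you can try the code below to count the words that differ. Note that it counts the words of s1 that have no match in s2 (a multiset difference), so it ignores word order and is not a true edit distance.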
library(vecsets)
# split both strings on spaces, then count the words of s1 left unmatched in s2
d <- length(vsetdiff(unlist(strsplit(s1, " ")), unlist(strsplit(s2, " "))))
which gives
> d
[1] 3
Data
s1 <- "Alert 12 went off at 3am"
s2 <- "Alert 17 was heard at 3am"
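If word order matters, one option is to run the usual Levenshtein dynamic programme over word tokens instead of characters. The following is a minimal sketch in base R, not something from the original answer: the name word_adist is hypothetical, and it assumes words are separated by spaces and compared case-sensitively.

```r
# Word-level Levenshtein distance (hypothetical helper, base R only).
# Insertions, deletions and substitutions each cost 1 and act on whole words.
word_adist <- function(a, b) {
  # Assumption: words are separated by one or more spaces.
  wa <- strsplit(trimws(a), " +")[[1]]
  wb <- strsplit(trimws(b), " +")[[1]]
  n <- length(wa)
  m <- length(wb)

  # d[i + 1, j + 1] holds the distance between the first i words of a
  # and the first j words of b.
  d <- matrix(0L, nrow = n + 1L, ncol = m + 1L)
  d[, 1L] <- 0:n
  d[1L, ] <- 0:m

  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      cost <- if (wa[i] == wb[j]) 0L else 1L
      d[i + 1L, j + 1L] <- min(
        d[i, j + 1L] + 1L,  # delete a word from a
        d[i + 1L, j] + 1L,  # insert a word from b
        d[i, j] + cost      # keep the word or substitute it
      )
    }
  }
  d[n + 1L, m + 1L]
}

s1 <- "Alert 12 went off at 3am"
s2 <- "Alert 17 was heard at 3am"
word_adist(s1, s2)
```

For the two example strings this returns 3, matching the three substitutions described in the question. Unlike the set-difference count, it is sensitive to order: word_adist("a b", "b a") is 2.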