Treat two strings as equal in a CASE expression if they differ by only one character
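I have two columns to compare. If the strings are identical, or differ by at most one character (one different character or one missing character), I want to set a flag. For example:
select
name1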
,name2
,case when "name1 is like name2 except only 1 different character, or
lack of 1 character compared to the other" then 1
else 0
end same_flag
from example
Sample output:
name1 - name2 - same_flag
john - jon - 1
sara - sarah - 1
filip - filis - 1
phillip - philis - 0
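I want it to work in both directions: in one row name1 may be the variant of name2, and in another row name2 may be the variant of name1.
Try this and adapt it to compute the lengths of both strings and compare them.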
How to find count and names of distinct characters in string in PL/SQL
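Not actually my answer, but it provides a basis for computing the lengths and comparing the numeric difference.
This is very linear: it just walks through the letters and counts the differences.
I have updated this: Richard and Rchard will now be considered the same person...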
CREATE OR REPLACE FUNCTION compare_strings
  (P_string1 IN VARCHAR2
  ,P_string2 IN VARCHAR2)
RETURN NUMBER
IS
  l_long_string  VARCHAR2(100) ;
  l_short_string VARCHAR2(100) ;
  l_same_length  BOOLEAN ;
  l_diff_count   NUMBER := 0 ;
  j NUMBER := 1 ; --position in the long string
  k NUMBER := 1 ; --position in the short string
BEGIN
  --NVL guards against NULL (empty) inputs, whose LENGTH is NULL
  IF NVL(LENGTH(P_string1),0) >= NVL(LENGTH(P_string2),0) THEN
    l_long_string := P_string1 ;
    l_short_string := P_string2 ;
  ELSE
    l_long_string := P_string2 ;
    l_short_string := P_string1 ;
  END IF ;
  --if one string is more than one char longer than the other then we must
  --have a difference, so return 0 (not the same) straight away
  IF NVL(LENGTH(l_long_string),0) - NVL(LENGTH(l_short_string),0) > 1 THEN
    RETURN 0 ;
  END IF ;
  l_same_length := NVL(LENGTH(l_long_string),0) = NVL(LENGTH(l_short_string),0) ;
  FOR i IN 1..NVL(LENGTH(l_long_string),0) LOOP
    IF NVL(SUBSTR(l_long_string,j,1),'##') != NVL(SUBSTR(l_short_string,k,1),'##') THEN
      l_diff_count := l_diff_count + 1 ;
      --shift along one letter in the long string; stay put in the short string
      --unless both have the same length (a substituted, not a missing, letter)
      j := j + 1 ;
      IF l_same_length THEN
        k := k + 1 ;
      END IF ;
    ELSE
      --shift along on both strings
      j := j + 1 ;
      k := k + 1 ;
    END IF ;
    EXIT WHEN l_diff_count > 1 ;
  END LOOP ;
  --1 = same (at most one difference), 0 = different
  RETURN CASE WHEN l_diff_count > 1 THEN 0 ELSE 1 END ;
END compare_strings ;
/
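As a rough usage sketch, the function can be called from plain SQL once it is compiled. This assumes compare_strings exists as a standalone function (if it lives in a package, prefix the call with the package name) and that it returns 1 for a match and 0 otherwise. The sample rows are the ones from the question.

```sql
-- Assumption: compare_strings is compiled as a standalone function
-- and returns 1 when the two names differ by at most one character.
with data (name1, name2) as (
  select 'john',    'jon'    from dual union all
  select 'sara',    'sarah'  from dual union all
  select 'filip',   'filis'  from dual union all
  select 'phillip', 'philis' from dual
)
select name1,
       name2,
       compare_strings(name1, name2) as same_flag
from data;
```

Because the function first works out which argument is the longer one, the order in which name1 and name2 are passed should not matter.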
You can pick one of the functions from the utl_match package:
with data (name1, name2) as (
select 'john', 'jon' from dual union all
select 'sara', 'sarah' from dual union all
select 'filip', 'filis' from dual union all
select 'phillip', 'philis' from dual
)
select name1, name2,
utl_match.edit_distance(name1, name2) as ed,
utl_match.edit_distance_similarity(name1, name2) as ed_similarity,
utl_match.jaro_winkler(name1, name2) as jw,
utl_match.jaro_winkler_similarity(name1, name2) as jw_similarity
from data;
returns:
NAME1 | NAME2 | ED | ED_SIMILARITY | JW | JW_SIMILARITY
--------+--------+----+---------------+------+--------------
john | jon | 1 | 75 | 0.93 | 93
sara | sarah | 1 | 80 | 0.96 | 96
filip | filis | 1 | 80 | 0.92 | 92
phillip | philis | 2 | 72 | 0.91 | 90
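Depending on your needs and the results you prefer, you could do the following:
case when utl_match.edit_distance(name1, name2) < 2 then 1 else 0 end
Or use the percentage as a threshold (>= 75 so that john/jon, which scores exactly 75, is still flagged):
case when utl_match.edit_distance_similarity(name1, name2) >= 75 then 1 else 0 end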
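Putting the pieces together, here is a minimal sketch of the complete query from the question, using the edit-distance variant. The data CTE is just the sample rows standing in for the real example table, and same_flag is the column alias taken from the question.

```sql
-- Sample rows stand in for the real "example" table.
with data (name1, name2) as (
  select 'john',    'jon'    from dual union all
  select 'sara',    'sarah'  from dual union all
  select 'filip',   'filis'  from dual union all
  select 'phillip', 'philis' from dual
)
select name1,
       name2,
       -- an edit distance of 0 or 1 means identical, or one character
       -- changed, added or missing
       case when utl_match.edit_distance(name1, name2) <= 1 then 1 else 0 end as same_flag
from data;
```

With the edit distances listed above (1, 1, 1 and 2), this flags the first three rows with 1 and phillip/philis with 0. Edit distance is symmetric, so swapping name1 and name2 gives the same flag, which covers the requirement that the comparison work in both directions.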