SQL Server: Remove words having specific length
I need to remove words of a specific length from a column. In my case, I need to remove words whose length is <= 2. I have 1 million records in total.
+---------------------------+------------------------+
| Input | Output |
+---------------------------+------------------------+
| Steel rod 10 X 15 MM | Steel rod |
+---------------------------+------------------------+
| Syringe SM B 2.5 ML | Syringe 2.5 |
+---------------------------+------------------------+
| 3.5 ML Syringe Disposable | 3.5 Syringe Disposable |
+---------------------------+------------------------+
| Syringe Disposable 2.5 ML | Syringe Disposable 2.5 |
+---------------------------+------------------------+
I don't need the numbers in the Input description column either; I already have a function that removes digits. Please advise.
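SQL Server has poor string-processing capabilities. This is probably something you want to do when loading the data into the database.
One solution within the database is a recursive CTE: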
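with cte as (
      -- anchor: start with the whole string (plus a trailing space) still to be processed
      select input,
             convert(varchar(max), input + ' ') as rest,
             convert(varchar(max), '') as output,
             1 as lev
      from t
      union all
      -- recursive step: strip the first word from rest and
      -- append it to output only if it is longer than 2 characters
      select input,
             stuff(rest, 1, charindex(' ', rest), ''),
             (case when charindex(' ', rest) <= 3 then output
                   else output + left(rest, charindex(' ', rest))
              end),
             lev + 1
      from cte
      where rest <> ''
     )
-- keep only the last recursion level per input, where rest is fully consumed
select input, output
from (select cte.*, max(lev) over (partition by input) as max_lev
      from cte
     ) cte
where lev = max_lev;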
Here is a db<>fiddle.
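For the load-time route suggested above, a minimal sketch in Python might look like the following. It assumes the descriptions are available as plain strings before they are inserted; the function name `drop_short_words` and the `max_len` parameter are hypothetical, chosen only for illustration. Unlike the recursive CTE, it also collapses repeated whitespace and leaves no trailing space.

```python
def drop_short_words(text: str, max_len: int = 2) -> str:
    """Return text without the whitespace-separated words of length <= max_len."""
    # str.split() with no argument splits on any run of whitespace,
    # so repeated spaces do not produce empty "words".
    kept = [word for word in text.split() if len(word) > max_len]
    return " ".join(kept)


# Sample rows taken from the question's Input column.
samples = [
    "Steel rod 10 X 15 MM",
    "Syringe SM B 2.5 ML",
    "3.5 ML Syringe Disposable",
    "Syringe Disposable 2.5 ML",
]

for sample in samples:
    print(drop_short_words(sample))
```

On the four sample rows this yields the strings in the question's Output column. The cleaned values could then be bulk-loaded into the table instead of being rewritten row by row inside SQL Server.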