Unexpected character counted when finding word occurrences in phpMyAdmin
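I have a table containing sentences of varying length, from one word up to 40 words, and I want to count each word separately along with the number of times it occurs in the table.
However, whenever a sentence contains only a single word, an unexpected empty entry shows up in the output. Any idea why?
Here is a demo of my database.
Here is the code: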
create table messages(sent varchar(200), verif int);
insert into messages values
  ('HI', null),
  ('HI alex how are you', null),
  ('bye', null);
select * from messages;

-- normalise whitespace: strip leading/trailing spaces, collapse runs of spaces
UPDATE messages set sent = TRIM(sent);
UPDATE messages set sent = REGEXP_REPLACE(sent, ' +', ' ');

with recursive cte as (
  -- anchor: split the first word off each sentence
  select
    substring(concat(sent, ' '), 1, locate(' ', sent)) word,
    substring(concat(sent, ' '), locate(' ', sent) + 1) sent
  from messages
  union all
  -- recursive step: keep splitting off the next word while a space remains
  select
    substring(sent, 1, locate(' ', sent)) word,
    substring(sent, locate(' ', sent) + 1) sent
  from cte
  where locate(' ', sent) > 0
)
select row_number() over(order by count(*) desc, word) wid, word, count(*) freq
from cte
group by word
order by wid;
Output of the code:
wid  word  freq
1          2
2    HI    2
3    alex  1
4    are   1
5    bye   1
6    how   1
7    you   1
Expected output:
wid  word  freq
1    HI    2
2    alex  1
3    are   1
4    bye   1
5    how   1
6    you   1
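Your problem is in these lines:
substring(concat(sent, ' '), 1, locate(' ', sent)) word,
substring(concat(sent, ' '), locate(' ', sent) + 1) sent
When sent does not contain a space, locate(' ', sent) returns 0, so the first substring call is asked for 0 characters and returns an empty string. That empty string is what you see counted in your output, once for each single-word sentence. To fix this, search concat(sent, ' ') instead of sent:
substring(concat(sent, ' '), 1, locate(' ', concat(sent, ' '))) word,
substring(concat(sent, ' '), locate(' ', concat(sent, ' ')) + 1) sent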
For your sample data, this gives:
wid  word  freq
1    HI    2
2    alex  1
3    are   1
4    bye   1
5    how   1
6    you   1
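
If it helps to see the two versions side by side, here is a small sketch that replays both on your sample data. It is only an approximation of the setup in the question: it uses Python's built-in sqlite3 module rather than MySQL, so instr, substr and || stand in for locate, substring and concat, the row_number() column is left out, and trim() removes the trailing space that each extracted word carries. The helper name word_counts and its parameter are made up for this illustration.

```python
import sqlite3

SETUP = """
create table messages(sent varchar(200), verif int);
insert into messages values
  ('HI', null),
  ('HI alex how are you', null),
  ('bye', null);
"""

# {first} is the expression that the anchor member searches for the first space.
# Assumption: SQLite's instr/substr behave like MySQL's locate/substring here
# (1-based positions, 0 when the space is not found).
QUERY = """
with recursive cte(word, sent) as (
  select
    substr(sent || ' ', 1, instr({first}, ' ')),
    substr(sent || ' ', instr({first}, ' ') + 1)
  from messages
  union all
  select
    substr(sent, 1, instr(sent, ' ')),
    substr(sent, instr(sent, ' ') + 1)
  from cte
  where instr(sent, ' ') > 0
)
select trim(word) as w, count(*) as freq
from cte
group by w
order by freq desc, w
"""


def word_counts(first_space_in):
    """Run the word-splitting query on a fresh in-memory copy of the sample table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SETUP)
    rows = conn.execute(QUERY.format(first=first_space_in)).fetchall()
    conn.close()
    return rows


# Original version: the space is searched in sent, which has none for 'HI' and 'bye'.
buggy = word_counts("sent")
# Fixed version: the space is searched in sent with a space appended.
fixed = word_counts("sent || ' '")

print(buggy)
print(fixed)
```

With the original expression the first row comes back as an empty word with a count of 2, one for 'HI' and one for 'bye'. With the padded expression that row disappears and only the six real words remain.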