SQL Error [22P02]: varchar cast to integer is not recognised in the WHERE clause
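I have a table TROQ with a field named cod defined as varchar(13) (Postgres 11.8).
When the first four characters of cod are digits, the row is a "special troq". Depending on the number formed by those four digits, a special Troq is "Production" when the code is below 5000 and "Development" when it is above 5000. This is just for the sake of example; in the real problem, special troqs have more classifications, but each of them forms a numeric range as in the example.
So I tried the following query: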
select cod, pref
from
(select cod, substr(cod, 1, 4)::integer pref
from troq
where isnumeric(substr(cod,1, 4))) CT
where pref < 5000
The isnumeric function is implemented as follows:
CREATE OR REPLACE FUNCTION public.isnumeric(text)
RETURNS boolean
LANGUAGE plpgsql
AS $function$
DECLARE x NUMERIC;
BEGIN
x = $1::NUMERIC;
RETURN TRUE;
EXCEPTION WHEN others THEN
RETURN FALSE;
END;
$function$
;
I get the error:
SQL Error [22P02]: ERROR: la sintaxis de entrada no es válida para integer: «INFD»
... which roughly means: invalid input syntax for integer: «INFD»
If I remove the where from the outer query:
select cod, pref
from
(select cod, substr(cod, 1, 4)::int pref
from troq
where isnumeric(substr(cod,1, 4))) CT
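Then it does not throw any error message. It shows all the TROQs whose first four characters are a numeric code.
But I find no way to apply any condition to this preselected field.
What I tried next was all kinds of casts in the where clause, some that made sense and some that did not, with no improvement in the result. For example: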
... where pref::varchar < '5000'
... where pref::numeric < 5000
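In case the problem was related to the cast, I tried to_number and got the same result: the query does not fail if I remove the where clause, and it gives this 22P02 error as soon as I add a condition on the converted field pref: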
select cod, pref
from
(select cod, to_number(substr(cod, 1, 4), '0000') pref
from troq
where isnumeric(substr(cod,1, 4))) CT
where pref < 5000
Any explanation or help on this matter would be appreciated. Thanks in advance.
Script to reproduce the exercise with just four records:
CREATE TABLE public.troq (
cod varchar(13) NOT NULL
);
INSERT INTO public.troq (cod) VALUES
('1234Trala')
,('Tururu')
,('4532Vargas')
,('n4567Titi')
;
(The Troq records that should be returned are 1234Trala and 4532Vargas.)
I just tried doing it a different way and it worked. I don't understand why. The solution that works:
with CT as
(select cod, to_number(substr(cod, 1, 4), '0000') pref
from troq
where isnumeric(substr(cod,1, 4)))
select * from CT
where pref < 5000;
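I suppose that this way I make more sure the optimizer executes the inner query first, but from what I know of SQL, I don't think I should have to ensure that at all.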
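The problem is that PostgreSQL flattens the subquery and optimizes the query so that the condition pref < 5000 is evaluated first, because PostgreSQL thinks that this is cheaper: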
EXPLAIN (COSTS OFF)
select cod, pref
from
(select cod, substr(cod, 1, 4)::integer pref
from troq
where isnumeric(substr(cod,1, 4))) CT
where pref < 5000;
QUERY PLAN
════════════════════════════════════════════════════════════════════════════════════
Seq Scan on troq
Filter: (((substr(cod, 1, 4))::integer < 5000) AND isnumeric(substr(cod, 1, 4)))
(2 rows)
That is documented and expected behavior:
The order of evaluation of subexpressions is not defined.
One trick you can use is to tell PostgreSQL that the function is very cheap (I set it to IMMUTABLE at the same time, because it is):
ALTER FUNCTION isnumeric COST 1 IMMUTABLE;
That changes the execution plan so that your function is evaluated first, which avoids the error:
EXPLAIN (COSTS OFF)
select cod, pref
from
(select cod, substr(cod, 1, 4)::integer pref
from troq
where isnumeric(substr(cod,1, 4))) CT
where pref < 5000;
QUERY PLAN
════════════════════════════════════════════════════════════════════════════════════
Seq Scan on troq
Filter: (isnumeric(substr(cod, 1, 4)) AND ((substr(cod, 1, 4))::integer < 5000))
(2 rows)
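To confirm that the ALTER FUNCTION above took effect, the function's planner-relevant attributes can be read from the pg_proc system catalog. This is a minimal sketch that assumes the function lives in the public schema, as in the question; functions written in plpgsql get a default cost of 100 unless one is set explicitly.

```sql
-- Inspect what the planner knows about the function.
-- procost:     estimated execution cost, in units of cpu_operator_cost
-- provolatile: 'i' = immutable, 's' = stable, 'v' = volatile
SELECT proname, procost, provolatile
FROM pg_proc
WHERE proname = 'isnumeric'
  AND pronamespace = 'public'::regnamespace;
```

After the ALTER FUNCTION statement, procost should read 1 and provolatile should read 'i'.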
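But of course that is not entirely reliable (other expressions could be just as "cheap"), and a better solution is to add an optimizer barrier such as OFFSET 0 to the subquery to prevent it from being flattened: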
EXPLAIN (COSTS OFF)
select cod, pref
from
(select cod, substr(cod, 1, 4)::integer pref
from troq
where isnumeric(substr(cod,1, 4))
OFFSET 0) CT
where pref < 5000;
QUERY PLAN
══════════════════════════════════════════════
Subquery Scan on ct
Filter: (ct.pref < 5000)
-> Seq Scan on troq
Filter: isnumeric(substr(cod, 1, 4))
(4 rows)
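This also explains why the WITH query from the question worked on PostgreSQL 11.8: up to version 11, a CTE is always materialized and therefore acts as an optimizer barrier. From PostgreSQL 12 on, a side-effect-free CTE that is referenced only once may be inlined into the outer query, so the barrier has to be requested explicitly. A sketch of that variant, assuming version 12 or later:

```sql
-- Assumes PostgreSQL 12 or later; AS MATERIALIZED is a syntax error on 11.x,
-- where a plain WITH already behaves this way.
WITH ct AS MATERIALIZED (
    SELECT cod, substr(cod, 1, 4)::integer AS pref
    FROM troq
    WHERE isnumeric(substr(cod, 1, 4))
)
SELECT cod, pref
FROM ct
WHERE pref < 5000;
```

As with OFFSET 0, the filter on pref is applied only to rows that the CTE has already produced, so the cast never sees a non-numeric prefix.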
Another option is to use a regular expression to verify that the first 4 characters are digits, and drop the isnumeric function altogether.
select cod, pref
from
(select cod, to_number(substr(cod, 1, 4), '0000') pref
from troq
where cod ~ '^\d{4}'
) s
where pref < 5000;
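Now, regular expressions tend to be slow. But are they slower than a substring, a function call, a type cast, an assignment, and exception handling? Only testing can tell.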
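Because the order in which WHERE conditions are evaluated is not defined, a query that relies on one condition protecting another stays fragile even with the regular-expression check. One way to make the guard independent of the plan is to put the check and the cast into the same CASE expression, which is the construct PostgreSQL documents for forcing evaluation order. The following is a sketch of that idea, not something taken from the answers above; the character class [0-9] is used instead of \d only to keep the pattern free of backslash escapes.

```sql
-- The cast is only reached for rows whose first four characters are digits;
-- all other rows get pref = NULL and are dropped by the outer filter,
-- because NULL < 5000 is not true.
SELECT cod, pref
FROM (
    SELECT cod,
           CASE WHEN cod ~ '^[0-9]{4}'
                THEN substr(cod, 1, 4)::integer
           END AS pref
    FROM troq
) s
WHERE pref < 5000;
```

With the four sample rows from the question, this should return 1234Trala and 4532Vargas, whether or not the planner flattens the subquery.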