SQL Error [22P02]: varchar converted to integer is not recognised in the where clause

I have a table TROQ, which has a field named cod defined as varchar(13) (Postgres 11.8).

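When the first four characters of cod are digits, the row is a "special troq". Depending on the number formed by those four digits, a special troq is "Production" when the code is below 5000 and "Development" when it is above 5000. This is only for the sake of the example; in the real problem special troqs have more classifications, but each of them forms a numeric range as in the example.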

So I tried the following query:

select cod, pref
from 
(select cod, substr(cod, 1, 4)::integer pref
from troq
where isnumeric(substr(cod,1, 4))) CT
where pref < 5000

The implementation of the isnumeric function:

CREATE OR REPLACE FUNCTION public.isnumeric(text)
 RETURNS boolean
 LANGUAGE plpgsql
AS $function$
DECLARE x NUMERIC;
BEGIN
       x = $1::NUMERIC;
       RETURN TRUE;
EXCEPTION WHEN others THEN
       RETURN FALSE;
END;
$function$
;

I get the error:

    SQL Error [22P02]: ERROR: la sintaxis de entrada no es válida para integer: «INFD» 
... which roughly means: invalid input syntax for integer: «INFD»

If I remove the where from the outer query:

select cod, pref
from 
(select cod, substr(cod, 1, 4)::int pref
from troq
where isnumeric(substr(cod,1, 4))) CT

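Then it doesn't throw any error. It shows all the TROQs whose first four characters are a numeric code. But I find no way to apply any condition on this pre-selected field.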

What I tried next was all sorts of casts in the where clause, some that made sense and some that didn't, with no improvement in the result. For example:

... where pref::varchar < '5000'
... where pref::numeric < 5000

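In case the problem had to do with the cast, I also tried to_number and got the same result: the query runs without error if I remove the where clause, and gives this 22P02 error as soon as I add a condition on the converted field pref: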

select cod, pref
from 
(select cod, to_number(substr(cod, 1, 4), '0000') pref
from troq
where isnumeric(substr(cod,1, 4))) CT
where pref < 5000

Any explanation or help on this matter would be much appreciated. Thanks in advance.

Script to reproduce the exercise with just four records:

CREATE TABLE public.troq (
    cod varchar(13) NOT NULL
);

INSERT INTO public.troq (cod) VALUES 
('1234Trala')
,('Tururu')
,('4532Vargas')
,('n4567Titi')
;

(Troq records that should be recovered would be 1234Trala and 4532Vargas)

I just tried doing it a different way and it worked. I don't understand why. The solution that works:

with CT as
(select cod, to_number(substr(cod, 1, 4), '0000') pref
from troq
where isnumeric(substr(cod,1, 4)))
select * from CT 
where pref < 5000;

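I suppose that this way I make more sure that the optimizer executes the inner query first, but as far as I know SQL, I don't think I should have to ensure that anyway.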

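The problem is that PostgreSQL flattens the subquery and optimizes the query so that the condition pref < 5000 is evaluated first, because PostgreSQL thinks that is cheaper: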

EXPLAIN (COSTS OFF)
select cod, pref
from 
(select cod, substr(cod, 1, 4)::integer pref
from troq
where isnumeric(substr(cod,1, 4))) CT                  
where pref < 5000;
                                     QUERY PLAN
════════════════════════════════════════════════════════════════════════════════════
 Seq Scan on troq
   Filter: (((substr(cod, 1, 4))::integer < 5000) AND isnumeric(substr(cod, 1, 4)))
(2 rows)

This is documented and expected:

The order of evaluation of subexpressions is not defined.
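
The same part of the documentation points to a CASE construct as the way to force evaluation order when it really matters. As a minimal sketch against the sample troq table (the CASE rewrite is my illustration, not something the question tried), the cast can be moved inside the THEN branch so that it only runs for rows that pass isnumeric:

```sql
-- Sketch: the integer cast sits inside the THEN branch, so it is only
-- evaluated for rows where isnumeric(...) returned true.
-- Rows that fail the check get pref = NULL, and NULL < 5000 is not true,
-- so they are filtered out by the outer where clause.
select cod, pref
from
  (select cod,
          case when isnumeric(substr(cod, 1, 4))
               then substr(cod, 1, 4)::integer
          end as pref
   from troq) ct
where pref < 5000;
```

With the four sample rows this should return 1234Trala and 4532Vargas, regardless of how the planner orders the filter conditions.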

One trick you can use is to tell PostgreSQL that the function is very cheap (I'm setting it IMMUTABLE at the same time, because it is):

ALTER FUNCTION isnumeric COST 1 IMMUTABLE;

That changes the execution plan so that your function is evaluated first, avoiding the error:

EXPLAIN (COSTS OFF)
select cod, pref
from 
(select cod, substr(cod, 1, 4)::integer pref
from troq
where isnumeric(substr(cod,1, 4))) CT
where pref < 5000;
                                     QUERY PLAN
════════════════════════════════════════════════════════════════════════════════════
 Seq Scan on troq
   Filter: (isnumeric(substr(cod, 1, 4)) AND ((substr(cod, 1, 4))::integer < 5000))
(2 rows)
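
To see why the planner originally preferred to run the cast first, it can help to look at the cost and volatility recorded for the function. A small sketch using the pg_proc system catalog (it assumes there is only one function named isnumeric on the search path):

```sql
-- procost: estimated execution cost in units of cpu_operator_cost
--          (functions in languages such as plpgsql default to 100)
-- provolatile: 'i' = immutable, 's' = stable, 'v' = volatile
select proname, procost, provolatile
from pg_proc
where proname = 'isnumeric';
```

Before the ALTER FUNCTION this should show the default cost of 100, which makes the function look far more expensive than a plain cast and comparison.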

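But of course this is not entirely reliable (other expressions may also be very "cheap"). A better solution is to add an optimizer barrier such as OFFSET 0 to the subquery to prevent it from being flattened: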

EXPLAIN (COSTS OFF)
select cod, pref
from 
(select cod, substr(cod, 1, 4)::integer pref
from troq
where isnumeric(substr(cod,1, 4))
OFFSET 0) CT
where pref < 5000;
                  QUERY PLAN                  
══════════════════════════════════════════════
 Subquery Scan on ct
   Filter: (ct.pref < 5000)
   ->  Seq Scan on troq
         Filter: isnumeric(substr(cod, 1, 4))
(4 rows)
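
The CTE from the question works for the same reason: in PostgreSQL 11 a WITH query is always computed separately from the outer query. From PostgreSQL 12 on, the planner may fold a CTE into the outer query, so if you rely on it as a barrier it is safer to say so explicitly. A sketch, assuming PostgreSQL 12 or later (this syntax is not available on the 11.8 server from the question):

```sql
-- MATERIALIZED asks the planner to compute the CTE on its own,
-- so the outer condition on pref is applied to its result only.
with ct as materialized
(select cod, substr(cod, 1, 4)::integer pref
 from troq
 where isnumeric(substr(cod, 1, 4)))
select cod, pref
from ct
where pref < 5000;
```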

Another option is to use a regular expression to verify that the first 4 characters are digits, and drop the isnumeric function altogether.

select cod, pref
from 
    (select cod, to_number(substr(cod, 1, 4), '0000') pref
       from troq
      where  cod ~ '^\d{4}' 
    ) s
where pref < 5000;

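Now, regular expressions tend to be slower. But is it slower than a substring, a function call, a type cast, an assignment, and exception handling? Only testing will tell.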
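
One rough way to run that test is to compare the two filters with EXPLAIN (ANALYZE) and look at the reported execution times. This is only a sketch: the four-row sample table is far too small to show a meaningful difference, so it assumes troq has been loaded with a realistic amount of data first.

```sql
-- Filter using the plpgsql helper (substring + function call + cast attempt)
explain (analyze, costs off)
select cod from troq where isnumeric(substr(cod, 1, 4));

-- Filter using the regular expression
explain (analyze, costs off)
select cod from troq where cod ~ '^\d{4}';
```

Run each statement several times and compare the "Execution Time" lines, since the first run may be skewed by caching.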