Iterate through results of SQLite query as input to subsequent query
I have a SQLite table with the following fields, representing metadata extracted from individual files stored on disk. There is one record per file:
__path denotes the full path and filename (in effect the PK)
__dirpath denotes the directory path excluding the filename
__dirname denotes the directory name in which the file is found
refid denotes an attribute of interest, pulled from the underlying file on disk
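- Files are stored grouped by __dirname at the time they are created
- All files in a __dirname should have the same refid, but refid is sometimes absent
- As a starting point, I want to identify every __dirpath that contains nonconforming files.
My query to identify the offending folders is as follows: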
SELECT __dirpath
FROM (
SELECT DISTINCT __dirpath,
__dirname,
refid
FROM source
)
GROUP BY __dirpath
HAVING count( * ) > 1
ORDER BY __dirpath, __dirname;
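Is it possible to iterate through the results of that query and use each result as the input to another query, without having to step outside SQLite and use something like Python? For example, to see the records that belong to the failing set: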
SELECT __dirpath, refid
FROM source
WHERE __dirpath = <nth result from aforementioned query>;
If you want all the offending rows, one option is:
select t.*
from (
select t.*,
min(refid) over(partition by __dirpath, __dirname) as min_refid,
max(refid) over(partition by __dirpath, __dirname) as max_refid
from mytable t
) t
where min_refid <> max_refid
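The logic is to compare the minimum and maximum refid over each group of rows that share the same directory path and directory name. If they differ, the row is offending.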
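To check how the min/max comparison behaves, here is a minimal sketch that runs the query against an in-memory database. Python's sqlite3 module is used only as a harness; the SQL itself is plain SQLite (window functions need SQLite 3.25 or later). The sample paths and refid values are hypothetical, and the query selects only __path instead of t.* to keep the output short.

```python
import sqlite3

# Hypothetical sample data: three folders, two files each.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable ("
    "__path TEXT PRIMARY KEY, __dirpath TEXT, __dirname TEXT, refid TEXT)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        ("/a/x/1.dat", "/a/x", "x", "R1"),
        ("/a/x/2.dat", "/a/x", "x", "R2"),  # conflicting refid
        ("/a/y/1.dat", "/a/y", "y", "R3"),
        ("/a/y/2.dat", "/a/y", "y", None),  # missing refid
        ("/a/z/1.dat", "/a/z", "z", "R4"),
        ("/a/z/2.dat", "/a/z", "z", "R4"),  # consistent folder
    ],
)

minmax_sql = """
select t.__path
from (
    select t.*,
        min(refid) over(partition by __dirpath, __dirname) as min_refid,
        max(refid) over(partition by __dirpath, __dirname) as max_refid
    from mytable t
) t
where min_refid <> max_refid
order by t.__path
"""
flagged_by_minmax = [row[0] for row in conn.execute(minmax_sql)]
print(flagged_by_minmax)
```

With this data only the two files in /a/x are returned. min() and max() ignore NULL, so in /a/y both come out as 'R3' and the folder with a missing refid goes unnoticed.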
We could also use exists, which handles possible null values in refid better:
select t.*
from mytable t
where exists (
select 1
from mytable t1
where
t1.__dirpath = t.__dirpath
and t1.__dirname = t.__dirname
and t1.refid is not t.refid
)
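As a rough sketch of both ideas on the question's own table: the first query below is the exists version run against a table named source (the name is taken from the question; the sample rows are hypothetical), and the second feeds the question's folder query into the follow-up query through IN, which is one way to use each result as input without leaving SQLite. Python's sqlite3 module is only the harness that runs the SQL.

```python
import sqlite3

# Hypothetical sample data: three folders, two files each.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE source ("
    "__path TEXT PRIMARY KEY, __dirpath TEXT, __dirname TEXT, refid TEXT)"
)
conn.executemany(
    "INSERT INTO source VALUES (?, ?, ?, ?)",
    [
        ("/a/x/1.dat", "/a/x", "x", "R1"),
        ("/a/x/2.dat", "/a/x", "x", "R2"),  # conflicting refid
        ("/a/y/1.dat", "/a/y", "y", "R3"),
        ("/a/y/2.dat", "/a/y", "y", None),  # missing refid
        ("/a/z/1.dat", "/a/z", "z", "R4"),
        ("/a/z/2.dat", "/a/z", "z", "R4"),  # consistent folder
    ],
)

# "is not" is a null-safe inequality in SQLite: NULL is not 'R3' is true.
exists_sql = """
select t.__path
from source t
where exists (
    select 1
    from source t1
    where t1.__dirpath = t.__dirpath
      and t1.__dirname = t.__dirname
      and t1.refid is not t.refid
)
order by t.__path
"""
flagged_by_exists = [row[0] for row in conn.execute(exists_sql)]

# The question's folder query used as a subquery instead of a loop.
in_sql = """
SELECT __dirpath, refid
FROM source
WHERE __dirpath IN (
    SELECT __dirpath
    FROM (SELECT DISTINCT __dirpath, __dirname, refid FROM source)
    GROUP BY __dirpath
    HAVING count(*) > 1
)
ORDER BY __dirpath, refid
"""
offending_records = conn.execute(in_sql).fetchall()

print(flagged_by_exists)
print(offending_records)
```

Here both queries pick up /a/x and /a/y, including the row whose refid is missing, and leave /a/z alone.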