Fetch object of specified level-type from data hierarchy (Oracle 12 SQL)
My data set in the table object_type_t looks like this:
OBJ_ID     PARENT_OBJ    OBJECT_TYPE    OBJECT_DESC
---------  ------------  -------------  -----------------------
ES01       <null>        ESTATE         Bucks Estate
BUI01      ES01          BUILDING       Leisure Centre
BUI02      ES01          BUILDING       Fire Station
BUI03      <null>        BUILDING       Housing Block
SQ01       BUI01         ROOM           Squash Court
BTR01      BUI02         ROOM           Bathroom
AP01       BUI03         APARTMENT      Flat No. 1
AP02       BUI03         APARTMENT      Flat No. 2
BTR02      AP01          ROOM           Bathroom
BDR01      AP01          ROOM           Bedroom
BTR03      AP02          ROOM           Bathroom
SHR01      BTR01         OBJECT         Shower
SHR02      BTR02         OBJECT         Shower
SHR03      BTR03         OBJECT         Shower
Seen as a real-world hierarchy, it looks like this:
ES01
|--> BUI01
|    |--> SQ01
|--> BUI02
     |--> BTR01
          |--> SHR01
=======
BUI03
|--> AP01
|    |--> BTR02
|    |    |--> SHR02
|    |--> BDR01
|--> AP02
     |--> BTR03
          |--> SHR03
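I know how to use hierarchical queries such as CONNECT BY PRIOR. I also know how to find the root of the tree with connect_by_root. What I would like to do, however, is find a given "level" of the tree (i.e. not the root level, but the "BUILDING" level of a given object).
So, for example, I would like to be able to query every object in the hierarchy that belongs to BUI01.
And in reverse, given an object ID, I would like to be able to query the (for example) ROOM object_id associated with that object.
Things would be much easier if I could associate each OBJECT_TYPE with a given level. But as you can see from the example above, BUILDING does not always appear at level 1 of the hierarchy.
My initial idea is to extract the data into an intermediate tabular format (probably a materialized view) that looks as follows. This would allow me to find the data I want with simple SQL queries on the materialized view: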
OBJ_ID     OBJECT_DESC       ESTATE_OBJ  BUILDING_OBJ  ROOM_OBJ
---------  ----------------  ----------  ------------  ----------
ES01       Bucks Estate      ES01
BUI01      Leisure Centre    ES01        BUI01
BUI02      Fire Station      ES01        BUI02
BUI03      Housing Block                 BUI03
SQ01       Squash Court      ES01        BUI01         SQ01
BTR01      Bathroom          ES01        BUI02         BTR01
AP01       Flat No. 1                    BUI03
AP02       Flat No. 2                    BUI03
BTR02      Bathroom                      BUI03         BTR02
BDR01      Bedroom                       BUI03         BDR01
BTR03      Bathroom                      BUI03         BTR03
SHR01      Shower            ES01        BUI02         BTR01
SHR02      Shower                        BUI03         BTR02
SHR03      Shower                        BUI03         BTR03
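However (short of writing PL/SQL, which I would like to avoid), I have not been able to concisely construct a query that produces this tabular format.
Does anyone know how I can do this? Can it be done?
The solution must be executable in Oracle 12c.
Additionally: performance matters, because my underlying data structure contains a few hundred thousand rows and the structure can be quite deep. So faster solutions will be preferred over slower ones :-)
Thanks in advance for your help.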
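The desired output has 3 columns that are determined by the object type. In general, this could be extended with more columns, one for each possible value of the field object_type. Even with the given sample data, one could imagine an additional column apartment_obj.
To make this generic, without having to self-join the table as many times as there are object type values, you can use a combination of the CONNECT BY and PIVOT clauses: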
SELECT *
FROM (
    SELECT obj_id,
           object_desc,
           CONNECT_BY_ROOT obj_id      AS pivot_col_value,
           CONNECT_BY_ROOT object_type AS pivot_col_name
    FROM   object_type_t
    -- no START WITH clause: every row acts as a root, giving all connected pairs
    CONNECT BY parent_obj = PRIOR obj_id
)
PIVOT (
    MAX(pivot_col_value) AS obj
    FOR (pivot_col_name) IN (
        'ESTATE'   AS estate,
        'BUILDING' AS building,
        'ROOM'     AS room
    )
);
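The FOR ... IN clause has a hard-coded list of the desired column names, without the _obj suffix, as that is added during the pivot transformation.
Oracle does not allow this list to be retrieved dynamically. NB: there is an exception to this rule when using the PIVOT XML syntax, but then you get XML output in a single column, which you would then need to parse. That would be rather inefficient.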
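As a sketch of how the hard-coded list is extended, the query below adds a fourth entry for the APARTMENT type, which yields the apartment_obj column mentioned earlier. It assumes the same object_type_t table and sample data as in the question; nothing else changes.

```sql
SELECT *
FROM (
    SELECT obj_id,
           object_desc,
           CONNECT_BY_ROOT obj_id      AS pivot_col_value,
           CONNECT_BY_ROOT object_type AS pivot_col_name
    FROM   object_type_t
    CONNECT BY parent_obj = PRIOR obj_id
)
PIVOT (
    MAX(pivot_col_value) AS obj
    FOR (pivot_col_name) IN (
        'ESTATE'    AS estate,
        'BUILDING'  AS building,
        'APARTMENT' AS apartment,  -- new entry: becomes column apartment_obj
        'ROOM'      AS room
    )
);
```

With the sample data, the row for BTR02 should then carry AP01 in apartment_obj, while the rows under ES01 leave that column empty.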
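The subquery with the CONNECT BY clause has no START WITH clause, which makes the query take every record as a starting point and produce the descendants from there. Together with the CONNECT_BY_ROOT selection, this produces a complete list of all connected pairs, where the distance between the two nodes in the hierarchy can be anything. The PIVOT then implicitly groups by the deeper of the two (obj_id and object_desc), so you get all ancestors of that node (including the node itself). Those ancestors are then pivoted into columns.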
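To see the connected pairs before they are pivoted, the subquery can be run on its own and restricted to a single node. This is only an illustrative sketch against the sample data; the distance column (LEVEL - 1) is an addition for readability and is not part of the original query.

```sql
SELECT obj_id,
       CONNECT_BY_ROOT obj_id      AS pivot_col_value,
       CONNECT_BY_ROOT object_type AS pivot_col_name,
       LEVEL - 1                   AS distance   -- steps between ancestor and node
FROM   object_type_t
WHERE  obj_id = 'SHR02'            -- applied after the hierarchy has been built
CONNECT BY parent_obj = PRIOR obj_id;
```

For SHR02 this should return four rows, one per ancestor-or-self: SHR02 (OBJECT), BTR02 (ROOM), AP01 (APARTMENT) and BUI03 (BUILDING). The pivot keeps only the types named in its IN list, so the ROOM and BUILDING rows end up in room_obj and building_obj, and estate_obj stays empty.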
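The CONNECT BY subquery could also be written so that the hierarchy is traversed backwards. The output is the same, but there might be a performance difference. If so, I would expect this variant to perform better, but I did not test it on a large data set: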
SELECT *
FROM (
    SELECT CONNECT_BY_ROOT obj_id      AS obj_id,
           CONNECT_BY_ROOT object_desc AS object_desc,
           obj_id                      AS pivot_col_value,
           object_type                 AS pivot_col_name
    FROM   object_type_t
    -- Connect in backward direction:
    CONNECT BY obj_id = PRIOR parent_obj
)
PIVOT (
    MAX(pivot_col_value) AS obj
    FOR (pivot_col_name) IN (
        'ESTATE'   AS estate,
        'BUILDING' AS building,
        'ROOM'     AS room
    )
);
Note that in this variant CONNECT_BY_ROOT returns the deeper node of the pair, because of the reversed traversal.
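Since the question mentions a materialized view, either variant could be wrapped in one. The following is a minimal sketch, not a tuned design: the name object_levels_mv and the complete, on-demand refresh are assumptions chosen for illustration, and the refresh strategy would need to match how often the base table changes.

```sql
-- Hypothetical materialized view name; refresh options are illustrative only.
CREATE MATERIALIZED VIEW object_levels_mv
  BUILD IMMEDIATE
  REFRESH COMPLETE ON DEMAND
AS
SELECT obj_id, object_desc, estate_obj, building_obj, room_obj
FROM (
    SELECT CONNECT_BY_ROOT obj_id      AS obj_id,
           CONNECT_BY_ROOT object_desc AS object_desc,
           obj_id                      AS pivot_col_value,
           object_type                 AS pivot_col_name
    FROM   object_type_t
    CONNECT BY obj_id = PRIOR parent_obj
)
PIVOT (
    MAX(pivot_col_value) AS obj
    FOR (pivot_col_name) IN (
        'ESTATE'   AS estate,
        'BUILDING' AS building,
        'ROOM'     AS room
    )
);

-- Every object that belongs to building BUI01 (BUI01 itself included):
SELECT obj_id, object_desc
FROM   object_levels_mv
WHERE  building_obj = 'BUI01';

-- The room that contains object SHR01:
SELECT room_obj
FROM   object_levels_mv
WHERE  obj_id = 'SHR01';
```

The two lookups correspond to the two directions asked for in the question: from a building down to its objects, and from an object up to its room.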
Alternative based on self-joins (previous answer)
You could use this query:
SELECT    t1.obj_id,
          t1.object_desc,
          CASE 'ESTATE'
              WHEN t1.object_type THEN t1.obj_id
              WHEN t2.object_type THEN t2.obj_id
              WHEN t3.object_type THEN t3.obj_id
          END estate_obj,
          CASE 'BUILDING'
              WHEN t1.object_type THEN t1.obj_id
              WHEN t2.object_type THEN t2.obj_id
              WHEN t3.object_type THEN t3.obj_id
          END building_obj,
          CASE 'ROOM'
              WHEN t1.object_type THEN t1.obj_id
              WHEN t2.object_type THEN t2.obj_id
              WHEN t3.object_type THEN t3.obj_id
          END room_obj
FROM      object_type_t t1
LEFT JOIN object_type_t t2 ON t2.obj_id = t1.parent_obj
LEFT JOIN object_type_t t3 ON t3.obj_id = t2.parent_obj
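If I understand your need correctly, maybe you can avoid the tabular view and query your table directly. Say you want to find all the objects that belong to BUI01; you can try: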
with test(OBJ_ID, PARENT_OBJ, OBJECT_TYPE, OBJECT_DESC) as
(
    select 'ES01','','ESTATE','Bucks Estate' from dual union all
    select 'BUI01','ES01','BUILDING','Leisure Centre' from dual union all
    select 'BUI02','ES01','BUILDING','Fire Station' from dual union all
    select 'BUI03','','BUILDING','Housing Block' from dual union all
    select 'SQ01','BUI01','ROOM','Squash Court' from dual union all
    select 'BTR01','BUI02','ROOM','Bathroom' from dual union all
    select 'AP01','BUI03','APARTMENT','Flat No. 1' from dual union all
    select 'AP02','BUI03','APARTMENT','Flat No. 2' from dual union all
    select 'BTR02','AP01','ROOM','Bathroom' from dual union all
    select 'BDR01','AP01','ROOM','Bedroom' from dual union all
    select 'BTR03','AP02','ROOM','Bathroom' from dual union all
    select 'SHR01','BTR01','OBJECT','Shower' from dual union all
    select 'SHR02','BTR02','OBJECT','Shower' from dual union all
    select 'SHR03','BTR03','OBJECT','Shower' from dual
)
select OBJECT_TYPE, OBJ_ID, OBJECT_DESC
from test
connect by prior obj_id = parent_obj
start with obj_ID = 'BUI01'
This considers BUI01 as belonging to itself; if you do not want that, the query can easily be modified to cut off the starting value.
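One way to do that, sketched here against the question's object_type_t table rather than the inline test data, is to filter on the LEVEL pseudocolumn, since the starting row is always at level 1:

```sql
SELECT object_type, obj_id, object_desc
FROM   object_type_t
WHERE  LEVEL > 1                      -- drop the starting row (BUI01 itself)
START WITH obj_id = 'BUI01'
CONNECT BY PRIOR obj_id = parent_obj;
```

With the sample data this should leave only SQ01 (Squash Court). The WHERE clause is applied after the hierarchy has been built, so it removes rows from the output without pruning their descendants.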
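Conversely, say you are looking for the room in which SHR01 is located; you can try the following. It is basically the same recursive idea, but walking up the tree instead of down: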
with test(OBJ_ID, PARENT_OBJ, OBJECT_TYPE, OBJECT_DESC) as
(...
)
SELECT *
FROM (
    select OBJECT_TYPE, OBJ_ID, OBJECT_DESC
    from test
    connect by obj_id = PRIOR parent_obj
    start with obj_ID = 'SHR01'
)
WHERE object_type = 'ROOM'
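In both cases you only scan your table once, without any other structure; this way, it has a chance of being fast enough.
Many thanks to @trincot for the inspiration; I have come up with the following solution. It is not super fast on production data, but it does work for trees of arbitrary depth. The only way in which it is not dynamic is that you have to choose in advance which levels of the tree to extract, and an extra column has to be added to capture each of them.
The principle is to build a sys_connect_by_path column and use regular expressions to extract the required level data from it: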
WITH base_data (obj_id, parent_obj, object_type, object_desc) AS (
    SELECT 'ES01','','ESTATE','Bucks Estate' FROM dual UNION ALL
    SELECT 'BUI01','ES01','BUILDING','Leisure Centre' FROM dual UNION ALL
    SELECT 'BUI02','ES01','BUILDING','Fire Station' FROM dual UNION ALL
    SELECT 'BUI03','','BUILDING','Housing Block' FROM dual UNION ALL
    SELECT 'SQ01','BUI01','ROOM','Squash Court' FROM dual UNION ALL
    SELECT 'BTR01','BUI02','ROOM','Bathroom' FROM dual UNION ALL
    SELECT 'AP01','BUI03','APARTMENT','Flat No. 1' FROM dual UNION ALL
    SELECT 'AP02','BUI03','APARTMENT','Flat No. 2' FROM dual UNION ALL
    SELECT 'BTR02','AP01','ROOM','Bathroom' FROM dual UNION ALL
    SELECT 'BDR01','AP01','ROOM','Bedroom' FROM dual UNION ALL
    SELECT 'BTR03','AP02','ROOM','Bathroom' FROM dual UNION ALL
    SELECT 'SHR01','BTR01','OBJECT','Shower' FROM dual UNION ALL
    SELECT 'SHR02','BTR02','OBJECT','Shower' FROM dual UNION ALL
    SELECT 'SHR03','BTR03','OBJECT','Shower' FROM dual
),
obj_hierarchy AS (
    -- r_path looks like '/BUILDING:BUI03/APARTMENT:AP01/ROOM:BTR02/'
    SELECT object_type, obj_id, object_desc, parent_obj,
           sys_connect_by_path(object_type||':'||obj_id, '/')||'/' r_path
    FROM   base_data
    START WITH parent_obj IS NULL
    CONNECT BY PRIOR obj_id = parent_obj
)
-- '\1' replaces the whole path with the captured ID that follows the type name
SELECT obj_id, object_desc,
       CASE
           WHEN instr(h.r_path, 'ESTATE:') > 1
           THEN regexp_replace(h.r_path, '.*/ESTATE:([^/]+).*$', '\1')
           ELSE ''
       END obj_estate,
       CASE
           WHEN instr(h.r_path, 'BUILDING:') > 1
           THEN regexp_replace(h.r_path, '.*/BUILDING:([^/]+).*$', '\1')
           ELSE ''
       END obj_building,
       CASE
           WHEN instr(h.r_path, 'ROOM:') > 1
           THEN regexp_replace(h.r_path, '.*/ROOM:([^/]+).*$', '\1')
           ELSE ''
       END obj_room
FROM   obj_hierarchy h
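The extraction step can be tried in isolation on a literal path. The sketch below uses a hand-written r_path value of the shape the query above builds for SHR02. The REGEXP_SUBSTR columns, using its subexpression argument, are a possible alternative that returns NULL when the type is absent from the path and so would not need the surrounding CASE; that alternative is a suggestion, not part of the original solution.

```sql
SELECT REGEXP_REPLACE(r_path, '.*/ROOM:([^/]+).*$', '\1')      AS room_via_replace,
       -- arguments: start position 1, first occurrence, no match flags, subexpression 1
       REGEXP_SUBSTR(r_path, '/ROOM:([^/]+)',   1, 1, NULL, 1) AS room_via_substr,
       REGEXP_SUBSTR(r_path, '/ESTATE:([^/]+)', 1, 1, NULL, 1) AS estate_via_substr
FROM (
    -- hand-written sample path, assumed to match what obj_hierarchy yields for SHR02
    SELECT '/BUILDING:BUI03/APARTMENT:AP01/ROOM:BTR02/OBJECT:SHR02/' AS r_path
    FROM   dual
);
```

Both room columns should come back as BTR02, and estate_via_substr as NULL because the path contains no ESTATE entry.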