SAS temporary DB2 tables - Creating an index

I have yet to get a definite answer from my DBAs on this. I work in a DB2 warehouse with huge tables. I often work with temp tables through rsubmit, such as...

execute (declare global temporary table session.tmp1(task char(9))on commit preserve rows not logged) by db2;
execute (create unique index session.indexa on session.tmp1(task)) by db2;
insert into session.tmp1 select * from connection to db2 
(
select   distinct a.column
from     table1 a
where    ...
for fetch only with ur
);

Then, when I need that specific set of values, I join to it...

from session.tmp1 t
inner join tablex x on t.task = x.task

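You'll notice I declare an index (a unique one, even). My question is: if an index doesn't already exist on the original DB2 table that I use to build the temp table, does the index I create even matter? Also, a senior analyst has advised me that if I "order by" the attribute/column I'm indexing when I build the temp table, it will be quicker when I use it in joins to additional tables. Can anyone confirm either of these points? It may seem trivial, but I'm really looking for tips on speed, especially since the tables I hit are so large...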

The senior analyst's "order by" remark:

It depends on the scenario:

If no order by appears anywhere in the plan while the tmp table is created, and the query plan for the following join does not show a need to order the data, then the time spent ordering the data in the tmp table will be larger than the time saved in the join.

If the join orders the data before joining, adding an order by might speed up that step (the plan will still show the ordering of the data, since the optimizer does not know that the data is already ordered), but the time gained will most likely be at most equal to the time you lost when ordering the tmp table. So when you use your tmp table more than once, it might save you some time. Use it just once, and it is pretty useless.
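For illustration, the "order by" variant being discussed would probably look like the population step from the question with one extra clause. This is only a sketch: table1, a.column and the elided where clause are the placeholders from the question, and whether the ordering pays off still has to be judged from the plan and from how often the tmp table is reused.

```sas
/* Sketch: same insert as in the question, with an order by on the indexed column. */
/* The where clause is elided in the question and is left elided here.             */
insert into session.tmp1 select * from connection to db2 
(
select   distinct a.column
from     table1 a
where    ...
order by a.column
for fetch only with ur
);
```

In DB2 the order by clause goes before "for fetch only" and "with ur", which is why it is placed there.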

The index you create:

The index will help with later joins/where conditions on the tmp table. So if you are using the indexed columns in a join or where: Go for it.

One exception: when a join touches every row of a table, the index is sometimes not beneficial. The optimizer might ignore it (see the plan again), or using it might even slow you down. This is highly DBMS dependent. Oracle: a full table scan is quicker most of the time when joining all rows. MySQL/MariaDB: even with full joins, adding the index saves you hours. SQL Server: decides pretty well by itself (and usually uses the index). DB2: please post here once you have determined this.
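To "see the plan" for the DB2 case, something along these lines might work. It is a rough sketch under several assumptions that are not in the question: the database is Db2 for Linux/UNIX/Windows, the explain tables already exist in the current schema, and the statements run on the same connection that declared session.tmp1 (a declared temporary table is only visible to the session that created it). On Db2 for z/OS the plan lands in PLAN_TABLE instead, so the second query would differ.

```sas
/* Assumption: Db2 for LUW with explain tables already created.          */
/* EXPLAIN only records the access plan; it does not run the join.       */
execute (
  explain plan for
  select x.task
  from   session.tmp1 t
  inner join tablex x on t.task = x.task
) by db2;

/* Read back the operators of the most recently explained statement.     */
/* explain_operator is unqualified here; the schema may differ per site. */
select * from connection to db2
(
select   operator_id, operator_type, total_cost
from     explain_operator
where    explain_time = (select max(explain_time) from explain_operator)
order by operator_id
);
```

An IXSCAN operator on the tmp table would suggest the index is being used, a TBSCAN that it is being ignored, and a SORT feeding the join that the data is being ordered before joining, which is the case where the "order by" advice could matter.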