Query Optimizer Enhancement
in Informix 12.1
Bingjie Miao
IBM
Agenda
• sqexplain overview
• Set operations
• View folding enhancements
• Subquery flattening after view folding
• ANSI OUTER JOIN to Informix outer join
transformation
• Hash join support for ANSI JOIN queries
• Optimizer costing enhancement for hash join
• Temp table optimization
• LATERAL derived table
• Predicate derivation for ANSI JOIN query
sqexplain Overview
• Print out query plan information
• Includes runtime statistics
• Ways to turn on explain
– set explain on;
– set explain file to 'file_name';
– set explain on avoid_execute;
– EXPLAIN directive on a query
Sections in sqexplain
QUERY: (OPTIMIZATION TIMESTAMP: 03-07-2013 17:28:30)
------
select {+ FULL(tab1) AVOID_FULL(tab2)} *
from tab1, tab2 where tab1.id = tab2.id
DIRECTIVES FOLLOWED:
FULL ( tab1 )
AVOID_FULL ( tab2 )
DIRECTIVES NOT FOLLOWED:
Estimated Cost: 4
Estimated # of Rows Returned: 1
1) informix.tab1: SEQUENTIAL SCAN
2) informix.tab2: INDEX PATH
(1) Index Name: informix.t2idx1
Index Keys: id (Serial, fragments: ALL)
Lower Index Filter: informix.tab1.id = informix.tab2.id
NESTED LOOP JOIN
Callouts: query text; general query information; access paths and joins
Sections in sqexplain – cont.
Query statistics:
-----------------
Table map :
----------------------------
Internal name     Table name
----------------------------
t1                tab1
t2                tab2

type     table  rows_prod  est_rows  rows_scan  time       est_cost
-------------------------------------------------------------------
scan     t1     1000       1         1000       00:00.00   2

type     table  rows_prod  est_rows  rows_scan  time       est_cost
-------------------------------------------------------------------
scan     t2     1000       1         1000       00:00.01   0

type     rows_prod  est_rows  time       est_cost
-------------------------------------------------
nljoin   1000       1         00:00.01   4
Runtime query statistics
sqexplain for ANSI JOIN query
QUERY:
------
Select * from (t1 left outer join
(t2 left outer join t3 on t2.c1=t3.c1) on t1.c2=t2.c2 and t2.c1 < 5)
left outer join t4 on t1.c1=t4.c1
Estimated Cost: 14
Estimated # of Rows Returned: 4
1) informix.t1: SEQUENTIAL SCAN
2) informix.t2: INDEX PATH
Filters: informix.t2.c1 < 5
(1) Index Keys: c2 (Serial, fragments: ALL)
Lower Index Filter: informix.t1.c2 = informix.t2.c2
3) informix.t3: AUTOINDEX PATH
(1) Index Keys: c1
Lower Index Filter: informix.t2.c1 = informix.t3.c1
ON-Filters:informix.t2.c1 = informix.t3.c1
NESTED LOOP JOIN(LEFT OUTER JOIN)
ON-Filters:(informix.t1.c2 = informix.t2.c2 AND informix.t2.c1 < 5 )
NESTED LOOP JOIN(LEFT OUTER JOIN)
4) sqlqa.t4: AUTOINDEX PATH
(1) Index Keys: c1
Lower Index Filter: informix.t1.c1 = informix.t4.c1
ON-Filters:informix.t1.c1 = informix.t4.c1
NESTED LOOP JOIN(LEFT OUTER JOIN)
Set Operations
• Similar to UNION
• INTERSECT – rows common to both arms
– internally transformed into EXISTS subquery with
special NULL handling
• MINUS or EXCEPT – rows in the first arm that are not
in the second arm
– internally transformed into NOT EXISTS subquery
with special NULL handling
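The two transformations above can be sketched in a few lines of Python. This is only an illustration of the semantics, not the server's implementation: rows are tuples, `None` stands in for SQL NULL, and the function names are made up for this sketch. It assumes that "special NULL handling" means a NULL in one arm matches a NULL in the other, and that duplicates are removed from the result.

```python
def rows_match(left, right):
    """Set-operation comparison: NULL (None) matches NULL, unlike ordinary SQL '='."""
    return all(
        (a is None and b is None) or (a is not None and b is not None and a == b)
        for a, b in zip(left, right)
    )


def _already_kept(row, kept):
    # Duplicate elimination uses the same NULL-matches-NULL comparison.
    return any(rows_match(row, other) for other in kept)


def intersect(first_arm, second_arm):
    """Semi join: keep a first-arm row if at least one second-arm row matches."""
    result = []
    for row in first_arm:
        if _already_kept(row, result):
            continue
        if any(rows_match(row, other) for other in second_arm):  # EXISTS
            result.append(row)
    return result


def minus(first_arm, second_arm):
    """Anti semi join: keep a first-arm row only if no second-arm row matches."""
    result = []
    for row in first_arm:
        if _already_kept(row, result):
            continue
        if not any(rows_match(row, other) for other in second_arm):  # NOT EXISTS
            result.append(row)
    return result
```

For example, with `tab1 = [(1,), (2,), (None,), (2,)]` and `tab2 = [(2,), (None,), (3,)]`, `intersect(tab1, tab2)` keeps `(2,)` and `(None,)`, while `minus(tab1, tab2)` keeps only `(1,)`. The `any(...)` call stops at the first match, which corresponds to the "(First Row)" scans in the plans.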
Set Operations in explain
QUERY: (OPTIMIZATION TIMESTAMP: 03-08-2013 15:04:22)
------
select intcol from tab1
intersect
select intcol2 from tab2
Estimated Cost: 4
Estimated # of Rows Returned: 1
1) informix.tab1: SEQUENTIAL SCAN
2) informix.tab2: SEQUENTIAL SCAN (First Row)
Filters: informix.tab1.intcol == informix.tab2.intcol2
NESTED LOOP JOIN (Semi Join)
Set Operations in explain – cont.
QUERY: (OPTIMIZATION TIMESTAMP: 03-08-2013 15:13:28)
------
select intcol, charcol from tab1
intersect
select intcol2, charcol2 from tab2
minus
select intcol3, charcol3 from tab3
Estimated Cost: 6
Estimated # of Rows Returned: 1
1) informix.tab1: SEQUENTIAL SCAN
2) informix.tab2: SEQUENTIAL SCAN (First Row)
Filters: (informix.tab1.intcol == informix.tab2.intcol2 AND
informix.tab1.charcol == informix.tab2.charcol2 )
NESTED LOOP JOIN (Semi Join)
3) informix.tab3: SEQUENTIAL SCAN (First Row)
Filters: (informix.tab1.charcol == informix.tab3.charcol3 AND
informix.tab1.intcol == informix.tab3.intcol3 )
NESTED LOOP JOIN (Anti Semi Join)
View folding enhancement
• Views containing ANSI JOIN or Informix outer
join can now be folded into main query, for
better performance
– view must be referenced as a dominant table in
the main query
– if view is used as subservient table, then the view
still needs to be materialized first
View folding example
create view v1(vc1, vc2) as
select t1.c, t2.c
from t1 left join t2 on t2.a = t1.a;
select *
from v1 left join t3 on v1.vc1 = t3.a;
select *
from v1 right join t3 on v1.vc1 = t3.a;
View folding in sqexplain
QUERY: (OPTIMIZATION TIMESTAMP: 03-12-2013 16:23:41)
------
select * from v1 left join t3 on v1.vc1 = t3.a
Estimated Cost: 6
Estimated # of Rows Returned: 3
1) informix.t1: SEQUENTIAL SCAN
2) informix.t2: INDEX PATH
(1) Index Name: informix.ind2
Index Keys: a (Serial, fragments: ALL)
Lower Index Filter: informix.t2.a = informix.t1.a
NESTED LOOP JOIN
3) informix.t3: INDEX PATH
(1) Index Name: informix.ind3
Index Keys: a (Serial, fragments: ALL)
Lower Index Filter: informix.t1.c = informix.t3.a
NESTED LOOP JOIN
View folding in sqexplain – cont.
QUERY: (OPTIMIZATION TIMESTAMP: 03-12-2013 16:28:47)
------
create view "informix".v1 (vc1,vc2) as select x0.c ,x1.c from ("informix".t1 x0
left join "informix".t2 x1 on (x1.a = x0.a ) );
Estimated Cost: 4
Estimated # of Rows Returned: 3
1) informix.t1: SEQUENTIAL SCAN
2) informix.t2: INDEX PATH
(1) Index Name: informix.ind2
Index Keys: a (Serial, fragments: ALL)
Lower Index Filter: informix.t2.a = informix.t1.a
NESTED LOOP JOIN
QUERY: (OPTIMIZATION TIMESTAMP: 03-12-2013 16:28:47)
------
select * from v1 right join t3 on v1.vc1 = t3.a
Estimated Cost: 5
Estimated # of Rows Returned: 3
1) informix.t3: SEQUENTIAL SCAN
2) (Temp Table For View): SEQUENTIAL SCAN
DYNAMIC HASH JOIN
Dynamic Hash Filters: (Temp Table For View).vc1 = informix.t3.a
Subquery flattening after view folding
• Subquery flattening improves query
performance, but previously it was
disabled if the query contained a view or
derived table reference
• In 12.1, subquery flattening is attempted again
after the view folding process, and can be done
with the view either folded or materialized
into a temp table
Subquery flattening after view folding
in sqexplain
create view v4 (v4_c1, v4_c2) as
select t1_c1 + 1, MAX(1) from t1 group by 1;
QUERY: (OPTIMIZATION TIMESTAMP: 03-12-2013 16:52:44)
------
select 1 from v4 where exists (select 1 from t2 where t2_c1 = v4_c1)
Estimated Cost: 6
Estimated # of Rows Returned: 1
Temporary Files Required For: Group By
1) informix.t1: SEQUENTIAL SCAN
2) informix.t2: SEQUENTIAL SCAN (First Row)
Filters: informix.t2.t2_c1 = informix.t1.t1_c1 + 1
NESTED LOOP JOIN (Semi Join)
Subquery flattening after view folding
in sqexplain – cont.
QUERY: (OPTIMIZATION TIMESTAMP: 03-12-2013 17:02:37)
------
select z.a from z
where z.b = some (select v.a from vag1 v, z z1 where v.a > z.a)
Estimated Cost: 9
Estimated # of Rows Returned: 11
1) informix.z: SEQUENTIAL SCAN
Filters: informix.z.b > informix.z.a
2) (Temp Table For View): AUTOINDEX PATH (First Row)
(1) Index Name: (Auto Index)
Index Keys: a (Key-Only)
Lower Index Filter: informix.z.b = (Temp Table For View).a
NESTED LOOP JOIN (Semi Join)
3) informix.z1: SEQUENTIAL SCAN (First Row)
NESTED LOOP JOIN (Semi Join)
ANSI OUTER JOIN to Informix Outer
Join Transformation
• “Simple” ANSI OUTER JOIN can be converted
to Informix outer join
– potentially more join choices by the optimizer
– ON clause filters must be of the type “col = col”
involving the current join
– WHERE clause filters cannot reference subservient
tables, flattened subquery tables, correlated
subqueries or UDR references
• If one join is not transformed, then entire
query is not transformed
ANSI OUTER JOIN to Informix Outer
Join in sqexplain
QUERY: (OPTIMIZATION TIMESTAMP: 03-12-2013 17:23:32)
------
select * from t1 left join t2 on t1.a = t2.a
Estimated Cost: 4
Estimated # of Rows Returned: 3
1) informix.t1: SEQUENTIAL SCAN
2) informix.t2: INDEX PATH
(1) Index Name: informix.ind2
Index Keys: a (Serial, fragments: ALL)
Lower Index Filter: informix.t1.a = informix.t2.a
NESTED LOOP JOIN
ANSI OUTER JOIN to Informix Outer
Join in sqexplain – cont.
QUERY: (OPTIMIZATION TIMESTAMP: 03-12-2013 17:24:01)
------
select * from t1 left join t2 on t1.a = t2.a and t1.a = 1
Estimated Cost: 4
Estimated # of Rows Returned: 3
1) informix.t1: SEQUENTIAL SCAN
2) informix.t2: INDEX PATH
(1) Index Name: informix.ind2
Index Keys: a (Serial, fragments: ALL)
Lower Index Filter: informix.t1.a = informix.t2.a
ON-Filters:(informix.t1.a = informix.t2.a AND informix.t1.a = 1 )
NESTED LOOP JOIN(LEFT OUTER JOIN)
Hash Join Support in ANSI JOIN
• Hash join is supported in ANSI JOIN queries
• Optimizer can consider and choose best join
method for each join – hash join can be faster
for large joins
• Optimizer costing is adjusted for situation
where build/probe sides for hash join can be
composite
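As a rough illustration of a hash join used for a LEFT OUTER JOIN, the Python sketch below builds a hash table on the subservient side and probes it with rows from the dominant side. Which side the server actually builds on, and how it costs the choice, is not stated in these slides; the function name and parameters are hypothetical. Either input may be "composite", that is, the result of an earlier join rather than a single table.

```python
from collections import defaultdict


def hash_left_outer_join(dominant, subservient, dom_key, sub_key, sub_width):
    """Join two lists of tuples on dominant[dom_key] = subservient[sub_key].

    Dominant rows without a match are preserved and padded with NULLs (None).
    """
    # Build phase: hash the subservient rows on their join column.
    table = defaultdict(list)
    for row in subservient:
        if row[sub_key] is not None:  # NULL never satisfies an equality join
            table[row[sub_key]].append(row)

    # Probe phase: look up each dominant row in the hash table.
    result = []
    for row in dominant:
        key = row[dom_key]
        matches = table.get(key, []) if key is not None else []
        if matches:
            result.extend(row + match for match in matches)
        else:
            result.append(row + (None,) * sub_width)
    return result
```

With `t1 = [(1,), (2,)]` as the dominant side and a composite input `[(1, 1), (3, 3)]` (for instance the rows of an inner join of two other tables), `hash_left_outer_join(t1, [(1, 1), (3, 3)], 0, 1, 2)` keeps both `t1` rows and pads the unmatched one with NULLs. Each input is read once, which is why this method can beat repeated index lookups on large joins.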
Hash Join for ANSI JOIN in sqexplain
QUERY: (OPTIMIZATION TIMESTAMP: 03-14-2013 15:01:22)
------
select * from (t1 left join t2 on t1.a = t2.a )
left join (t3 inner join t4 on t3.a = t4.a) on t4.a = t1.a
Estimated Cost: 9
Estimated # of Rows Returned: 3
1) informix.t1: SEQUENTIAL SCAN
2) informix.t2: INDEX PATH
(1) Index Name: informix.ind2
Index Keys: a (Serial, fragments: ALL)
Lower Index Filter: informix.t1.a = informix.t2.a
ON-Filters:informix.t1.a = informix.t2.a
NESTED LOOP JOIN(LEFT OUTER JOIN)
3) informix.t3: SEQUENTIAL SCAN
4) informix.t4: INDEX PATH
(1) Index Name: informix.ind4
Index Keys: a (Serial, fragments: ALL)
Lower Index Filter: informix.t3.a = informix.t4.a
ON-Filters:informix.t3.a = informix.t4.a
NESTED LOOP JOIN
ON-Filters:informix.t4.a = informix.t1.a
DYNAMIC HASH JOIN (LEFT OUTER JOIN)
Dynamic Hash Filters: informix.t4.a = informix.t1.a
Optimizer costing improvements
• Current optimizer costing tends to favor
index-based scans and joins, which can be
problematic for large tables
• 12.1 introduces costing modifications that
make hash join more favorable for large tables
• Under control of undocumented ONCONFIG
parameter SQL_DEF_CTRL (off by default)
– add 0x200 and 0x800 bits
– set to 0xeb0 to include “on-by-default” bits
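The bit arithmetic behind the 0xeb0 value can be checked with a short Python sketch. It assumes that the "on-by-default" bits are 0x4b0, which is the baseline SQL_DEF_CTRL value used in the costing example; the constant and function names are invented for this illustration.

```python
# Assumption: 0x4b0 is the set of "on-by-default" bits (the baseline in the example).
ON_BY_DEFAULT_BITS = 0x4B0

# The two bits that enable the new hash join costing, per the slide.
HASH_COSTING_BITS = 0x200 | 0x800


def enable_hash_costing(current_value):
    """Return the SQL_DEF_CTRL value with the hash join costing bits added."""
    return current_value | HASH_COSTING_BITS


def hash_costing_enabled(value):
    """True only if both costing bits are set in the given value."""
    return value & HASH_COSTING_BITS == HASH_COSTING_BITS
```

Under that assumption, `hex(enable_hash_costing(ON_BY_DEFAULT_BITS))` gives `'0xeb0'`, matching the value on the slide. OR-ing the bits in, rather than assigning 0xa00 alone, is what preserves the default bits.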
Optimizer costing example
SQL_DEF_CTRL=0x4b0
SELECT
dm.dm_s_symb AS stock,
MONTH(dm.dm_date) AS month,
COUNT(*) AS count_num_days,
MAX(dm.dm_vol) AS max_vol
FROM daily_market dm, security s
WHERE
dm.dm_s_symb = s.s_symb AND
YEAR(dm.dm_date) = "2001"
GROUP BY 1,2
Estimated Cost: 5018350
Estimated # of Rows Returned: 1779368
Temporary Files Required For: Group By
1) informix.s: INDEX PATH
(1) Index Name: informix.pk_security
Index Keys: s_symb (Key-Only) (Serial, fragments: ALL)
2) informix.dm: INDEX PATH
Filters: YEAR (informix.dm.dm_date ) = 2001
(1) Index Name: informix.fk_daily_market_security
Index Keys: dm_s_symb (Serial, fragments: ALL)
Lower Index Filter: informix.dm.dm_s_symb = informix.s.s_symb
NESTED LOOP JOIN
SQL_DEF_CTRL=0xeb0
SELECT
dm.dm_s_symb AS stock,
MONTH(dm.dm_date) AS month,
COUNT(*) AS count_num_days,
MAX(dm.dm_vol) AS max_vol
FROM daily_market dm, security s
WHERE
dm.dm_s_symb = s.s_symb AND
YEAR(dm.dm_date) = "2001"
GROUP BY 1,2
Estimated Cost: 4183207
Estimated # of Rows Returned: 1779368
Temporary Files Required For: Group By
1) informix.dm: SEQUENTIAL SCAN
Filters: YEAR (informix.dm.dm_date ) = 2001
2) informix.s: INDEX PATH
(1) Index Name: informix.pk_security
Index Keys: s_symb (Key-Only) (Serial, fragments: ALL)
DYNAMIC HASH JOIN
Dynamic Hash Filters: informix.dm.dm_s_symb = informix.s.s_symb
Temp table optimization
• Temp tables are created when a view or
derived table cannot be folded into main
query
• Previously, when a temp table was created, it
included all columns from the underlying tables
• In 12.1, a temp table includes only the columns
that are required in the query
– smaller temp table
– more efficient query processing
Temp table optimization example
select rtrim(D12.C36), rtrim(D12.C48), D12.C103, D12.C104
from ( select stock_trans_type.stt_type as C0,
stock_trans_type.stt_desc as C1,
stock_movements.stk_trans_type as C2,
......
stock_master.user_num1 as C103,
system_table.systbl_code as C104
from stock_trans_type, stock_master, system_table,
outer stock_movements
where ......
) D12
right outer join system_type
on D12.C29 = system_type.type_id
where D12.C12 between 129 and 256
and D12.C16 is not null;
TEMP table for D12 contains only the following columns:
C12, C16, C29, C36, C48, C103, C104
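The column list above can be reproduced with a small Python sketch that collects every `D12.Cnn` reference in the outer query. This is a text-level illustration of the pruning idea only; the real optimizer works on the parsed query, and the helper name is hypothetical. The derived table body is replaced by `( ... )` because only references from outside the derived table matter.

```python
import re


def referenced_columns(query_text, alias):
    """Return the columns of `alias` referenced in query_text, in numeric order."""
    pattern = re.compile(r"\b" + re.escape(alias) + r"\.(C\d+)\b")
    names = set(pattern.findall(query_text))
    # Sort C12 before C103 by comparing the numeric suffix, not the string.
    return sorted(names, key=lambda name: int(name[1:]))


# Outer query from the example, with the derived table body elided.
OUTER_QUERY = """
select rtrim(D12.C36), rtrim(D12.C48), D12.C103, D12.C104
from ( ... ) D12
right outer join system_type
on D12.C29 = system_type.type_id
where D12.C12 between 129 and 256
and D12.C16 is not null
"""
```

`referenced_columns(OUTER_QUERY, "D12")` yields C12, C16, C29, C36, C48, C103 and C104: the select list contributes four columns, the ON clause one, and the WHERE clause two. The remaining columns of the 105-column derived table never need to be written to the temp table.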
LATERAL derived table
• Correlated column reference inside a derived table
select * from t1, LATERAL (select * from t2
where t1.c1 = t2.c2 ) as dtab(dc1);
• LATERAL derived table may or may not be folded into
main query
• The optimizer includes logic to ensure the LATERAL correlation
reference is properly satisfied at run time
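Conceptually, a LATERAL derived table is re-evaluated for each row of the tables it references. The Python sketch below models that logical behavior under the assumption that rows are tuples and the derived table is a function of the current outer row; the names are illustrative. It does not show how the server executes the query, which may instead fold the derived table into an ordinary join.

```python
def lateral_join(outer_rows, derived_table):
    """Evaluate derived_table once per outer row and concatenate the results."""
    result = []
    for outer in outer_rows:
        # The correlated reference is satisfied here: the outer row is
        # available before the derived table is evaluated.
        for inner in derived_table(outer):
            result.append(outer + inner)
    return result


# Sample data for: select * from t1, LATERAL (select * from t2
#                  where t1.c1 = t2.c2) as dtab(dc1)
t1 = [(1,), (2,), (3,)]
t2 = [(1,), (3,), (3,)]


def dtab(t1_row):
    """The derived table body, parameterized by the current t1 row."""
    return [(c2,) for (c2,) in t2 if c2 == t1_row[0]]
```

Here `lateral_join(t1, dtab)` produces the rows `(1, 1)`, `(3, 3)` and `(3, 3)`. The ordering constraint is the key point: any table referenced inside the LATERAL derived table must be available before the derived table is evaluated.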
LATERAL derived table in sqexplain
select t1.c1, dc1 from t1, LATERAL (select t2.c2
from t2 where t1.c1 = t2.c2) as dtab(dc1)
Estimated Cost: 4
Estimated # of Rows Returned: 1
1) informix.t1: SEQUENTIAL SCAN
2) informix.t2: SEQUENTIAL SCAN
DYNAMIC HASH JOIN
Dynamic Hash Filters: informix.t1.c1 =
informix.t2.c2
LATERAL derived table in sqexplain
– cont.
select * from t1, t2, t3, t4, LATERAL ( select t5.c5 from t5
where t1.c1 = t5.c5 and t5.c5 < t2.c2 group by 1) as
dtab(vc1) where t4.c4 = t3.c3
1) informix.t3: SEQUENTIAL SCAN
2) informix.t4: SEQUENTIAL SCAN
DYNAMIC HASH JOIN
Dynamic Hash Filters: informix.t4.c4 = informix.t3.c3
3) informix.t2: SEQUENTIAL SCAN
NESTED LOOP JOIN
4) informix.t1: SEQUENTIAL SCAN
NESTED LOOP JOIN
5) (Temp Table For Collection Subquery): SEQUENTIAL SCAN
NESTED LOOP JOIN
Predicate derivation for
ANSI JOIN Query
• Optimizer is able to derive predicates based
on existing predicates
– t1.c1 = t2.c2 and t1.c1 = t3.c3 → t2.c2 = t3.c3
– t1.c1 = t2.c2 and t1.c1 >= 5 → t2.c2 >= 5
• Predicate derivation is now enabled for ANSI
JOIN query as well (among dominant tables)
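The two derivation rules can be sketched in Python by grouping columns into equivalence classes and then propagating predicates within each class. This is a simplified model with hypothetical names: predicates are plain tuples, and the outer-join restriction mentioned above (derivation only among dominant tables) is not modeled.

```python
from itertools import combinations


def derive_predicates(equalities, filters):
    """Derive new predicates from 'col = col' pairs and (col, op, constant) filters.

    Returns (derived_equalities, derived_filters), both sorted lists that
    exclude the predicates that were given as input.
    """
    # Group columns connected by equality into equivalence classes.
    classes = []
    for left, right in equalities:
        merged = {left, right}
        untouched = []
        for cls in classes:
            if cls & merged:
                merged |= cls
            else:
                untouched.append(cls)
        classes = untouched + [merged]

    # Rule 1: any two columns in the same class are equal.
    given_eq = {frozenset(pair) for pair in equalities}
    derived_eq = set()
    for cls in classes:
        for a, b in combinations(sorted(cls), 2):
            if frozenset((a, b)) not in given_eq:
                derived_eq.add((a, b))

    # Rule 2: a constant filter on one column applies to every equal column.
    given_filters = set(filters)
    derived_filters = set()
    for col, op, value in filters:
        for cls in classes:
            if col in cls:
                for other in cls - {col}:
                    if (other, op, value) not in given_filters:
                        derived_filters.add((other, op, value))

    return sorted(derived_eq), sorted(derived_filters)
```

Given the equalities `t1.c1 = t2.c2` and `t1.c1 = t3.c3`, the sketch derives `t2.c2 = t3.c3`; given `t1.c1 = t2.c2` and the filter `t1.c1 >= 5`, it derives `t2.c2 >= 5`. Derived predicates give the optimizer more join orders and index filters to choose from.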
Predicate derivation for ANSI JOIN
in sqexplain
QUERY: (OPTIMIZATION TIMESTAMP: 03-15-2013 12:17:32)
------
select int1, value1, word1, int3, int4, value4
from aoj1 left join (aoj3 left join aoj4 on value3 = value4)
on (value1 = value3 and int1 = int3)
where value3 > 15
Estimated Cost: 6
Estimated # of Rows Returned: 1
1) informix.aoj1: INDEX PATH
(1) Index Name: informix.aoj1_value1
Index Keys: value1 (Serial, fragments: ALL)
Lower Index Filter: informix.aoj1.value1 > 15
2) informix.aoj3: AUTOINDEX PATH
(1) Index Name: (Auto Index)
Index Keys: value3 int3 (Key-Only)
Lower Index Filter: (informix.aoj1.value1 = informix.aoj3.value3
AND informix.aoj1.int1 = informix.aoj3.int3 )
Index Key Filters: (informix.aoj3.value3 > 15 )
3) informix.aoj4: SEQUENTIAL SCAN
ON-Filters:informix.aoj3.value3 = informix.aoj4.value4
DYNAMIC HASH JOIN (LEFT OUTER JOIN)
Dynamic Hash Filters: informix.aoj3.value3 = informix.aoj4.value4
ON-Filters:(informix.aoj1.value1 = informix.aoj3.value3 AND informix.aoj1.int1 = informix.aoj3.int3 )
NESTED LOOP JOIN
Summary
• sqexplain overview
• Set operations
• View folding enhancements
• Subquery flattening after view folding
• ANSI OUTER JOIN to Informix outer join
transformation
• Hash join support for ANSI JOIN queries
• Optimizer costing enhancement for hash join
• Temp table optimization
• LATERAL derived table
• Predicate derivation for ANSI JOIN query
Questions?
Bingjie Miao
bingjie@us.ibm.com