datori

Database administration for fun and profit

Helping the DB2 optimizer, well, optimize: SELECTIVITY clause.

The DB2 (we’re talking DB2 for LUW here) query optimizer usually does a pretty good job at choosing a good execution plan, and normally you would not want to coerce it, with optimization profiles or weird search predicates, to go where it does not want to. However, once in a while one does feel the urge to give it a good kick in the right direction.

Imagine a query like this:

SELECT something FROM a_table WHERE start_time BETWEEN ? AND ?

How is the optimizer supposed to know what would be the selectivity of such a predicate? It can’t, so it will take a guess, which in some cases will be wrong (although you can’t really blame it, can you). Suppose a_table contains some sort of log records for the past three years, start_time is a timestamp, and you’re only interested in a range of few hours at a time. How can you explain to the optimizer that you only expect about a hundred of rows as a result, not a hundred million, and it better choose the index on start_time for access?

Simple, you tell it:

SELECT something FROM a_table WHERE start_time BETWEEN ? AND ? 
SELECTIVITY 0.00001

The SELECTIVITY clause informs the optimizer what percentage of rows you expect the predicate to return from the corresponding table — or join

...FROM table1, table2 WHERE table1.id = table2.id SELECTIVITY 0.005

The value should be between 0 and 1, obviously.

The original purpose was to provide the optimizer with extra information when user-defined functions were referenced in query predicates:

...WHERE column1 = my_func(column2)...

For this reason the query parser will reject the SELECTIVITY clause if it is used with other types of predicates, returning SQLCODE -20046. To enable its use with a wider range of predicates, the manual explains, you will need to set the registry variable DB2_SELECTIVITY to YES and restart the instance.

You do that and… still get SQL20046N when you try using SELECTIVITY with predicates like

...start_time BETWEEN ? AND ?...

or

...column1 > (SELECT something FROM somewhere)...

Luckily, there’s a backdoor: you can set DB2_SELECTIVITY to the undocumented value ALL, which allows the use of SELECTIVITY clause in a much wider range of circumstances. So,

db2set DB2_SELECTIVITY=ALL
db2stop
db2start

and off you go.

It is worth noting that the optimizer is not obliged to heed your selectivity suggestions. If this approach does not work for you, try something else.

File under “before-I-forget”.

Leave a Reply

Your email address will not be published. Required fields are marked *