Lost control file, So what?

Recently got involved in couple conversations regarding common “RULE OF THUMB” about backing up control file, and double backup of control file backup etc..

I guess its good thing to do, its does not take much space anyway, but if you lost it or control file got corrupted, and do not have backup, how big deal is it?

If you remember your database character set and have access to filesystem/ASM to find where datafiles and redo logs are – not really big deal.

STARTUP NOMOUNT;

CREATE CONTROLFILE REUSE DATABASE "DBNAME" NORESETLOGS ARCHIVELOG
 MAXLOGFILES 16
 MAXLOGMEMBERS 3
 MAXDATAFILES 1000
 MAXINSTANCES 8
 MAXLOGHISTORY 292
LOGFILE
 GROUP 1 '/path/to/redo/file' SIZE 512M BLOCKSIZE 512, ...
DATAFILE
 '/path/to/data/file',...
CHARACTER SET <DB character set>

RECOVER DATABASE;

ALTER DATABASE OPEN;

No need to even RESETLOGS if redo logs are still available, so – ZERO data loss!

Some rman metadata could be lost, but most of it could be registered back by:

RMAN> CATALOG START WITH '/directory/where/backups/are/'

Same thing for registering back archive logs:

RMAN> CATALOG START WITH '/directory/where/archlogs/are'

Used it couple times, worked as a charm.

Posted in Oracle, SQL | Tagged , , , , , , , | Leave a comment

ORA-01555 on standby

If you see “ORA-01555: snapshot too old” from Select what is been run on Physical standby, but same select runs on primary just fine, and all relevant init.ora parameters and undo tablespace are exactly identical between primary and standby,- its odd.  But its easy to explain.

First check if DB is below 11.2.0.3 DB it could be bug 10320455.

If not – it most probably just undo retention problem.

On Primary instance Increase undo_retention to value (in seconds) higher than time query runs, and it should fix the problem. (You can increase the parameter on standby too, but it really take effect on primary only)

Why with same undo_retention problem does not show up on primary?

There is underscore parameter _undo_autotune. With it been set to TRUE (default in 11g) Oracle would track oldest running query, and will try to keep undo from time of start of that oldest query, even if its older than undo_retention. If undo space affords.  So it basically overwrites undo_retention and can be much higher than undo_retention. The overwritten/actual undo retention can be seen in tuned_undoretention column in v$undostat.

But primary DB have no vision on what queries are running on standby, so even if long running query running on standby – undo autotune does not kick in, and real undo retention on primay (and so standby) equal to what is set in undo_retention parameter.

So setting undo_retention on Primary to higher value would fix the problem. And note it had to be set on primary. For standby DB undo tablespace is Read-Only so it has no control on it.

Setting undo_retention higher should not hurt much. It can grow your tablespace to what autoextend allows, but if it hits the limit and more undo been generated what is really needed by uncommitted transactions, it will purge old undo what is only needed by selects. It is soft limit.

Posted in Oracle, SQL | Tagged , , , , , | 1 Comment

VIP is not reachable after failover

If VIP was failed over from one host in subnet to another via manually running “ifconfig down” on one host and “ifconfig up” on another, you can see case when VIP become unreachable from outside of subnet. the problem should clear. but it can take up to several hours, depending on the platform.
If you have similar symptoms run on machines where VIP failed over to:

[root@linuxtest ~]# arping -U your_vip_ip

It will send ARP response, Gratuitous ARP, aka Unsolicited ARP. What would broadcast new MAC for the VIP and so update ARP caches on machines in same subnet, including router what routes traffic to the subnet from external world.
If not done, ARP caches will be updated but it would take time:
- for some cisco devices – 4 hours,
- for brocade – default is 10 minutes,
- for Linux – default is 60 seconds,
- for Windows – up to 10 minutes.

So cleaning arp cache would help for new connections to avoid waiting for arp timeout.
Also it will help for clients which had existing TCP connection before failover: they will immediately get a reset packet sent. This results in clients getting errors immediately rather than waiting for the TCP timeout (up to 15 minutes, depending on platform).

Btw, Ability to send Unsolicited ARP is the only reason Oracle RAC need VIPs.

Posted in Linux, Systems | Tagged , , , , , | Leave a comment

High CPU usage on DB host?

Its Unix 101 statement, but I have heard it wrong so many time, so decided to put a blog for it.

So:
If in OLTP environment, per ‘top’ %wa (IOWAIT) is major contributor for CPU busy, adding CPU would not help. There is no need for more CPU. !Period!
%wa IS PERCENTAGE WHAT COUNT TO CPU IDLE ******************

Simple test.

Push some IO:

user1@myhost:~$ dd if=/dev/zero of=/tmp/file1 conv=notrunc  bs=1000 count=3000000 &
[1] 31240
user1@myhost:~$ dd if=/dev/zero of=/tmp/file2 conv=notrunc  bs=1000 count=3000000 &
[2] 31241
user1@myhost:~$ dd if=/dev/zero of=/tmp/file3 conv=notrunc  bs=1000 count=3000000 &
[3] 31242
user1@myhost:~$ dd if=/dev/zero of=/tmp/file4 conv=notrunc  bs=1000 count=3000000 &
[4] 31243

top looks:

user1@myhost:~$ top -b -i

top - 23:05:42 up  8:37, 12 users,  load average: 4.36, 3.91, 6.28
Tasks: 239 total,   5 running, 230 sleeping,   0 stopped,   4 zombie
Cpu(s):  3.1%us, 20.5%sy,  0.0%ni, 12.9%id, 63.3%wa,  0.0%hi,  0.3%si,  0.0%st
Mem:   4080460k total,  3809420k used,   271040k free,     1580k buffers
Swap:  4145148k total,   104240k used,  4040908k free,  1824928k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
31240 amoseyev  20   0  4376  588  496 D    2  0.0   0:12.29 dd
31241 amoseyev  20   0  4376  588  500 D    2  0.0   0:12.32 dd
31242 amoseyev  20   0  4376  592  500 D    2  0.0   0:12.38 dd
31243 amoseyev  20   0  4376  592  500 D    1  0.0   0:11.50 dd

%wa is high.

iostat consistently giving write performance about 44MB/sec:

user1@myhost:~$ iostat 1 1000

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.76    0.00   12.09   50.13    0.00   36.02

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda            2111.00      8352.00     45668.00       8352      45668

As shown above, dd process what causes I/O load almost always in “D” state, what is “uninterruptible sleep”.
“uninterruptible” comes from fact what they cant be killed, as process is in kernel mode (IO call has to be done in kernel mode).
Its uninterruptible but still SLEEP. Its idle process. It does not block CPU. If any other thread would require CPU (either for number cranching or for another I/O call), schedule would put it on CPU while dd is in sleep.
But if no other CPU load is available, top counts next idle CPU cycle as %wa.

Now push some real CPU load:

user1@myhost:~$ cat /dev/urandom  > /dev/null &
[1] 31224
user1@myhost:~$ cat /dev/urandom  > /dev/null &
[2] 31225
user1@myhost:~$ cat /dev/urandom  > /dev/null &
[3] 31229
user1@myhost:~$ cat /dev/urandom  > /dev/null &
[4] 31231

user1@myhost:~$
user1@myhost:~$ top -b -i

top - 23:19:16 up  8:50, 12 users,  load average: 7.84, 7.15, 7.10
Tasks: 239 total,   6 running, 229 sleeping,   0 stopped,   4 zombie
Cpu(s):  0.8%us, 98.1%sy,  0.0%ni,  0.0%id,  0.5%wa,  0.0%hi,  0.6%si,  0.0%st
Mem:   4080460k total,  3838860k used,   241600k free,     2168k buffers
Swap:  4145148k total,   104240k used,  4040908k free,  2264144k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
31279 amoseyev  20   0  4220  544  456 R   96  0.0   0:16.22 cat
31281 amoseyev  20   0  4220  544  456 R   87  0.0   0:13.96 cat
31280 amoseyev  20   0  4220  544  456 R   83  0.0   0:15.20 cat
31278 amoseyev  20   0  4220  540  456 R   80  0.0   0:16.01 cat
31241 amoseyev  20   0  4376  588  500 D    2  0.0   0:14.29 dd
31242 amoseyev  20   0  4376  592  500 D    2  0.0   0:14.26 dd
31240 amoseyev  20   0  4376  588  496 D    1  0.0   0:14.17 dd
31243 amoseyev  20   0  4376  592  500 D    1  0.0   0:13.34 dd

%wa went to almost 0. %sy is close to 100%.
So when CPU spends its cycles on real load, it does not addup to %wa. And same time, I/O throughput did not change with CPU been 100% busy:

user1@myhost:~$ iostat sda 1 1000

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.50    0.00   98.50    0.00    0.00    0.00

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda            1415.00      5376.00     44500.00       5376      44500

What again proves what OLTP I/O does not need that much CPU.

The “D” process also add up to “load avarage”, so its its also not the best value to judge how busy CPU is.

All said above would be applicable for for OLTP, A lot of small I/Os. NAS, SAN. RAW/block devices, or file system. All of them.

If we are talking about some crazy 1+GB/sec full table scans in OLAP/dw world, CPU probably would be affected, especially if its NFS (and not direct NFS). But it will be mostly on %sy and %si (not %wa), as ethernet traffic would be handled thru soft interrupts, and with high throughput, its CPU intensive. Context switches may also add up to CPU on some platforms, where they still used for switching between user/kernel modes.

Posted in Linux, Oracle, Systems, Uncategorized | Tagged , , , , , , , , , , , | Leave a comment

Quckest was to find oracle alert log on linux box

Recently I had to work in a lot of new to me oracle environments. And I found a some places do not have standard for ORACLE_BASE, and no correlation between SID and Unique_name.
With that – finding oracle alert log
$ORACLE_BASE/diag/rdbms/{DB_UNIQUE_NAME}/{ORACLE_SID}/trace/alert_{ORACLE_SID}.log
takes more than 30 seconds to find. Its frustrating. At least for lazy people like me.

So here is quicker way to locate alert log:


[amoseyev@linuxtest] /usr/sbin/lsof | grep trc
oracle   14180   oracle    9w    REG    0,20   1064    50693 /u01/oracle/diag/rdbms/orcl2/orcl2/trace/orcl2_ora_14180.trc (deleted) (10.100.100.50:/vol/nfs1)
oracle   21201   oracle   17w    REG    0,20   1877    31141 /u01/oracle/diag/rdbms/orcl2/orcl2/trace/orcl2_diag_21201.trc (deleted) (10.100.100.50:/vol/nfs1)

Its not hard to extrapolate where alert log will be.
With that I am down to 3 seconds in journey of finding alert logs.
Happy seconds/time/life savings :)

Posted in Linux, Oracle, Uncategorized | Tagged , , , , , , , , , | Leave a comment

Missing oracle validated RPM for RHEL6?

I have used oracle-validated RPM a lot on Red Hat 5 (RHEL5) and Oracle Enterprice Linix 5 installations. And found it very handy!

With Oracle announcing support for RHEL6/OEL6, world discovered 2 bad things:

  • Oracle does not support ASMLib on RHEL6.
  • Oracle replaced oracle-validated RPM with oracle-rdbms-server-11gR2-preinstall RPM, and new version of the RPM requires kernel-uek to be installed, which is not available on RHEL. In short – oracle-validated/oracle-rdbms-server-11gR2-preinstall is not supported in RHEL6.

You will be surprised but both of those are supported in OEL6. J

Not sure if that 2 facts suppose to make OEL6 more marketable/attractive. Maybe for DBAs. But in 99% of cases DBAs have little to say in choosing OS platform, and you guess what – most SAs still prefer RHEL, and I see why. At least the ones I worked with recently.

I created short bash script what would at least verify RPMs oracle need:

[oracle@mylinux ~]$ cat rhel6_x86_64_validate.sh

#!/bin/bash

array=(
binutils-2.20.51.0.2-5.11.el6.x86_64 
compat-libcap1-1.10-1.x86_64 
compat-libstdc++-33-3.2.3-69.el6.x86_64 
compat-libstdc++-33-3.2.3-69.el6.i686 
gcc-4.4.4-13.el6.x86_64 
gcc-c++-4.4.4-13.el6.x86_64 
glibc-2.12-1.7.el6.i686 
glibc-2.12-1.7.el6.x86_64 
glibc-devel-2.12-1.7.el6.x86_64 
glibc-devel-2.12-1.7.el6.i686 
ksh.x86_64
libgcc-4.4.4-13.el6.i686 
libgcc-4.4.4-13.el6.x86_64 
libstdc++-4.4.4-13.el6.x86_64 
libstdc++-4.4.4-13.el6.i686 
libstdc++-devel-4.4.4-13.el6.x86_64 
libstdc++-devel-4.4.4-13.el6.i686 
libaio-0.3.107-10.el6.x86_64 
libaio-0.3.107-10.el6.i686 
libaio-devel-0.3.107-10.el6.x86_64 
libaio-devel-0.3.107-10.el6.i686 
make-3.81-19.el6.x86_64 
sysstat-9.0.4-11.el6.x86_64
)

len=${#array[*]}
i=0
line='-------------------------------------------------------------------'
while [[ $i -lt $len ]]; do
req_pkg="^${array[$i]%%.*}"
req_arch="${array[$i]##*.}"
req_arch="${req_arch/6/.}"; req_arch="${req_arch/3/.}$"
curr_rpm=`rpm -qa --qf "%{NAME}-%{VERSION}-%{RELEASE}.%{ARCH}\n" | grep ${req_pkg} | grep ${req_arch}`;
if [ $(expr "${curr_rpm%%.*}" \>= "${req_pkg}") -eq 1 ]; then result1=pass; else result1=failed; fi
printf "%s %s [$result1]\n" ${array[$i]} ${line:${#array[$i]}}
echo $curr_rpm;
let i++;
done

This is for RHEL6 x86_64, but if need it for any other rpm based linux just replace list in “array=()” with list of required RPMs from oracle documentation. should be exactly copy/paste from documentation.

Save this script in any file (rhel6_x86_64_validate.sh in my case). chmod 700. And run without arguments:

[oracle@mylinux ~]$ ./rhel6_x86_64_validate.sh
binutils-2.20.51.0.2-5.11.el6.x86_64 ------------------------------- [pass]
binutils-2.20.51.0.2-5.34.el6.x86_64
compat-libcap1-1.10-1.x86_64 --------------------------------------- [failed]

compat-libstdc++-33-3.2.3-69.el6.x86_64 ---------------------------- [pass]
compat-libstdc++-33-3.2.3-69.el6.x86_64
compat-libstdc++-33-3.2.3-69.el6.i686 ------------------------------ [pass]
compat-libstdc++-33-3.2.3-69.el6.i686
gcc-4.4.4-13.el6.x86_64 -------------------------------------------- [pass]
gcc-4.4.6-4.el6.x86_64
gcc-c++-4.4.4-13.el6.x86_64 ---------------------------------------- [pass]
gcc-c++-4.4.6-4.el6.x86_64
glibc-2.12-1.7.el6.i686 -------------------------------------------- [pass]
glibc-2.12-1.80.el6_3.5.i686
glibc-2.12-1.7.el6.x86_64 ------------------------------------------ [pass]
glibc-2.12-1.80.el6_3.5.x86_64
glibc-devel-2.12-1.7.el6.x86_64 ------------------------------------ [pass]
glibc-devel-2.12-1.80.el6_3.5.x86_64
glibc-devel-2.12-1.7.el6.i686 -------------------------------------- [pass]
glibc-devel-2.12-1.80.el6_3.5.i686
ksh ---------------------------------------------------------------- [pass]
ksh-20100621-16.el6.x86_64
libgcc-4.4.4-13.el6.i686 ------------------------------------------- [pass]
libgcc-4.4.6-4.el6.i686
libgcc-4.4.4-13.el6.x86_64 ----------------------------------------- [pass]
libgcc-4.4.6-4.el6.x86_64
libstdc++-4.4.4-13.el6.x86_64 -------------------------------------- [pass]
libstdc++-4.4.6-4.el6.x86_64
libstdc++-4.4.4-13.el6.i686 ---------------------------------------- [pass]
libstdc++-4.4.6-4.el6.i686
libstdc++-devel-4.4.4-13.el6.x86_64 -------------------------------- [pass]
libstdc++-devel-4.4.6-4.el6.x86_64
libstdc++-devel-4.4.4-13.el6.i686 ---------------------------------- [failed]

libaio-0.3.107-10.el6.x86_64 --------------------------------------- [pass]
libaio-0.3.107-10.el6.x86_64
libaio-0.3.107-10.el6.i686 ----------------------------------------- [pass]
libaio-0.3.107-10.el6.i686
libaio-devel-0.3.107-10.el6.x86_64 --------------------------------- [pass]
libaio-devel-0.3.107-10.el6.x86_64
libaio-devel-0.3.107-10.el6.i686 ----------------------------------- [pass]
libaio-devel-0.3.107-10.el6.i686
make-3.81-19.el6.x86_64 -------------------------------------------- [pass]
make-3.81-20.el6.x86_64
sysstat-9.0.4-11.el6.x86_64 ---------------------------------------- [pass]
sysstat-9.0.4-20.el6.x86_64
[oracle@mylinux ~]$

If package is missing or older version script would spot it out.

This version will also install the RPM for you. (It has to be run as root, and have yum default repo configured).
Remove “.doc” from file name when download script:

rhel6_x86_64_validate_and_install_rpm.sh

Posted in Linux, Uncategorized | Tagged , , , , | Leave a comment

Top-N query

Term of “TOP-N” intuitively makes thinking what they query should run faster than just ORDERED query, as only N elements needed from it.
But, as well known, not always..

This blog describes method how using proper index would optimize TOP-N  query making it benefit from fact what only N rows needed.

This is how common case would look before optimization:

(for reference will call this Example 1)

CREATE TABLE EMPLOYEE
  (
    EMPLOYEE_ID NUMBER,
    BRANCH      VARCHAR2(100),
    DEPARTMENT  VARCHAR2(20),
    HIRE_DATE   DATE,
    SALARY      NUMBER
  );

CREATE INDEX IDX1_EMPLOYEE ON EMPLOYEE (BRANCH,DEPARTMENT);

And table has data distribution like:

SQL> select count(1) from employee where branch='US' and department='IT';

COUNT(1)
----------------------
818008

SQL> select count(1) from employee;

COUNT(1)
----------------------
4745976

TOP-N query is:

SQL> select  * from (
select * from employee a
where department='IT'
  and branch='US'
order by salary
) where rownum<=3;

3 rows selected.

Elapsed: 00:00:03.42

Execution Plan
----------------------------------------------------------
Plan hash value: 2997571357

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name          | Rows  | Bytes |TempSpc| Cost (%CPU)| Time       |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |               |     3 |   297 |       | 13754	 (1)| 00:02:46 |
|*  1 |  COUNT STOPKEY                       |               |       |       |       |              |          |
|   2 |   VIEW                               |               |   818K|    77M|       | 13754	 (1)| 00:02:46 |
|*  3 |    SORT ORDER BY STOPKEY             |               |   818K|    22M|    34M| 13754	 (1)| 00:02:46 |
|   4 |     TABLE ACCESS BY INDEX ROWID      | EMPLOYEE      |   818K|    22M|       |  7240	 (1)| 00:01:27 |
|*  5 |      INDEX RANGE SCAN                | IDX1_EMPLOYEE |   818K|       |       |  2728	 (1)| 00:00:33 |
----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=3)
   3 - filter(ROWNUM<=3)    
   5 - access("BRANCH"='US' AND "DEPARTMENT"='IT')  

Statistics  ----------------------------------------------------------
        0  recursive calls
        0  db block gets
     5910  consistent gets
        0  physical reads 
        0  redo size  
      665  bytes sent via SQL*Net to client 
      364  bytes received via SQL*Net from client  
        2  SQL*Net roundtrips to/from client  
        1  sorts (memory)   
        0  sorts (disk)    
        3  rows processed   

“5910 consistent gets” – That is pretty expensive 3 rows..

“rownum<=3″ helps here: query does not have to finish full sort (“SORT ORDER BY STOPKEY”), only 3 top elements required. And depending on sort algorithm been used it can be more or less help. But complete index range scan needs to be done in order to find top elements (“INDEX RANGE SCAN rows=818K”). Rownum does not help there. And its bigger overheat.

The ideal TOP-N optimization for the query would be: adding “ORDER BY” column to the end of the index:

 
SQL> CREATE INDEX IDX2_EMPLOYEE ON EMPLOYEE (BRANCH,DEPARTMENT,SALARY);

(for reference will call this Example 2)

SQL> select  * from (
select * from employee a
where department='IT'
  and branch='US'
order by salary
) where rownum<=3;

3 rows selected.

Elapsed: 00:00:01.03

Execution Plan
----------------------------------------------------------
Plan hash value: 3266040937

-----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |               |     3 |   297 |     5	(0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                |               |       |       |            |          |
|   2 |   VIEW                        |               |     3 |   297 |     5	(0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| EMPLOYEE      |   818K|    22M|     5	(0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | IDX2_EMPLOYEE |     3 |       |     3	(0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=3)
   4 - access("BRANCH"='US' AND "DEPARTMENT"='IT')

Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	  7  consistent gets
	  2  physical reads
	  0  redo size
	665  bytes sent via SQL*Net to client
	364  bytes received via SQL*Net from client
	  2  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  3  rows processed

This execution have several advantages comparing to previous:

  • In this execution plan “SORT” is not mentioned at all, because index is ordered structure.
  • Oracle just takes takes first 3 elements, relying on fact its already ordered,- it will be top entries. I.e. only doing Bounded Range scan is performed (INDEX RANGE SCAN Rows=3). It is huge saving, what is reflected in cost and gets. Consistent gets dropped to 7.

Pretty simple stuff.
Picture below shows Bounded INDEX RANGE SCAN stopped by rownum STOPKEY in visual form:

I would generalize conditions where index helps for TOP-N query:

  1. Having index what has leading columns as:
    • Zero or more columns mentioned in “WHERE” clause. If column is in WHERE condition but not part of index, that condition need to be have low selectivity.
    • “ORDER BY” column(s).
  2. Between 1st column in the index and “ORDER BY” column it should be no columns what are not mentioned in “WHERE” clause.
  3. Columns what stand before “ORDER BY” column in index, has to have “=” or “IN” condition in WHERE predicate. “IN” need additional handing (explained below).
    It actually can be condition in WHERE clause with “>” “<” “!=” “BETWEEN” (i.e. condition with indefinite set of possible combinations on the right), but columns mentioned in it should not be part of the index.

All that may sound a little too abstract. Here are some examples.

RE: “If column is in WHERE condition but not part of index, that condition need to be have low selectivity”

In the table I have 99% of employees was hired 10 years ago or less.
So here is query with low selectivity on HIRE_DATE. Hire date is not part of index:

(for reference will call this Example 3)

SQL> select  * from (
select * from employee a
where department='IT'
  and branch='US'
  and hire_date>sysdate-365*10
order by salary
) where rownum<=3;

Execution Plan
----------------------------------------------------------
Plan hash value: 3266040937

-----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name	      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |               |     3 |   297 |     5	(0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                |               |       |       |            |          |
|   2 |   VIEW                        |               |     3 |   297 |     5	(0)| 00:00:01 |
|*  3 |    TABLE ACCESS BY INDEX ROWID| EMPLOYEE      |   818K|   22M |     5	(0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | IDX2_EMPLOYEE |     3 |       |     3	(0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=3)    3 - filter("HIRE_DATE">SYSDATE@!-3650)
   4 - access("BRANCH"='US' AND "DEPARTMENT"='IT')

Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	  7  consistent gets
	  0  physical reads
	  0  redo size
	665  bytes sent via SQL*Net to client
	364  bytes received via SQL*Net from client
	  2  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  3  rows processed

We see same effective execution as before: 7 consistent gets. Cost 5.
Oracle here use almost same plan as shown at example 2:
Do bounded RANGE SCAN. And yes, its not 100% guaranteed first 3 rows from index will be the ones query would return, as HIRE_DATE condition still needs to be checked (3 – filter(“HIRE_DATE”>SYSDATE@!-3650)). But 99% of employees were hired less than 10 years do, so 99% chance what HIRE_DATE condition will be positive. And even if not lucky and hit those remaining 1% and one of the records would fail on HIRE_DATE condition, – just get next element from index.

Now, in contrast – query with high selectivity on HIRE_DATE:

SQL> select  * from (
select * from employee a
where department='IT'
  and branch='US'
  and hire_date<sysdate-365*10
order by salary
) where rownum<=3;
Execution Plan
----------------------------------------------------------
Plan hash value: 2997571357

------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |               |     3 |   297 |  7253	 (1)| 00:01:28 |
|*  1 |  COUNT STOPKEY                 |               |       |       |            |	       |
|   2 |   VIEW                         |               |    23 |  2277 |  7253	 (1)| 00:01:28 |
|*  3 |    SORT ORDER BY STOPKEY       |               |    23 |   667 |  7253	 (1)| 00:01:28 |
|*  4 |     TABLE ACCESS BY INDEX ROWID| EMPLOYEE      |    23 |   667 |  7252	 (1)| 00:01:28 |
|*  5 |      INDEX RANGE SCAN          | IDX1_EMPLOYEE |   818K|       |  2728	 (1)| 00:00:33 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=3)
   3 - filter(ROWNUM<=3)
   4 - filter("HIRE_DATE"<SYSDATE@!-3650)
   5 - access("BRANCH"='US' AND "DEPARTMENT"='IT')

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       5910  consistent gets
          0  physical reads
          0  redo size
        489  bytes sent via SQL*Net to client
        353  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          3  rows processed

Plan from example 3 would not work here as “HIRE_DATE<sysdate-3650″ would filter out 99% of rows which are gotten from index. And even the facts rows from index are ordered as needed, would not benefit much, as high count of rowid lookups on table (99% of which will fail on “HIRE_DATE<sysdate-3650″) would kill the deal.
So oracle does not use last column in the index for Sorting, but execute explicit sort “SORT ORDER BY STOPKEY” Index only used for WHERE predicates. Very similar plan to the one in example 1.

RE: “Columns what stand before “ORDER BY” column in index, has to have “=” or “IN” condition in WHERE clause…”

This would be representation of the case in scope of example I follow:

SQL> drop index IDX2_EMPLOYEE;

Index dropped.

Elapsed: 00:00:00.98
SQL> CREATE INDEX IDX3_EMPLOYEE ON EMPLOYEE (BRANCH,DEPARTMENT,hire_date,SALARY);

Index created.

SQL> select  * from (
select * from employee a
where department='IT'
  and branch='US'
  and hire_date>sysdate-100
order by salary
) where rownum<=3

Execution Plan
----------------------------------------------------------
Plan hash value: 2997571357

------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name	       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |               |     3 |   297 |  7254	 (1)| 00:01:28 |
|*  1 |  COUNT STOPKEY                 |               |       |       |            |          |
|   2 |   VIEW                         |               | 12880 |  1245K|  7254	 (1)| 00:01:28 |
|*  3 |    SORT ORDER BY STOPKEY       |               | 12880 |   364K|  7254	 (1)| 00:01:28 |
|*  4 |     TABLE ACCESS BY INDEX ROWID| EMPLOYEE      | 12880 |   364K|  7252	 (1)| 00:01:28 |
|*  5 |      INDEX RANGE SCAN          | IDX1_EMPLOYEE |   818K|       |  2728	 (1)| 00:00:33 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=3)
   3 - filter(ROWNUM<=3)    4 - filter("HIRE_DATE">SYSDATE@!-100)
   5 - access("BRANCH"='US' AND "DEPARTMENT"='IT')

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       5910  consistent gets
          0  physical reads
          0  redo size
        662  bytes sent via SQL*Net to client
        364  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          3  rows processed

The plan again looks very similar to Example 1. Oracle does not use SORT of Salary from index, but doing explicit sort.

Picture below shows why.

test1

If single Bounded INDEX RANGE SCAN is used (showed by arrows), the row marked in Red will be missed.

From what shown on the picture – we can do 2 index range scans, and order union of the outputs.
Yes it would be case with “IN” condition. But remember here is “>” condition – i.e. not 2, but possible infinite combination of values, and so – infinite combination of index range scans. It will be for sure more expensive than just follow plan as in Example one, what Oracle did.

RE: “RE: Between 1st column in the index and “ORDER BY” column it should be no columns what are not mentioned in “WHERE” clause.”

This would be query for this case, where offended column in index is HIRE_DATE:

SQL> CREATE INDEX IDX3_EMPLOYEE ON EMPLOYEE (BRANCH,DEPARTMENT,hire_date,SALARY);

SQL> select  * from (
select * from employee a
where department='IT'
  and branch='US'
order by salary
) where rownum<=3 

and query above is basically equivalent to following one:

 SQL> select  * from (
select * from employee a
where department='IT'
  and branch='US'
  and hire_date > sysdate-1000000 and hire_date < sysdate+1000000
order by salary
) where rownum<=3

And it would be exactly case as described in previous example.

There is a little special case with IN condition, as i mentioned earlier.
I will describe it in next blog.

Posted in SQL | Tagged , , , | Leave a comment