High CPU usage on DB host?

Its Unix 101 statement, but I have heard it wrong so many time, so decided to put a blog for it.

So:
If in OLTP environment, per ‘top’ %wa (IOWAIT) is major contributor for CPU busy, adding CPU would not help. There is no need for more CPU. !Period!
%wa IS PERCENTAGE WHAT COUNT TO CPU IDLE ******************

Simple test.

Push some IO:

user1@myhost:~$ dd if=/dev/zero of=/tmp/file1 conv=notrunc  bs=1000 count=3000000 &
[1] 31240
user1@myhost:~$ dd if=/dev/zero of=/tmp/file2 conv=notrunc  bs=1000 count=3000000 &
[2] 31241
user1@myhost:~$ dd if=/dev/zero of=/tmp/file3 conv=notrunc  bs=1000 count=3000000 &
[3] 31242
user1@myhost:~$ dd if=/dev/zero of=/tmp/file4 conv=notrunc  bs=1000 count=3000000 &
[4] 31243

top looks:

user1@myhost:~$ top -b -i

top - 23:05:42 up  8:37, 12 users,  load average: 4.36, 3.91, 6.28
Tasks: 239 total,   5 running, 230 sleeping,   0 stopped,   4 zombie
Cpu(s):  3.1%us, 20.5%sy,  0.0%ni, 12.9%id, 63.3%wa,  0.0%hi,  0.3%si,  0.0%st
Mem:   4080460k total,  3809420k used,   271040k free,     1580k buffers
Swap:  4145148k total,   104240k used,  4040908k free,  1824928k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
31240 amoseyev  20   0  4376  588  496 D    2  0.0   0:12.29 dd
31241 amoseyev  20   0  4376  588  500 D    2  0.0   0:12.32 dd
31242 amoseyev  20   0  4376  592  500 D    2  0.0   0:12.38 dd
31243 amoseyev  20   0  4376  592  500 D    1  0.0   0:11.50 dd

%wa is high.

iostat consistently giving write performance about 44MB/sec:

user1@myhost:~$ iostat 1 1000

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.76    0.00   12.09   50.13    0.00   36.02

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda            2111.00      8352.00     45668.00       8352      45668

As shown above, dd process what causes I/O load almost always in “D” state, what is “uninterruptible sleep”.
“uninterruptible” comes from fact what they cant be killed, as process is in kernel mode (IO call has to be done in kernel mode).
Its uninterruptible but still SLEEP. Its idle process. It does not block CPU. If any other thread would require CPU (either for number cranching or for another I/O call), schedule would put it on CPU while dd is in sleep.
But if no other CPU load is available, top counts next idle CPU cycle as %wa.

Now push some real CPU load:

user1@myhost:~$ cat /dev/urandom  > /dev/null &
[1] 31224
user1@myhost:~$ cat /dev/urandom  > /dev/null &
[2] 31225
user1@myhost:~$ cat /dev/urandom  > /dev/null &
[3] 31229
user1@myhost:~$ cat /dev/urandom  > /dev/null &
[4] 31231

user1@myhost:~$
user1@myhost:~$ top -b -i

top - 23:19:16 up  8:50, 12 users,  load average: 7.84, 7.15, 7.10
Tasks: 239 total,   6 running, 229 sleeping,   0 stopped,   4 zombie
Cpu(s):  0.8%us, 98.1%sy,  0.0%ni,  0.0%id,  0.5%wa,  0.0%hi,  0.6%si,  0.0%st
Mem:   4080460k total,  3838860k used,   241600k free,     2168k buffers
Swap:  4145148k total,   104240k used,  4040908k free,  2264144k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
31279 amoseyev  20   0  4220  544  456 R   96  0.0   0:16.22 cat
31281 amoseyev  20   0  4220  544  456 R   87  0.0   0:13.96 cat
31280 amoseyev  20   0  4220  544  456 R   83  0.0   0:15.20 cat
31278 amoseyev  20   0  4220  540  456 R   80  0.0   0:16.01 cat
31241 amoseyev  20   0  4376  588  500 D    2  0.0   0:14.29 dd
31242 amoseyev  20   0  4376  592  500 D    2  0.0   0:14.26 dd
31240 amoseyev  20   0  4376  588  496 D    1  0.0   0:14.17 dd
31243 amoseyev  20   0  4376  592  500 D    1  0.0   0:13.34 dd

%wa went to almost 0. %sy is close to 100%.
So when CPU spends its cycles on real load, it does not addup to %wa. And same time, I/O throughput did not change with CPU been 100% busy:

user1@myhost:~$ iostat sda 1 1000

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           1.50    0.00   98.50    0.00    0.00    0.00

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda            1415.00      5376.00     44500.00       5376      44500

What again proves what OLTP I/O does not need that much CPU.

The “D” process also add up to “load avarage”, so its its also not the best value to judge how busy CPU is.

All said above would be applicable for for OLTP, A lot of small I/Os. NAS, SAN. RAW/block devices, or file system. All of them.

If we are talking about some crazy 1+GB/sec full table scans in OLAP/dw world, CPU probably would be affected, especially if its NFS (and not direct NFS). But it will be mostly on %sy and %si (not %wa), as ethernet traffic would be handled thru soft interrupts, and with high throughput, its CPU intensive. Context switches may also add up to CPU on some platforms, where they still used for switching between user/kernel modes.

Check us out on http://avmconsulting.net

Enter your email address to follow this blog and receive notifications of new posts by email.

Posted in Linux, Oracle, Systems, Uncategorized | Tagged , , , , , , , , , , , | Leave a comment

Quckest was to find oracle alert log on linux box

Recently I had to work in a lot of new to me oracle environments. And I found a some places do not have standard for ORACLE_BASE, and no correlation between SID and Unique_name.
With that – finding oracle alert log
$ORACLE_BASE/diag/rdbms/{DB_UNIQUE_NAME}/{ORACLE_SID}/trace/alert_{ORACLE_SID}.log
takes more than 30 seconds to find. Its frustrating. At least for lazy people like me.

So here is quicker way to locate alert log:


[amoseyev@linuxtest] /usr/sbin/lsof | grep trc
oracle   14180   oracle    9w    REG    0,20   1064    50693 /u01/oracle/diag/rdbms/orcl2/orcl2/trace/orcl2_ora_14180.trc (deleted) (10.100.100.50:/vol/nfs1)
oracle   21201   oracle   17w    REG    0,20   1877    31141 /u01/oracle/diag/rdbms/orcl2/orcl2/trace/orcl2_diag_21201.trc (deleted) (10.100.100.50:/vol/nfs1)

Its not hard to extrapolate where alert log will be.
With that I am down to 3 seconds in journey of finding alert logs.
Happy seconds/time/life savings 🙂

Check us out on http://avmconsulting.net

Enter your email address to follow this blog and receive notifications of new posts by email.

Posted in Linux, Oracle, Uncategorized | Tagged , , , , , , , , , | Leave a comment

Missing oracle validated RPM for RHEL6?

I have used oracle-validated RPM a lot on Red Hat 5 (RHEL5) and Oracle Enterprice Linix 5 installations. And found it very handy!

With Oracle announcing support for RHEL6/OEL6, world discovered 2 bad things:

  • Oracle does not support ASMLib on RHEL6.
  • Oracle replaced oracle-validated RPM with oracle-rdbms-server-11gR2-preinstall RPM, and new version of the RPM requires kernel-uek to be installed, which is not available on RHEL. In short – oracle-validated/oracle-rdbms-server-11gR2-preinstall is not supported in RHEL6.

You will be surprised but both of those are supported in OEL6. J

Not sure if that 2 facts suppose to make OEL6 more marketable/attractive. Maybe for DBAs. But in 99% of cases DBAs have little to say in choosing OS platform, and you guess what – most SAs still prefer RHEL, and I see why. At least the ones I worked with recently.

I created short bash script what would at least verify RPMs oracle need:

[oracle@mylinux ~]$ cat rhel6_x86_64_validate.sh

#!/bin/bash

array=(
binutils-2.20.51.0.2-5.11.el6.x86_64
compat-libcap1-1.10-1.x86_64
compat-libstdc++-33-3.2.3-69.el6.x86_64
compat-libstdc++-33-3.2.3-69.el6.i686
gcc-4.4.4-13.el6.x86_64
gcc-c++-4.4.4-13.el6.x86_64
glibc-2.12-1.7.el6.i686
glibc-2.12-1.7.el6.x86_64
glibc-devel-2.12-1.7.el6.x86_64
glibc-devel-2.12-1.7.el6.i686
ksh.x86_64
libgcc-4.4.4-13.el6.i686
libgcc-4.4.4-13.el6.x86_64
libstdc++-4.4.4-13.el6.x86_64
libstdc++-4.4.4-13.el6.i686
libstdc++-devel-4.4.4-13.el6.x86_64
libstdc++-devel-4.4.4-13.el6.i686
libaio-0.3.107-10.el6.x86_64
libaio-0.3.107-10.el6.i686
libaio-devel-0.3.107-10.el6.x86_64
libaio-devel-0.3.107-10.el6.i686
make-3.81-19.el6.x86_64
sysstat-9.0.4-11.el6.x86_64
)

len=${#array[*]}
i=0
line='-------------------------------------------------------------------'
while [[ $i -lt $len ]]; do
req_pkg="^${array[$i]%%.*}"
req_arch="${array[$i]##*.}"
req_arch="${req_arch/6/.}"; req_arch="${req_arch/3/.}$"
curr_rpm=`rpm -qa --qf "%{NAME}-%{VERSION}-%{RELEASE}.%{ARCH}\n" | grep ${req_pkg} | grep ${req_arch}`;
if [ $(expr "${curr_rpm%%.*}" \>= "${req_pkg}") -eq 1 ]; then result1=pass; else result1=failed; fi
printf "%s %s [$result1]\n" ${array[$i]} ${line:${#array[$i]}}
echo $curr_rpm;
let i++;
done

This is for RHEL6 x86_64, but if need it for any other rpm based linux just replace list in “array=()” with list of required RPMs from oracle documentation. should be exactly copy/paste from documentation.

Save this script in any file (rhel6_x86_64_validate.sh in my case). chmod 700. And run without arguments:

[oracle@mylinux ~]$ ./rhel6_x86_64_validate.sh
binutils-2.20.51.0.2-5.11.el6.x86_64 ------------------------------- [pass]
binutils-2.20.51.0.2-5.34.el6.x86_64
compat-libcap1-1.10-1.x86_64 --------------------------------------- [failed]

compat-libstdc++-33-3.2.3-69.el6.x86_64 ---------------------------- [pass]
compat-libstdc++-33-3.2.3-69.el6.x86_64
compat-libstdc++-33-3.2.3-69.el6.i686 ------------------------------ [pass]
compat-libstdc++-33-3.2.3-69.el6.i686
gcc-4.4.4-13.el6.x86_64 -------------------------------------------- [pass]
gcc-4.4.6-4.el6.x86_64
gcc-c++-4.4.4-13.el6.x86_64 ---------------------------------------- [pass]
gcc-c++-4.4.6-4.el6.x86_64
glibc-2.12-1.7.el6.i686 -------------------------------------------- [pass]
glibc-2.12-1.80.el6_3.5.i686
glibc-2.12-1.7.el6.x86_64 ------------------------------------------ [pass]
...
[oracle@mylinux ~]$

If package is missing or older version script would spot it out.

This version will also install the RPM for you. (It has to be run as root, and have yum default repo configured).
Remove “.doc” from file name when download script:

rhel6_x86_64_validate_and_install_rpm.sh

 

Also Note from: http://docs.oracle.com/cd/E11882_01/install.112/e24325/toc.htm#BHCGAEJA

Note:
Starting with Oracle Database 11g Release 2 (11.2.0.2), all 32-bit packages, except for gcc-32bit-4.3, listed in this section are no longer required for installing a database on Linux x86-64. Only the 64-bit packages are required. However, for any Oracle Database 11g release before 11.2.0.2, both the 32-bit and 64-bit packages listed in this section are required.
However, when you install the 32-bit client binaries on 64-bit ports, the installer checks for the existence of 32-bit packages.

 

Here is also one nice blog on how to quickly rebuild oracle-rdbms-server-11gR2-preinstall to remove UEK dependency: http://flashdba.com/install-cookbooks/using-oracle-preinstall-rpm-with-red-hat-6/

 
Check us out on http://avmconsulting.net

Enter your email address to follow this blog and receive notifications of new posts by email.

Posted in Linux, Uncategorized | Tagged , , , , | Leave a comment

Top-N query

Term of “TOP-N” intuitively makes thinking what they query should run faster than just ORDERED query, as only N elements needed from it.
But, as well known, not always..

This blog describes method how using proper index would optimize TOP-N  query making it benefit from fact what only N rows needed.

This is how common case would look before optimization:

(for reference will call this Example 1)

CREATE TABLE EMPLOYEE
  (
    EMPLOYEE_ID NUMBER,
    BRANCH      VARCHAR2(100),
    DEPARTMENT  VARCHAR2(20),
    HIRE_DATE   DATE,
    SALARY      NUMBER
  );

CREATE INDEX IDX1_EMPLOYEE ON EMPLOYEE (BRANCH,DEPARTMENT);

And table has data distribution like:

SQL> select count(1) from employee where branch='US' and department='IT';

COUNT(1)
----------------------
818008

SQL> select count(1) from employee;

COUNT(1)
----------------------
4745976

TOP-N query is:

SQL> select  * from (
select * from employee a
where department='IT'
  and branch='US'
order by salary
) where rownum<=3;

3 rows selected.

Elapsed: 00:00:03.42

Execution Plan
----------------------------------------------------------
Plan hash value: 2997571357

----------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name          | Rows  | Bytes |TempSpc| Cost (%CPU)| Time       |
----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |               |     3 |   297 |       | 13754	 (1)| 00:02:46 |
|*  1 |  COUNT STOPKEY                       |               |       |       |       |              |          |
|   2 |   VIEW                               |               |   818K|    77M|       | 13754	 (1)| 00:02:46 |
|*  3 |    SORT ORDER BY STOPKEY             |               |   818K|    22M|    34M| 13754	 (1)| 00:02:46 |
|   4 |     TABLE ACCESS BY INDEX ROWID      | EMPLOYEE      |   818K|    22M|       |  7240	 (1)| 00:01:27 |
|*  5 |      INDEX RANGE SCAN                | IDX1_EMPLOYEE |   818K|       |       |  2728	 (1)| 00:00:33 |
----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=3)
   3 - filter(ROWNUM<=3)    
   5 - access("BRANCH"='US' AND "DEPARTMENT"='IT')  

Statistics  ----------------------------------------------------------
        0  recursive calls
        0  db block gets
     5910  consistent gets
        0  physical reads 
        0  redo size  
      665  bytes sent via SQL*Net to client 
      364  bytes received via SQL*Net from client  
        2  SQL*Net roundtrips to/from client  
        1  sorts (memory)   
        0  sorts (disk)    
        3  rows processed   

“5910 consistent gets” – That is pretty expensive 3 rows..

“rownum<=3" helps here: query does not have to finish full sort ("SORT ORDER BY STOPKEY"), only 3 top elements required. And depending on sort algorithm been used it can be more or less help. But complete index range scan needs to be done in order to find top elements ("INDEX RANGE SCAN rows=818K"). Rownum does not help there. And its bigger overheat.

The ideal TOP-N optimization for the query would be: adding "ORDER BY" column to the end of the index:

SQL> CREATE INDEX IDX2_EMPLOYEE ON EMPLOYEE (BRANCH,DEPARTMENT,SALARY);

(for reference will call this Example 2)

SQL> select * from (
select * from employee a
where department='IT'
and branch='US'
order by salary
) where rownum<=3;

3 rows selected.

Elapsed: 00:00:01.03

Execution Plan
———————————————————-
Plan hash value: 3266040937

———————————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
———————————————————————————————–
| 0 | SELECT STATEMENT | | 3 | 297 | 5 (0)| 00:00:01 |
|* 1 | COUNT STOPKEY | | | | | |
| 2 | VIEW | | 3 | 297 | 5 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID| EMPLOYEE | 818K| 22M| 5 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | IDX2_EMPLOYEE | 3 | | 3 (0)| 00:00:01 |
———————————————————————————————–

Predicate Information (identified by operation id):
—————————————————

1 – filter(ROWNUM<=3)
4 – access("BRANCH"='US' AND "DEPARTMENT"='IT')

Statistics
———————————————————-
1 recursive calls
0 db block gets
7 consistent gets
2 physical reads
0 redo size
665 bytes sent via SQL*Net to client
364 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed

This execution have several advantages comparing to previous:

  • In this execution plan “SORT” is not mentioned at all, because index is ordered structure.
  • Oracle just takes takes first 3 elements, relying on fact its already ordered,- it will be top entries. I.e. only doing Bounded Range scan is performed (INDEX RANGE SCAN Rows=3). It is huge saving, what is reflected in cost and gets. Consistent gets dropped to 7.

Pretty simple stuff.
Picture below shows Bounded INDEX RANGE SCAN stopped by rownum STOPKEY in visual form:

I would generalize conditions where index helps for TOP-N query:

  1. Having index what has leading columns as:
    • Zero or more columns mentioned in “WHERE” clause. If column is in WHERE condition but not part of index, that condition need to be have low selectivity.
    • “ORDER BY” column(s).
  2. Between 1st column in the index and “ORDER BY” column it should be no columns what are not mentioned in “WHERE” clause.
  3. Columns what stand before “ORDER BY” column in index, has to have “=” or “IN” condition in WHERE predicate. “IN” need additional handing (explained below).
    It actually can be condition in WHERE clause with “>” “<" "!=" "BETWEEN" (i.e. condition with indefinite set of possible combinations on the right), but columns mentioned in it should not be part of the index.

All that may sound a little too abstract. Here are some examples.

RE: “If column is in WHERE condition but not part of index, that condition need to be have low selectivity”

In the table I have 99% of employees was hired 10 years ago or less.
So here is query with low selectivity on HIRE_DATE. Hire date is not part of index:

(for reference will call this Example 3)

SQL> select  * from (
select * from employee a
where department='IT'
  and branch='US'
  and hire_date>sysdate-365*10
order by salary
) where rownum<=3;

Execution Plan
----------------------------------------------------------
Plan hash value: 3266040937

-----------------------------------------------------------------------------------------------
| Id  | Operation                     | Name	      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |               |     3 |   297 |     5	(0)| 00:00:01 |
|*  1 |  COUNT STOPKEY                |               |       |       |            |          |
|   2 |   VIEW                        |               |     3 |   297 |     5	(0)| 00:00:01 |
|*  3 |    TABLE ACCESS BY INDEX ROWID| EMPLOYEE      |   818K|   22M |     5	(0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | IDX2_EMPLOYEE |     3 |       |     3	(0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=3)    3 - filter("HIRE_DATE">SYSDATE@!-3650)
   4 - access("BRANCH"='US' AND "DEPARTMENT"='IT')

Statistics
----------------------------------------------------------
	  1  recursive calls
	  0  db block gets
	  7  consistent gets
	  0  physical reads
	  0  redo size
	665  bytes sent via SQL*Net to client
	364  bytes received via SQL*Net from client
	  2  SQL*Net roundtrips to/from client
	  0  sorts (memory)
	  0  sorts (disk)
	  3  rows processed

We see same effective execution as before: 7 consistent gets. Cost 5.
Oracle here use almost same plan as shown at example 2:
Do bounded RANGE SCAN. And yes, its not 100% guaranteed first 3 rows from index will be the ones query would return, as HIRE_DATE condition still needs to be checked (3 – filter(“HIRE_DATE”>SYSDATE@!-3650)). But 99% of employees were hired less than 10 years do, so 99% chance what HIRE_DATE condition will be positive. And even if not lucky and hit those remaining 1% and one of the records would fail on HIRE_DATE condition, – just get next element from index.

Now, in contrast – query with high selectivity on HIRE_DATE:

SQL> select  * from (
select * from employee a
where department='IT'
  and branch='US'
  and hire_date<sysdate-365*10
order by salary
) where rownum<=3;
Execution Plan
----------------------------------------------------------
Plan hash value: 2997571357

------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |               |     3 |   297 |  7253	 (1)| 00:01:28 |
|*  1 |  COUNT STOPKEY                 |               |       |       |            |	       |
|   2 |   VIEW                         |               |    23 |  2277 |  7253	 (1)| 00:01:28 |
|*  3 |    SORT ORDER BY STOPKEY       |               |    23 |   667 |  7253	 (1)| 00:01:28 |
|*  4 |     TABLE ACCESS BY INDEX ROWID| EMPLOYEE      |    23 |   667 |  7252	 (1)| 00:01:28 |
|*  5 |      INDEX RANGE SCAN          | IDX1_EMPLOYEE |   818K|       |  2728	 (1)| 00:00:33 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=3)
   3 - filter(ROWNUM<=3)
   4 - filter("HIRE_DATE"<SYSDATE@!-3650)
   5 - access("BRANCH"='US' AND "DEPARTMENT"='IT')

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       5910  consistent gets
          0  physical reads
          0  redo size
        489  bytes sent via SQL*Net to client
        353  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          3  rows processed

Plan from example 3 would not work here as “HIRE_DATE<sysdate-3650" would filter out 99% of rows which are gotten from index. And even the facts rows from index are ordered as needed, would not benefit much, as high count of rowid lookups on table (99% of which will fail on "HIRE_DATE<sysdate-3650") would kill the deal.
So oracle does not use last column in the index for Sorting, but execute explicit sort "SORT ORDER BY STOPKEY" Index only used for WHERE predicates. Very similar plan to the one in example 1.

RE: “Columns what stand before “ORDER BY” column in index, has to have “=” or “IN” condition in WHERE clause…”

This would be representation of the case in scope of example I follow:

SQL> drop index IDX2_EMPLOYEE;

Index dropped.

Elapsed: 00:00:00.98
SQL> CREATE INDEX IDX3_EMPLOYEE ON EMPLOYEE (BRANCH,DEPARTMENT,hire_date,SALARY);

Index created.

SQL> select  * from (
select * from employee a
where department='IT'
  and branch='US'
  and hire_date>sysdate-100
order by salary
) where rownum<=3

Execution Plan
----------------------------------------------------------
Plan hash value: 2997571357

------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name	       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |               |     3 |   297 |  7254	 (1)| 00:01:28 |
|*  1 |  COUNT STOPKEY                 |               |       |       |            |          |
|   2 |   VIEW                         |               | 12880 |  1245K|  7254	 (1)| 00:01:28 |
|*  3 |    SORT ORDER BY STOPKEY       |               | 12880 |   364K|  7254	 (1)| 00:01:28 |
|*  4 |     TABLE ACCESS BY INDEX ROWID| EMPLOYEE      | 12880 |   364K|  7252	 (1)| 00:01:28 |
|*  5 |      INDEX RANGE SCAN          | IDX1_EMPLOYEE |   818K|       |  2728	 (1)| 00:00:33 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(ROWNUM<=3)
   3 - filter(ROWNUM<=3)    4 - filter("HIRE_DATE">SYSDATE@!-100)
   5 - access("BRANCH"='US' AND "DEPARTMENT"='IT')

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       5910  consistent gets
          0  physical reads
          0  redo size
        662  bytes sent via SQL*Net to client
        364  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          3  rows processed

The plan again looks very similar to Example 1. Oracle does not use SORT of Salary from index, but doing explicit sort.

Picture below shows why.

test1

If single Bounded INDEX RANGE SCAN is used (showed by arrows), the row marked in Red will be missed.

From what shown on the picture – we can do 2 index range scans, and order union of the outputs.
Yes it would be case with “IN” condition. But remember here is “>” condition – i.e. not 2, but possible infinite combination of values, and so – infinite combination of index range scans. It will be for sure more expensive than just follow plan as in Example one, what Oracle did.

RE: “RE: Between 1st column in the index and “ORDER BY” column it should be no columns what are not mentioned in “WHERE” clause.”

This would be query for this case, where offended column in index is HIRE_DATE:

SQL> CREATE INDEX IDX3_EMPLOYEE ON EMPLOYEE (BRANCH,DEPARTMENT,hire_date,SALARY);

SQL> select  * from (
select * from employee a
where department='IT'
  and branch='US'
order by salary
) where rownum<=3 

and query above is basically equivalent to following one:

 SQL> select  * from (
select * from employee a
where department='IT'
  and branch='US'
  and hire_date > sysdate-1000000 and hire_date < sysdate+1000000
order by salary
) where rownum<=3

And it would be exactly case as described in previous example.

There is a little special case with IN condition, as i mentioned earlier.
I will describe it in next blog.

Check us out on http://avmconsulting.net

Enter your email address to follow this blog and receive notifications of new posts by email.

Posted in SQL | Tagged , , , | Leave a comment

Oracle SQL Developer Hangs on Linux

If you use SQL Developer 3.x on Linux (tested on Ubuntu, RHEL, but applicable to other distributions, and may happens on Mac) and you do not remember time when you closed it gracefully, but xkill it all the time because it just HAAAAAAAAAAAANGs, keep reading at least next 20 lines, it may be very useful.

Usual scenario is: after some idle time, or after you disconnect from vpn/network, you try to do something (at least close window, or open new session, or really – anything) SQL developer start handing for about 15-18 minutes. And it may cascade to whole gnome session been hanging, so you cant even Ctrl+Alt+F1 to kill sqldeveloper..

1.
To fix hanging after idle time, and also avoid most “connection closed” errors:

Download libkeepalive-0.2.tar.gz from http://libkeepalive.sourceforge.net

$ tar xvf libkeepalive-0.2.tar.gz
$ cd libkeepalive-0.2
$ make
$ sudo cp libkeepalive.so /lib/
$ which sqldeveloper
/usr/local/bin/sqldeveloper
Edit the file above and replace line like:
/opt/sqldeveloper/sqldeveloper.sh
to:
LD_PRELOAD=/lib/libkeepalive.so KEEPCNT=2 KEEPIDLE=30 KEEPINTVL=30 /opt/sqldeveloper/sqldeveloper.sh

2.
To fix hanging after disconnect:

# echo "net.ipv4.tcp_retries2=5" >> /etc/sysctl.conf
# sysctl -p

Have tcpv6 disabled:

# echo "net.ipv6.conf.all.disable_ipv6 = 1" >> /etc/sysctl.conf
# echo "net.ipv6.conf.default.disable_ipv6 = 1" >> /etc/sysctl.conf
# echo "net.ipv6.conf.lo.disable_ipv6 = 1" >> /etc/sysctl.conf
# sysctl -p

If it helped, – this is it, you can ignore what is written below.

But if interested why it hangs – keep reading.

Just before it hangs enable basic debugging on machine where SQL Developer is running:

In first window:

$ ps -ef | grep sqldeve
amoseyev  4637     1  0 Jun22 ?        00:00:00 /bin/sh /usr/local/bin/sqldeveloper
amoseyev  4639  4637  0 Jun22 ?        00:00:00 /bin/bash /opt/sqldeveloper/sqldeveloper.sh
amoseyev  4642  4639  0 Jun22 ?        00:00:00 bash sqldeveloper
amoseyev  4699  4642  1 Jun22 ?        00:21:00 /usr/lib/jvm/java-6-sun/bin/java -Xmx640M -Xms128M -Xverify:none -Doracle.ide.util.AddinPolicyUtils.OVERRIDE_FLAG=true -Dsun.java2d.ddoffscreen=false -Dwindows.shell.font.languages= -XX:MaxPermSize=128M -Doracle.jdbc.mapDateToTimestamp=false -Dide.AssertTracingDisabled=true -Doracle.ide.util.AddinPolicyUtils.OVERRIDE_FLAG=true -Djava.util.logging.config.file=logging.conf -Dsqldev.debug=false -Dide.conf="/opt/sqldeveloper/sqldeveloper/bin/sqldeveloper.conf" -Dide.startingcwd="/opt/sqldeveloper/sqldeveloper/bin" -classpath ../../ide/lib/ide-boot.jar oracle.ide.boot.Launcher
amoseyev 12609 12170  0 01:23 pts/3    00:00:00 grep --color=auto sqldeve

# strace -e trace=network -f -p 4699

In second window:

$ netstat -n | grep 1521
tcp        0      0 10.4.4.51:49261         172.18.99.69:1521       ESTABLISHED

# tcpdump -lnni tun0 host 172.18.99.69 and dst port 1521 -vvv

Make sqldeveloper hang, and after 15-18 minutes (after it will finish handing):

In first window:

# strace -e trace=network -f -p 4699
...
[pid 32467] 17:27:08  0x823a0770, 8192, 0) = -1 ETIMEDOUT (Connection timed out)
[pid 32467] 17:27:08 send(232, "\0\n\0\0\6\0\0\0\0@", 10, 0) = -1 EPIPE (Broken pipe)

In second window:

# tcpdump -lnni tun0 host 172.18.99.69 and dst port 1521 -vvv

17:14:51.924676 IP (tos 0x0, ttl 64, id 28812, offset 0, flags [DF], proto TCP (6), length 151)
    10.181.70.57.39292 > 10.168.66.198.1521: Flags [P.], cksum 0x04f0 (correct), seq 7433:7532, ack 10119, win 1005, options [nop,nop,TS val 39292289 ecr 3105462411], length 99
17:14:52.451556 IP (tos 0x0, ttl 64, id 28813, offset 0, flags [DF], proto TCP (6), length 151)
    10.181.70.57.39292 > 10.168.66.198.1521: Flags [P.], cksum 0x046c (correct), seq 7433:7532, ack 10119, win 1005, options [nop,nop,TS val 39292421 ecr 3105462411], length 99
17:14:53.511511 IP (tos 0x0, ttl 64, id 28814, offset 0, flags [DF], proto TCP (6), length 151)
    10.181.70.57.39292 > 10.168.66.198.1521: Flags [P.], cksum 0x0363 (correct), seq 7433:7532, ack 10119, win 1005, options [nop,nop,TS val 39292686 ecr 3105462411], length 99
17:14:55.631544 IP (tos 0x0, ttl 64, id 28815, offset 0, flags [DF], proto TCP (6), length 151)
    10.181.70.57.39292 > 10.168.66.198.1521: Flags [P.], cksum 0x0151 (correct), seq 7433:7532, ack 10119, win 1005, options [nop,nop,TS val 39293216 ecr 3105462411], length 99
17:14:59.871499 IP (tos 0x0, ttl 64, id 28816, offset 0, flags [DF], proto TCP (6), length 151)
    10.181.70.57.39292 > 10.168.66.198.1521: Flags [P.], cksum 0xfd2c (correct), seq 7433:7532, ack 10119, win 1005, options [nop,nop,TS val 39294276 ecr 3105462411], length 99
17:15:08.335490 IP (tos 0x0, ttl 64, id 28817, offset 0, flags [DF], proto TCP (6), length 151)
    10.181.70.57.39292 > 10.168.66.198.1521: Flags [P.], cksum 0xf4e8 (correct), seq 7433:7532, ack 10119, win 1005, options [nop,nop,TS val 39296392 ecr 3105462411], length 99
17:15:25.263514 IP (tos 0x0, ttl 64, id 28818, offset 0, flags [DF], proto TCP (6), length 151)
    10.181.70.57.39292 > 10.168.66.198.1521: Flags [P.], cksum 0xe460 (correct), seq 7433:7532, ack 10119, win 1005, options [nop,nop,TS val 39300624 ecr 3105462411], length 99
17:15:59.119517 IP (tos 0x0, ttl 64, id 28819, offset 0, flags [DF], proto TCP (6), length 151)
    10.181.70.57.39292 > 10.168.66.198.1521: Flags [P.], cksum 0xc350 (correct), seq 7433:7532, ack 10119, win 1005, options [nop,nop,TS val 39309088 ecr 3105462411], length 99
17:17:06.831496 IP (tos 0x0, ttl 64, id 28820, offset 0, flags [DF], proto TCP (6), length 151)
    10.181.70.57.39292 > 10.168.66.198.1521: Flags [P.], cksum 0x8130 (correct), seq 7433:7532, ack 10119, win 1005, options [nop,nop,TS val 39326016 ecr 3105462411], length 99
17:19:06.895498 IP (tos 0x0, ttl 64, id 28821, offset 0, flags [DF], proto TCP (6), length 151)
    10.181.70.57.39292 > 10.168.66.198.1521: Flags [P.], cksum 0x0bf0 (correct), seq 7433:7532, ack 10119, win 1005, options [nop,nop,TS val 39356032 ecr 3105462411], length 99
17:21:07.215506 IP (tos 0x0, ttl 64, id 28822, offset 0, flags [DF], proto TCP (6), length 151)
    10.181.70.57.39292 > 10.168.66.198.1521: Flags [P.], cksum 0x966f (correct), seq 7433:7532, ack 10119, win 1005, options [nop,nop,TS val 39386112 ecr 3105462411], length 99
17:23:07.535497 IP (tos 0x0, ttl 64, id 28823, offset 0, flags [DF], proto TCP (6), length 151)
    10.181.70.57.39292 > 10.168.66.198.1521: Flags [P.], cksum 0x20ef (correct), seq 7433:7532, ack 10119, win 1005, options [nop,nop,TS val 39416192 ecr 3105462411], length 99
17:25:07.855511 IP (tos 0x0, ttl 64, id 28824, offset 0, flags [DF], proto TCP (6), length 151)
    10.181.70.57.39292 > 10.168.66.198.1521: Flags [P.], cksum 0xab6e (correct), seq 7433:7532, ack 10119, win 1005, options [nop,nop,TS val 39446272 ecr 3105462411], length 99
17:25:07.855511 IP (tos 0x0, ttl 64, id 28824, offset 0, flags [DF], proto TCP (6), length 151)
    10.181.70.57.39292 > 10.168.66.198.1521: Flags [P.], cksum 0xab6e (correct), seq 7433:7532, ack 10119, win 1005, options [nop,nop,TS val 39446272 ecr 3105462411], length 99
17:25:07.855511 IP (tos 0x0, ttl 64, id 28824, offset 0, flags [DF], proto TCP (6), length 151)
    10.181.70.57.39292 > 10.168.66.198.1521: Flags [P.], cksum 0xab6e (correct), seq 7433:7532, ack 10119, win 1005, options [nop,nop,TS val 39446272 ecr 3105462411], length 99

So we have 15 TCP retransmission timeouts (RTO). And default net.ipv4.tcp_retries2=15. so SQL developer waited for TCP timeout..

In Oracle Database JDBC Developer’s Guide and Reference guide (http://docs.oracle.com/cd/B28359_01/java.111/b31224/apxtblsh.htm)
under “Using JDBC with Firewalls” they say:

Pass oracle.net.READ_TIMEOUT as connection property to enable read timeout on socket. The timeout value is in milliseconds.

SQL Developer is kind of tool what has a good chance to be on another side of firewall. But it does not seem to be any internal sqldeveloper timeout, or at least it does not work… SQL Developer for linux, patiently waits for TCP timeout. We are not talking about milliseconds, but 15+ minutes!

I guess it may be OK for somebody, but I personally forget what I was planning to do there after 15 minutes 🙂

First advice above, makes SQL developer to open TCP sockets with setsockopt(SO_KEEPALIVE), and that socket will never idle for more than 30 seconds.
And it will stop 99% of firewalls from closing/banning the connection due to inactivity. (most firewalls have polices for idle connection time limit, to prevent several types of DoS attacks). Some firewalls may still treat keep alives as “not real traffic” and still ban connection.
This is similar to setting SQLNET.EXPIRE_TIME=x in sqlnet.ora on server side. But in this case keep alives will come from client site.

Second advice above, covers case IF connection still gets broken (either due to firewall or just cable unplug).
So it sets TCP RTO limit from default 15 to 5. With RTO been variable value it will bring TCP timeout to below 30 seconds.
It can cause some data network traffic loss in slow networks, but it gotta be really crappy network.

Check us out on http://avmconsulting.net

Enter your email address to follow this blog and receive notifications of new posts by email.

Posted in Linux, Systems, Ubuntu | Tagged , | 11 Comments

Index Clustering factor. One way how to avoid negative performance impact on joins which involves bad clustered indexes.

High Index Clustering factor is popular problem in SQL world.
The most known solution for it – rebuild table ordering by columns what are in the index.
But in real world, with table size growing exponentially its not always practical or possible to do it.

Here is one way to avoid index clustering problem by SQL optimization..

Consider classical example.

We have tables:
1. customer

CREATE TABLE customer
(customer_id NUMBER,
 name        VARCHAR2(50),
 dob         DATE,
 status      VARCHAR2(20),
...
CONSTRAINT customer_pk PRIMARY KEY (customer_id)
);

2. service_provider

CREATE TABLE service_provider
(service_provider_id NUMBER,
 name                VARCHAR2(100),
 street_address      VARCHAR2(100),
 status              VARCHAR2(20),
...
CONSTRAINT service_provider_pk PRIMARY KEY (service_provider_id)
);

2. feedback

CREATE TABLE feedback
(service_provider_id NUMBER,
 customer_id         NUMBER,
 score               NUMBER,
...
);
CREATE INDEX fdb_serv_provr_cust_scr ON feedback (service_provider_id,customer_id);
CREATE INDEX fdb_cust_serv_provr_scr ON feedback (customer_id,service_provider_id);

Table size is important here, so lets say CUSTOMER and SERVICE_PROVIDER are in millions of rows, with table segments in range 5-30 GBs.
FEEDBACK is intersection table, it’s size is in multi TBs, and row count in 100s of billions. Very Large table.
The tables can be either partitioned or not, but for simplicity of example lest assume they are not.

Now – classical SQL. The SQL execution frequency is high:

select * from service_provider s, feedback f
where f.customer_id=:1
and f.service_provider_id=s.service_provider_id
and s.status='ACTIVE'

Oracle gives following execution plan for the SQL (at time of writing, tested on all major versions from 10.2.0.1 to 11.2.0.3):

Execution Plan
----------------------------------------------------------
Plan hash value: 2452536980

---------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                         |     1 |   143 |   621   (1)| 00:04:05 |
|   1 |  NESTED LOOPS                 |                         |       |       |            |          |
|   2 |   NESTED LOOPS                |                         |     1 |   143 |   611   (1)| 00:04:05 |
|   3 |    TABLE ACCESS BY INDEX ROWID| FEEDBACK                |   300 |   527K|   310   (1)| 00:02:05 |
|*  4 |     INDEX RANGE SCAN	      | FDB_CUST_SERV_PROVR_SCR |   300 |       |    62   (0)| 00:00:01 |
|*  5 |    INDEX UNIQUE SCAN	      | SERVICE_PROVIDER_PK     |     1 |       |     1   (0)| 00:00:01 |
|*  6 |   TABLE ACCESS BY INDEX ROWID | SERVICE_PROVIDER        |     1 |    89 |     2   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("F"."CUSTOMER_ID"=:1)
   5 - access("F"."SERVICE_PROVIDER_ID"="S"."SERVICE_PROVIDER_ID")
   6 - filter("S"."STATUS"='ACTIVE')

And this plan if given regardless of selectivity of condition “s.status=’ACTIVE'”:
If its only 1% of row set in the query meet condition “s.status=’ACTIVE'”, or 90% – Oracle does not change the plan.

But what happens if selectivity is high and “s.status=’ACTIVE'” filters out a lot of data from the select?
Here would be execution steps:

Step 1.
(from plan above related to id=3)
Oracle will read index FDB_CUST_SERV_PROVR_SCR doing INDEX RANGE SCAN on condition “4 – access(“F”.”CUSTOMER_ID”=:1)”. With standard DB block size 8k, and lets assume customer having 300 feedbacks, it will result in 1-2 index leaf block reads and 3-4 total I/Os on the index. Index FDB_CUST_SERV_PROVR_SCR is large, so in the worst case scenario it can be all physical I/O. Assuming underling storage is RAID 10 FC disks, average Random read is 10ms. This step will take about 4*10=40 ms.
So as a result of the step 300 CUSTOMER_ID-SERVICE_PROVIDER_ID-ROWID sets will be available.

Step 2.
(from plan above related to id=4)
Now using 300 ROWIDs from step 1, rows from FEEDBACK table will be located. Assuming the index have below average clustering factor, and table is huge, it will be safe assumption to say almost each row will be in different table block. With FEEDBACK been large table, almost all I/O will be physical. It will be 300 random single block read physical I/Os.
So we are talking about – 10ms*300=3 seconds. For this step only.
As a result of the step 300 rows from FEEDBACK table will be available. They are needed for final result (“select *” was used).

Step 3.
(from plan above related to id=5)
using SERVICE_PROVIDER_ID from step 1 as an input for “INDEX UNIQUE SCAN” on SERVICE_PROVIDER_PK it will be one index lookup. With most probable index depth of 3 it will be 3 I/O operation. BUT 99% chance it will be logical I/O, as SQL is executed often and index SERVICE_PROVIDER_PK on several millions record table will be in lower GBs (in my case 1Gb).
So we are talking about (will take average memory access time 50ns) 50ns*3=0.00015ms. With 1% chance on physical I/O: (10ms*1+0.00015ms*99)/100=0.1 ms.
As a result of the step one ROWID for SERVICE_PROVIDER table will be available.

Step 4.
(from plan above related to id=6)
Using one ROWID from step 3 it will be one “TABLE ACCESS BY INDEX ROWID” to SERVICE_PROVIDER. But again, with SERVICE_PROVIDER been in lower GBs, and SQL been called often most of I/O will be logical.
With 1% chance of physical I/O this step will take (99*0.00005ms+1*10ms)/100=0.1 ms
As a result of the step, one row for SERVICE_PROVIDER table will be available for final output (“select *” was used).

So far, step 2 is heaviest/longest by far.

Step 5.
(from plan above related to id=6)
Now when row from SERVICE_PROVIDER is available “filter(“S”.”STATUS”=’ACTIVE’)” comes to the picture. As was mentioned above it very selective, lets say 90%. And so it means what we do not need 90% of rows what was selected from FEEDBACK table on step 2. And 90% of time spent in step 2 was waste of DBtime.

Steps 3-5 will repeat 300 times. Even after repeating steps 3-5 300 times, step 2 is still heaviest by far.

What if we touch table FEEDBACK when we really need it? Meaning after “filter(“S”.”STATUS”=’ACTIVE’)” throw out 90% of not-needed data.
We can save 90% of physical I/O and so – time, on step 2.

Physically this is very possible as we have SERVICE_PROVIDER_ID in “step 1”. So we can easily use the SERVICE_PROVIDER_ID to lookup SERVICE_PROVIDER via SERVICE_PROVIDER_PK, then run filter “filter(“S”.”STATUS”=’ACTIVE’)” and in very end, only if filter was positive, – lookup record in FEEDBACK table by ROWID what we already got from “step 1”.

What I am trying to achieve would express in following plan:

---------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                         |     1 |   163 |   303   (1)| 00:02:02 |
|   1 |  NESTED LOOPS                 |                         |     1 |   163 |   303   (1)| 00:02:02 |
|   2 |   NESTED LOOPS                |                         |     1 |   109 |   302   (1)| 00:02:02 |
|*  3 |    INDEX RANGE SCAN           | FDB_CUST_SERV_PROVR_SCR |   300 |   195K|    62   (0)| 00:00:01 |
|*  4 |    TABLE ACCESS BY INDEX ROWID| SERVICE_PROVIDER        |     1 |    89 |     2   (0)| 00:00:01 |
|*  5 |     INDEX UNIQUE SCAN         | SERVICE_PROVIDER_PK     |     1 |       |     1   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY USER ROWID  | FEEDBACK                |     1 |    54 |     1   (0)| 00:00:01 |<<---Get records from FEEDBACK after running "filter("S"."STATUS"='ACTIVE')"
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("F"."CUSTOMER_ID"=:1)
   4 - filter("S"."STATUS"='ACTIVE')
   5 - access("F"."SERVICE_PROVIDER_ID"="S"."SERVICE_PROVIDER_ID")

I was not able to find any way to hint SQL, to adjust it to plan like this (if anybody knows the way it would be cool).
But with some regrouping its possible. This SQL will produce the plan above (looks nasty but works well):

select q2.*, q1.* from
(
select f.rowid ri, s.* from feedback f, service_provider s
where f.customer_id=:1
and f.service_provider_id=s.service_provider_id
and s.status='ACTIVE'
) q1,
feedback q2
where q1.ri=q2.rowid;

In some cases it worked for me just as shown above, in some different data distributions and/or optimiser versions I had to put couple hints:

select /*+ no_merge ordered */ q2.*, q1.* from
(
select f.rowid ri, s.* from feedback f, service_provider s
where f.customer_id=:1
and f.service_provider_id=s.service_provider_id
and s.status='ACTIVE'
) q1,
feedback q2
where q1.ri=q2.rowid;

And also.. – this technique does not work for optimiser after 11.2.0.1. When I would have more time for this, I will trace the SQL with 10053 event on 11.2 version and figure out why..(I promise 🙂 ). But before that setting optimizer_features_enable to 11.1.0.7 would make a trick:

alter session set optimizer_features_enable='11.1.0.7';

Couple findings on sql plan:

1.
CBO did agree – its cheaper to access table FEEDBACK in in the end.

2.
CBO weighted “TABLE ACCESS BY INDEX ROWID” for FEEDBACK almost same as NL on SERVICE_PROVIDER. This will not be true, if the SQL frequency is high, and FEEDBACK is much bigger. But to be fair, CBO cant take frequency in consideration.

The best performance measurement of purely index based queries like this I think would be DBtime/row, what would be almost linear to physical I/O row.

To compare apples-to-apples this is what I used to see performance before and after optimisation:

select st.sql_id, st.instance_number,
       st.EXECUTIONS_DELTA,
       st.BUFFER_GETS_DELTA /st.ROWS_PROCESSED_DELTA "logical IO per row",
       st.DISK_READS_DELTA  /st.ROWS_PROCESSED_DELTA "phisical IOs per row",
       st.IOWAIT_DELTA      /st.ROWS_PROCESSED_DELTA "time spent in IO (seconds/row)",
       st.ELAPSED_TIME_DELTA/st.ROWS_PROCESSED_DELTA "total time spent (seconds/row)"
from DBA_HIST_SQLSTAT st, dba_hist_snapshot s
 where
   st.sql_id='&sql_id1'
   and st.instance_number=s.instance_number
   and s.BEGIN_INTERVAL_TIME between '&1' and '&2'
   and s.snap_id=st.snap_id
 and st.ROWS_PROCESSED_DELTA!=0
order by s.snap_id desc, instance_number;

Here are numbers from busiest production system I implement this technique to (one 30 minutes interval with old and new SQL):

SQL_ID        EXECUTIONS_DELTA logical IO per row   phisical IOs per row    time spent in IO (ms/row)    total time spent (ms/row)
------------- ---------------- -------------------- ----------------------- ---------------------------- --------------------------
old_sql       26219            72                   8.11                    50.11                        57.1
new_sql       28936            71                   0.91                    5.900                        6.10

As you can see Logical I/O is not much better in optimised query, as “step 3” contribute the most into logical I/O, and we did not optimise anything in step 3.
But physical I/O and total DB time improved almost 10 times.

The argument what would cross everything said above over – it would be same effect if we rebuild FEEDBACK table and improve FDB_CUST_SERV_PROVR_SCR clustering factor.

Yes true, but couple things

1.
Rebuilding on regular basis table what is TBs in size will be a interesting challenge if availability is important.

2.
What if there is a need to access FEEDBACK table by different index/dimension? For example:

select * from customer c, feedback f
where f.service_provider_id=:1
and f.customer_id=c.customer_id
and c.status='ACTIVE';

Adjusting FEEDBACK table to FDB_CUST_SERV_PROVR_SCR(customer_id,service_provider_id) index clustering would mess-up clustering on FDB_SERV_PROVR_CUST_SCR(service_provider_id,customer_id) index. And its only 2 dimensions in this example, it could be more.
Same time, this optimisation would work regardless of clustering factor on any of them, as long as WHERE clause on table on the right is selective (“status=’ACTIVE'” in this example).

For cases where this optimisation is applicable, it can also resolve data archival problem. – If data archival is needed for performance purposes, and not to safe on storage, this will make it. The query performance will be unchanged whether FEEDBACK table be 1TB or 100TB.
Simple making status=’INACTIVE’ on CUSTOMER and SERVICE_PROVIDER would be an archival. No big data movement required.

I used for example here “CUSTOMER – FEEDBACK – SERVICE_PROVIDER” but idea of “intersection” table exists in almost any business model:
“PRODUCT – COMPATIBILITY – PRODUCT”
“MALE – MATCH – FEMALE”
“STUDENT – EXAM – EXAMINER”
“MERCHANT – TRANSACTION – BANK”
and so on.

In this example query is simplified to the limit, but does not mean it has to be that primitive.
To generalise, – this approach would work when:
There is query what join at least 2 tables. The table what is entry point (first step) for the query is very large and has index on at least 2 columns: fist column – entry condition on the query, second column – column what is foreign key (no constraint required) on second table in the query. And this query has where predicate on second table what is very selective. Second table is not too large. Query execution frequency is high.

It work very well for me in several production scenarios, I hope it can be useful to others.

Check us out on http://avmconsulting.net

Enter your email address to follow this blog and receive notifications of new posts by email.

Posted in SQL | Tagged , , , , | 2 Comments