Thursday, December 17, 2015

One does not simply create a table with 300 columns


There was quite an interesting post on Oracle OTN, which is worthy of this blog post and a meme.

The problem was defined as follows:
I have a table which has 300 columns and currently holds about 6,000,000 records. Records are increasing day by day. When I run a simple query like:

select * from table where store_no = 17

this query takes 30 to 60 seconds to fetch the data, which is too much time, and I will be using this table with more filters. Can anyone help me with how I can increase data fetching performance? I have also used a primary key index.

If you check the supplied DDL against the given SQL, you'll likely come to two points:
  1. There is no index on column store_no, so Oracle will have to do a full table scan (an easy fix, sketched right below)
  2. ... 300 columns!?
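If nothing else, the missing index is trivial to add (a sketch; the real table name isn't shown in the post, so big_table is my placeholder):

CREATE INDEX big_table_store_no_ix ON big_table (store_no);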



My suggestion was to reconsider the data model. 300 columns is pretty rough, and columns like IT_CLERK_ID2, IT_CLERK_ID3, ..., IT_CLERK_ID5 tend to be suspicious; repeating groups like that usually belong in a child table, as sketched below.
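A minimal sketch of the idea (table and column names are mine, not from the original post):

-- Instead of IT_CLERK_ID1 .. IT_CLERK_ID5 as separate columns,
-- store one row per clerk in a child table
CREATE TABLE store_clerk (
  store_no  NUMBER NOT NULL,
  clerk_seq NUMBER NOT NULL,  -- 1, 2, 3, ...
  clerk_id  NUMBER NOT NULL,
  CONSTRAINT store_clerk_pk PRIMARY KEY (store_no, clerk_seq)
);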

The reason for my suggestion was that Oracle has to split the rows of this table into two pieces, because they have more than 255 columns. Interestingly enough, the split is done from the end of the row, so the first piece of the row will hold the first 45 columns and the second piece will contain the remaining 255 columns. What should come to your mind now is that the order of columns in such a table is very important: you should place the columns you are going to select as close as possible to the start of the row.

The result of this split is extra CPU usage when selecting columns from the second piece. A secondary effect may come in the form of row chaining between database blocks, which introduces additional single-block reads to walk your rows.
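You can get a rough indicator of this in your own session, because Oracle counts continued row fetches in the statistic table fetch continued row (whether intra-block splits show up here depends on access path and version, so take it as a hint, not proof). A sketch against the standard dynamic performance views:

-- How often did this session have to follow a chained row piece?
SELECT sn.name, ms.value
  FROM v$mystat ms
  JOIN v$statname sn ON sn.statistic# = ms.statistic#
 WHERE sn.name = 'table fetch continued row';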

Now let's have a little test. I've created two tables where all columns are VARCHAR2 with the value 'X', except the first and last columns, which are NUMBER with the value 1. The first table has 250 columns and the second has 300.
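Typing 300 column definitions by hand is no fun, so here is a sketch of how such a table can be generated (table_300 and c1 .. c300 are my assumed names; table_250 and the row population are analogous):

-- Build the 300-column table dynamically: NUMBER at both ends,
-- VARCHAR2(1) for everything in between
DECLARE
  l_sql VARCHAR2(32767) := 'CREATE TABLE table_300 (c1 NUMBER';
BEGIN
  FOR i IN 2 .. 299 LOOP
    l_sql := l_sql || ', c' || i || ' VARCHAR2(1)';
  END LOOP;
  l_sql := l_sql || ', c300 NUMBER)';
  EXECUTE IMMEDIATE l_sql;
END;
/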

I've traced the following four SQL statements:

SELECT SUM(c1) FROM table_250;
SELECT SUM(c250) FROM table_250;
SELECT SUM(c1) FROM table_300;
SELECT SUM(c300) FROM table_300;
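For reference, this is roughly how such traces can be captured (a sketch using the standard 10046 event; the tracefile identifier is my own choice):

ALTER SESSION SET tracefile_identifier = 'col_split_test';
ALTER SESSION SET events '10046 trace name context forever, level 8';
SELECT SUM(c300) FROM table_300;
ALTER SESSION SET events '10046 trace name context off';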

Let's check the important parts of the trace files:



As you can see, summing column c300 is quite CPU heavy. There are also other consequences of row chaining in connection with full table scans and the buffer cache ... which, as you can probably guess by now, are not good.

I would encourage you to check the blog post by Jonathan Lewis which covers the buffer cache and other important things.

So remember, folks ... one does not simply create a 300-column table. You have to have a very good reason for that, and you have to think about the order of the columns and the consequences.

Wednesday, December 2, 2015

Oracle hint ignore_row_on_dupkey_index - part 2

Last time we saw that there is something really sneaky going on when we use the hint ignore_row_on_dupkey_index. We have several clues for that:

  1. 1.85 seconds without the hint vs 1:27.96 with it. That’s about 50x slower.
  2. 133 369 logical reads without the hint vs 1 156 645 with it. That’s almost 9x more.
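As a quick reminder, the statement from part 1 looked roughly like this (a sketch; the table and index names are placeholders, the hint syntax itself is the documented one):

-- Rows whose key already exists in index table1_pk are silently skipped
INSERT /*+ ignore_row_on_dupkey_index(table1, table1_pk) */ INTO table1
SELECT * FROM table2;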




Another thing you would definitely find strange is the difference in the sizes of the trace files: 22 KB vs 35 MB … that’s huge.

So let’s open the larger one and see if we can spot anything strange:



It’s cursor #140056955737968 craziness! It’s getting called again and again and ….

OK, let’s have a look at how many times it’s actually called:



Remember how many rows each table had? Let me remind you:
  • table1 with 100 000 rows
  • table2 with 199 001 rows, of which 99 001 have the same primary key values as rows in table1

So this SQL is called EXACTLY as many times as there are matching keys (duplicates)!

What this query seems to do is select the owners and names of the constraints enabled on table1. Why you would do that, and why for EVERY failed row, is really beyond me …
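For illustration, the recursive statement is in spirit close to something like this (my sketch against a standard dictionary view, not the literal SQL from the trace):

-- Roughly what Oracle appears to look up for every rejected row
SELECT owner, constraint_name
  FROM all_constraints
 WHERE table_name = 'TABLE1'
   AND status = 'ENABLED';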