Thursday, December 17, 2015

One does not simply create a table with 300 columns


There was quite an interesting post on Oracle OTN, which is worthy of this blog post and a meme.

The problem was defined as follows:
I have a table which has 300 columns and currently holds about 6,000,000 records. Records are increasing day by day. When I run a simple query like:

select * from table where store_no = 17

this query takes 30 to 60 seconds to fetch the data, which is too much time, and I will be using this table with more filters. Can anyone help me with how I can increase data fetching performance? I have also used a primary key index.

If you check the supplied DDL against the given SQL, you'll likely come to two points:
  1. There is no index on column store_no, so Oracle will have to do a full table scan (an easy fix, sketched right below)
  2. ... 300 columns!?
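If nothing else, the missing index is trivial to add (a sketch; the real table name isn't shown in the post, so big_table is my placeholder):

CREATE INDEX big_table_store_no_ix ON big_table (store_no);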



My suggestion was to reconsider the data model. 300 columns is pretty rough, and columns like IT_CLERK_ID2, IT_CLERK_ID3, ..., IT_CLERK_ID5 tend to be suspicious; repeating groups like that usually belong in a child table, as sketched below.
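A minimal sketch of the idea (table and column names are mine, not from the original post):

-- Instead of IT_CLERK_ID1 .. IT_CLERK_ID5 as separate columns,
-- store one row per clerk in a child table
CREATE TABLE store_clerk (
  store_no  NUMBER NOT NULL,
  clerk_seq NUMBER NOT NULL,  -- 1, 2, 3, ...
  clerk_id  NUMBER NOT NULL,
  CONSTRAINT store_clerk_pk PRIMARY KEY (store_no, clerk_seq)
);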

The reason for my suggestion was that Oracle has to split the rows of this table into two pieces, because they have more than 255 columns. Interestingly enough, the split is done from the end of the row, so the first piece of the row will hold the first 45 columns and the second piece will contain the remaining 255 columns. What should come to your mind now is that the order of columns in such a table is very important: you should place the columns you are going to select as close as possible to the start of the row.

The result of this split is extra CPU usage when selecting columns from the second piece. A secondary effect may come in the form of row chaining between database blocks, which introduces additional single-block reads to walk your rows.
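You can get a rough indicator of this in your own session, because Oracle counts continued row fetches in the statistic table fetch continued row (whether intra-block splits show up here depends on access path and version, so take it as a hint, not proof). A sketch against the standard dynamic performance views:

-- How often did this session have to follow a chained row piece?
SELECT sn.name, ms.value
  FROM v$mystat ms
  JOIN v$statname sn ON sn.statistic# = ms.statistic#
 WHERE sn.name = 'table fetch continued row';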

Now let's have a little test. I've created two tables where all columns are VARCHAR2 with the value 'X', except the first and last columns, which are NUMBER with the value 1. The first table has 250 columns and the second has 300.
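Typing 300 column definitions by hand is no fun, so here is a sketch of how such a table can be generated (table_300 and c1 .. c300 are my assumed names; table_250 and the row population are analogous):

-- Build the 300-column table dynamically: NUMBER at both ends,
-- VARCHAR2(1) for everything in between
DECLARE
  l_sql VARCHAR2(32767) := 'CREATE TABLE table_300 (c1 NUMBER';
BEGIN
  FOR i IN 2 .. 299 LOOP
    l_sql := l_sql || ', c' || i || ' VARCHAR2(1)';
  END LOOP;
  l_sql := l_sql || ', c300 NUMBER)';
  EXECUTE IMMEDIATE l_sql;
END;
/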

I've traced the following four SQL statements:

SELECT SUM(c1) FROM table_250;
SELECT SUM(c250) FROM table_250;
SELECT SUM(c1) FROM table_300;
SELECT SUM(c300) FROM table_300;
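For reference, this is roughly how such traces can be captured (a sketch using the standard 10046 event; the tracefile identifier is my own choice):

ALTER SESSION SET tracefile_identifier = 'col_split_test';
ALTER SESSION SET events '10046 trace name context forever, level 8';
SELECT SUM(c300) FROM table_300;
ALTER SESSION SET events '10046 trace name context off';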

Let's check the important parts of the trace files:



As you can see, summing column c300 is quite CPU heavy. There are also other consequences of row chaining in connection with full table scans and the buffer cache ... which, as you can probably guess by now, are not good.

I would encourage you to check the blog post by Jonathan Lewis which covers the buffer cache and other important things.

So remember, folks ... one does not simply create a 300-column table. You have to have a very good reason for that, and you have to think about the order of the columns and the consequences.

Wednesday, December 2, 2015

Oracle hint ignore_row_on_dupkey_index - part 2

Last time we saw that there is something really sneaky going on when we use the hint ignore_row_on_dupkey_index. We have several clues for that:

  1. 1.85 seconds without the hint vs 1:27.96 with it. That’s about 50x slower.
  2. 133 369 logical reads without the hint vs 1 156 645 with it. That’s almost 9x more.
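As a quick reminder, the statement from part 1 looked roughly like this (a sketch; the table and index names are placeholders, the hint syntax itself is the documented one):

-- Rows whose key already exists in index table1_pk are silently skipped
INSERT /*+ ignore_row_on_dupkey_index(table1, table1_pk) */ INTO table1
SELECT * FROM table2;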




Another thing you would definitely find strange is the difference in the sizes of the trace files: 22 KB vs 35 MB … that’s huge.

So let’s open the larger one and see if we can spot anything strange:



It’s cursor #140056955737968 craziness! It’s getting called again and again and ….

OK, let’s have a look at how many times it’s actually called:



Remember how many rows each table had? Let me remind you:
  • table1 with 100 000 rows
  • table2 with 199 001 rows, of which 99 001 have the same primary key values as rows in table1

So this SQL is called EXACTLY as many times as there are matching keys (duplicates)!

What this query seems to do is select the owners and names of the constraints enabled on table1. Why you would do that, and why for EVERY failed row, is really beyond me …
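For illustration, the recursive statement is in spirit close to something like this (my sketch against a standard dictionary view, not the literal SQL from the trace):

-- Roughly what Oracle appears to look up for every rejected row
SELECT owner, constraint_name
  FROM all_constraints
 WHERE table_name = 'TABLE1'
   AND status = 'ENABLED';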