Thursday, March 31, 2016

Beginner's mistakes in PL/SQL - Part 1

Before a month or so, I was given an interesting PL/SQL code for tuning. Its function was to read certain data from database and then export them into a flat file. I was told that customer is complaining that time to export usual batch of data (not very large) is unacceptable. So it was typical "make it faster" problem.

As I was reading through and analyzing the code, I've decided I'm going to make a blog from that as it had all the nice mistakes in one place.

All we'll do in our first post is that we are going to setup our example. After that, in the mean time, you can read the code and try to identify mistakes and slow parts. Also you can try to construct your own faster solution and then you can compare it with next blog posts.

The code below is given for demonstration purposes only and replicates all the mistakes and tuning issues original program had. I’ve also cut some code to make it more readable.

DDL script to setup our data model and populate it with data:

Package specification and body:

To test it, just run following code:

SET SERVEROUTPUT ON;
BEGIN
 data_to_ascii.load_tmp_table;
 data_to_ascii.create_blob;
END;
/

Continue to Part 2

Friday, March 4, 2016

Join predicate push down issue

This blog post is about very interesting issue with Oracle Optimizer which I do dare call a bug. To be honest with you I’ve spend couple of evenings on it since it was driving me crazy :). I knew what was going on but I wanted to force the Oracle to do the right thing. In the end I had to revert to backup solution.



Now, let’s start from the beginning. I’m providing all the things you need to be able to play with it on your own. Problem was spotted on Oracle 11.2.0.4.0. Here is the script to prepare data model for the demonstration:

CREATE OR REPLACE FORCE TYPE number_object IS OBJECT
(
 n NUMBER
);
/

CREATE OR REPLACE FORCE TYPE number_table AS TABLE OF number_object
/

CREATE TABLE list_agg_overflow
(
 n NUMBER,
 v VARCHAR2(1000)
);
/

CREATE INDEX list_agg_overflow_ix ON list_agg_overflow(n);
/

CREATE TABLE list_agg_overflow_output
(
 n NUMBER,
 v VARCHAR2(4000)
);
/

CREATE GLOBAL TEMPORARY TABLE list_agg_overflow_tmp
(
 n NUMBER
);
/

DECLARE
  v_dummy VARCHAR2(1000) := LPAD('X',1000,'X');
BEGIN
  --
  FOR i IN 1 .. 2
  LOOP
    INSERT INTO list_agg_overflow VALUES (1, v_dummy);
  END LOOP;
  --
  FOR i IN 1 .. 10
  LOOP
   INSERT INTO list_agg_overflow VALUES (2, v_dummy);
  END LOOP;
  --
  dbms_stats.gather_table_stats(null, 'list_agg_overflow');
  --
  COMMIT;
END;
/

As you can see, we have a list_agg_overflow table, which has 2 rows of id 1 and 10 rows of id 2. Each row contains a text of length 1000 characters. This is important since we are going to exploit it a little bit later. Now we run following anonymous PL/SQL block:

DECLARE
  v_output VARCHAR2(16384);
BEGIN
  --
  INSERT INTO list_agg_overflow_tmp VALUES (1);
  --
  INSERT INTO list_agg_overflow_output
  SELECT
    /*+ qb_name(main) leading(n_list a j) dynamic_sampling(0) */
    a.n,
    j.v
  FROM list_agg_overflow a,
       list_agg_overflow_tmp n_list,
       (SELECT
          /*+ qb_name(agg) push_pred */
          n,
          LISTAGG(v, ',') WITHIN GROUP (ORDER BY n) v
        FROM list_agg_overflow
        GROUP BY n
       ) j
  WHERE a.n = n_list.n
    AND a.n = j.n(+);
  ROLLBACK;
END;
/

We run the script with no error as expected. PL/SQL block is fairly simple. Basically we insert a list of IDs (column n) into our temporary table. Point of the query is to pick that list and join it on table a (as nested loop) and then join the result onto our subquery which does aggregate and creates a concatenation of text based on ID. We do want to push join predicate into subquery agg and it’s definitely possible, since we are aggregating by column we want push inside. Last join was outer and I left it there. It has no impact on query output (in original query it had) or our problem.

Here we see execution plan which looks as we want it to look:



Now let’s make a little change (and that’s how the query looked in 1st place). Let’s assume we have our list of IDs in an array. We can definitely do that and there might be a very good reason why to do so. Our block would look like this:

DECLARE
  v_list_tab number_table := number_table();
  v_output VARCHAR2(16384);
BEGIN
  --
  v_list_tab.EXTEND;
  v_list_tab(v_list_tab.FIRST) := number_object(1);
  --
  INSERT INTO list_agg_overflow_output
  SELECT
    /*+ qb_name(main) */
    a.n,
    j.v
  FROM list_agg_overflow a,
       TABLE(v_list_tab) n_list,
       (SELECT
          /*+ qb_name(agg) push_pred */
          n,
          LISTAGG(v, ',') WITHIN GROUP (ORDER BY n) v
        FROM list_agg_overflow
        GROUP BY n
       ) j
  WHERE a.n = n_list.n
    AND a.n = j.n(+);
  ROLLBACK;
END;
/

Now let’s run it and see this:

Error report -
ORA-01489: result of string concatenation is too long
ORA-06512: at line 9
01489. 00000 -  "result of string concatenation is too long"
*Cause:    String concatenation result is more than the maximum size.
*Action:   Make sure that the result is less than the maximum size.

Oook? Now only way this error could have happend is, that our join predicate was not pushed. LISTAGG can concatenate only up to 4000 characters and ID = 2 has 10 rows so it forces LISTAGG to overflow. Let’s check the execution plan:



Yeap … no push.  Now you can try to fight it but only way I found around it ether to say … screw array I’m going to use temporary table or do a partial step forward and change the query to this (IN can be ofcourse done as join):

DECLARE
  v_list_tab number_table := number_table();
  v_output VARCHAR2(16384);
BEGIN
  --
  v_list_tab.EXTEND;
  v_list_tab(v_list_tab.FIRST) := number_object(1);
  --
  INSERT INTO list_agg_overflow_output
  SELECT
    /*+ qb_name(main) */
    a.n,
    j.v
  FROM list_agg_overflow a,
       TABLE(v_list_tab) n_list,
       (SELECT
          /*+ qb_name(agg) push_pred */
          n,
          LISTAGG(v, ',') WITHIN GROUP (ORDER BY n) v
        FROM list_agg_overflow
        WHERE n IN (SELECT n FROM TABLE(v_list_tab))
        GROUP BY n
       ) j
  WHERE a.n = n_list.n
    AND a.n = j.n(+);
  ROLLBACK;
END;
/

Now there must be a reason for that. Why is Oracle pushing join predicate when we use temporary table and not if we use array? Now I believe it is an Optimizer bug. I’ve searched Oracle Support but I haven’t found anything about it. Might be that I used wrong key words. Anyway, to back my claim up I’m presenting except from Optimizer trace files which I made. 

Trace with temporary table:



Now with array:



Well of course it’s possible. He has done it before. The only reason that comes to my mind is, that Optimizer thinks it’s not possible, is a BUG in code ... or some nice FEATURE :).

However what you should take from this is not to not use arrays in SQL but to be more on guard when using them in some special cases.