EXPLAIN EXTENDED

How to create fast database queries

Archive for the ‘Oracle’ Category

Distinct pairs

Comments enabled. I *really* need your comment

From Stack Overflow:

I have this query:

SELECT  DISTINCT id, name
FROM    table1

For a given id, the name will always be the same.

Both fields are indexed. There's no separate table that maps the id to the name.

The table is very large (10,000,000 rows), so the query could take some time.

This query is very fast, since it's indexed:

SELECT  DISTINCT id
FROM    table1

Likewise for this query:

SELECT  DISTINCT id
FROM    table1

Assuming I can't get the database structure changed (a very safe assumption), what's a better way to structure the first query for performance?

The fastest way would of course be building an index on (id, name).

But since we (for whatever reason) cannot change the database structure, it's not an option.

Author, nevertheless, mentioned that for any given id, the name will always be the same. This can help us.

Let's create a sample table:
Read the rest of this entry »

Written by Quassnoi

June 2nd, 2009 at 11:00 pm

Posted in Oracle

Converting currencies

Comments enabled. I *really* need your comment

When developing a financial application, one needs to deal with currencies and convert from one currency to another.

There may be different scenarios, but for accounting purposes the currency rates set and published by appropriate official regulator once a day are used most often.

A typical task looks like this: given a set of transactions in one currency, get an appropriate amount in another currency.

Let's create sample tables and see how it may be done:
Read the rest of this entry »

Written by Quassnoi

May 27th, 2009 at 11:00 pm

Posted in Oracle

Genealogy: finding a descendant

Comments enabled. I *really* need your comment

In the previous article I described an efficient way to build a genealogy with a hierarchical query in Oracle:

Now, I'd like to describe a query to check if one entity in the table is a descendant on another one.

Let's create a sample table:
Read the rest of this entry »

Written by Quassnoi

May 25th, 2009 at 11:00 pm

Posted in Oracle

Genealogy query on both parents

Comments enabled. I *really* need your comment

From Stack Overflow:

I'm trying to fetch a genealogy tree of animals from my Oracle database.

Here's the table:

Animal_ID Parent_Male_ID Parent_Female_ID

If I specify an animal, I can get all of its descendants (on the male side) using something like this:

SELECT  *
FROM    animal
START WITH
animal_id = 123
CONNECT BY
PRIOR animal_id = parent_male_id

I'm trying to find a way to extend this in such a way that if I specify an animal, it will fetch both parents and then will fetch all of their descendants.

Any thoughts?

This is possible to do with a same CONNECT BY query by providing a little more comlex condition for START WITH and CONNECT BY:

SELECT  DISTINCT(animal_id) AS animal_id
FROM    animal
START WITH
animal_id IN
(
SELECT  father
FROM    animal
WHERE   animal_id = 9500
UNION ALL
SELECT  mother
FROM    animal
WHERE   animal_id = 9500
)
CONNECT BY
PRIOR animal_id IN (father, mother)
ORDER BY
animal_id

However, my gut feeling says this query need performance checking.

Let's create a sample table:
Read the rest of this entry »

Written by Quassnoi

May 24th, 2009 at 11:00 pm

Posted in Oracle

Oracle: matching SYS_GUID’s

with one comment

Answering questions asked on the site.

Habib asks:

I have a table with a PRIMARY KEY defined as ConsumerID RAW(16) NOT NULL.

However, when I issue a query like this:

SELECT  *
FROM     Consumers
WHERE   ConsumerID = '1DFFC45DEDEB42B8B79B060D544C2ACB'

, it uses a TABLE ACCESS FULL and is very slow.

How can I improve this query?

Defining a PRIMARY KEY column as RAW(16) almost always means using system generated GUID's returned by SYS_GUID as PRIMARY KEY's.

Let's create a sample table and see what's going on:
Read the rest of this entry »

Written by Quassnoi

May 20th, 2009 at 11:00 pm

Posted in Oracle

Finding loops in a tree

with 4 comments

From Stack Overflow:

I have a view which matches up the needs and the wants, and discards the requests that cannot be fulfilled by any user, that looks like this:

Item Need Have
1 Bob George
2 George Herbie
3 Herbie Bob
4 Larry Wally
5 Wally Larry
6 John George

I am now trying to write a query where I can calculate the number of permutations of each user having their requested needs fulfilled.

For instance, in the example above, Bob, George, and Herbie can together fulfill each other's needs, as can Larry and Wally, but John cannot, so the total count would be 2.

This task is actually about finding loops.

Whenever there is a set of users that fulfill each other's needs, there is a loop in the tree.

We can assume that either of the columns (need or have) is unique so there there is always a single path in the corresponding direction which ends with at most one loop.

It's not stated in the task, but otherwise it would be an equivalent of Clique problem which is NP complete and hardly solvable with SQL.

Now, let's create a sample table:
Read the rest of this entry »

Written by Quassnoi

May 18th, 2009 at 11:00 pm

Posted in Oracle

Oracle: returning deleted records

Comments enabled. I *really* need your comment

From Stack Overflow:

I need to return a rowset from an Oracle procedure, and then delete them in that same procedure.

Is there a neat way of doing this without temp tables?

Something like an in-memory cursor maybe?

Oracle has a nice feature of returning cursor variables from a stored procedure or an anonymous block.

As any cursor, the returned cursor is a handle to a stable rowset.

This means that it's not affected by any changes to database that were made after the cursor was open.

We can exploit this feature to achieve the required behavior.

Let's create a sample table:
Read the rest of this entry »

Written by Quassnoi

May 13th, 2009 at 11:00 pm

Posted in Oracle

Checking event dates

Comments enabled. I *really* need your comment

From Stack Overflow:

Suppose the following table structure in Oracle:

CREATE TABLE event (
id INTEGER,
start_date DATE,
end_date DATE
)

Is there a way to query all of the events that fall on a particular day of the week?

For example, I would like to find a query that would find every event that falls on a Monday.

Figuring out if the start_date or end_date falls on a Monday is easy, but I'm not sure how to find it out for the dates between.

This is one of the range predicates which are very unfriendly to plain B-Tree indexes.

But even if there would be a range friendly index (like R-Tree), that would hardly be an improvement. Monday's make up 14.3% of all days, that means that an index if there were any, would have very low selectivity even for one-day intervals.

And if the majority of intervals last for more than one day, the selectivity of the condition yet decreases: 86% of 6-day intervals have a Monday inside.

Given the drawbacks of index scanning and joining on ROWID, we can say that a FULL TABLE SCAN will be a nice access path for this query, and we just need to represent it as an SQL condition (without bothering for its sargability)

We could check that a Monday is between end_date's day-of-week number and the range length's offset from this number:

SELECT  *
FROM    "20090507_dates".event
WHERE   6 BETWEEN MOD(start_date - TO_DATE(1, 'J'), 7) AND MOD(start_date - TO_DATE(1, 'J'), 7) + end_date - start_date

This query converts each ranges into a pair of zero-based, Tuesday-based day of week offsets, and returns all records which have day 6 (a Monday) inside the range.

Note that we don't use Oracle's TO_DATE('D') function here: starting day of week depends on NLS_TERRITORY which only leads to confusion.

Now, this query works but looks quite ugly. And if we will check for more complex conditions, it will become even uglier.

What if we need to find all ranges that contain a Friday, 13th? Or a second week's Thursday? The conditions will become unreadable and unmaintainable.

Can we do it in some more elegant way?

What if we just iterate over the days of the range and check each day for the condition? This should be much more simple than inventing the boundaries.

Let's create a sample table and try it:
Read the rest of this entry »

Written by Quassnoi

May 7th, 2009 at 11:00 pm

Posted in Oracle

Oracle: ROW_NUMBER vs ROWNUM

with 6 comments

From Stack Overflow:

I have an SQL query that looks something like this:

SELECT  *
FROM    (
SELECT  t.*, row_number() OVER (ORDER BY ID) rn
FROM    mytable t
)
WHERE   rn BETWEEN :start and :end

Essentially, it's the ORDER BY part that's slowing things down. If I were to remove it, the EXPLAIN cost goes down by an order of magnitude (over 1,000 times).

I've tried this:

SELECT  t.*, row_number() OVER (ORDER BY ID) rn
FROM    mytable t
WHERE   rownum BETWEEN :start and :end

, but this doesn't give correct results.

Is there any easy way to speed this up? Or will I have to spend some more time with the EXPLAIN tool?

First, just a quick reminder on how ROWNUM works. From Oracle's documentation:

Conditions testing for ROWNUM values greater than a positive integer are always false. For example, this query returns no rows:

SELECT  *
FROM    employees
WHERE   ROWNUM > 1

The first row fetched is assigned a ROWNUM of 1 and makes the condition false. The second row to be fetched is now the first row and is also assigned a ROWNUM of 1 and makes the condition false. All rows subsequently fail to satisfy the condition, so no rows are returned.

That's why the second query should look like this:

SELECT  *
FROM    (
SELECT  t.*, ROWNUM AS rn
FROM    mytable t
ORDER BY
paginator, id
)
WHERE   rn BETWEEN :start and :end

Now, let's see the performance. To do this, we'll create a sample table:
Read the rest of this entry »

Written by Quassnoi

May 6th, 2009 at 11:00 pm

Posted in Oracle

Selecting lowest value

Comments enabled. I *really* need your comment

From Stack Overflow:

Say I have the following data:

Name Value
Small 10
Medium 100
Large 1000

Imagine that these represent the volumes of boxes.

I have some items that I want to put in the boxes, and I want the smallest box possible.

I need an SQL query that will:

  1. Return the row with the smallest row greater than my query parameter
  2. If there is no such row, then return the largest row

I'm on Oracle 11g, so any special Oracle goodness is OK.

There certainly is a special Oracle goodness, and I'll cover it in this article.

Let's create the sample table and fill it with 1,000,000 rows to illustrate the point:
Read the rest of this entry »

Written by Quassnoi

April 24th, 2009 at 11:00 pm

Posted in Oracle