From Stack Overflow:
There is a table that stores signal statuses that come from different devices.
- SS1 and SS2 signals are inserted to table in random times
- If either of SS1 and SS2 signal statuses is up, then the resulting signal should be up
- If both SS1 and SS2 signal statuses are down, then resulting signal should be down
I want to prepare a query that shows the result signal status changes according to SS1 and SS2 signals
Each record deals with only one signal type here: either SS1 or SS2. To obtain the signal statuses we should query the cumulative values of previous records.
If a record describes a change in SS2, we should query for the most recent change to SS1 that had been recorded so far to obtain the SS1's current status.
In systems other than Oracle, the previous value of a signal could be easily queried using subselects with
LIMIT clauses. But Oracle does not support correlated queries nested more than one level deep, and limiting a subquery result to a single record (which is required by a subquery) requires it (ORDER BY should be nested). This makes constructing such a subquery in Oracle quite a pain.
However, in Oracle, these things can be queries using analytics functions much more efficiently.
Let's create a sample table:
Table creation details
BEGIN DBMS_RANDOM.seed(20100120); END; / CREATE TABLE t_signal ( id NOT NULL, signal NOT NULL, status NOT NULL, ts NOT NULL ) AS SELECT level, CASE WHEN DBMS_RANDOM.value < 0.5 THEN 'SS1' ELSE 'SS2' END, ROUND(DBMS_RANDOM.value), TO_DATE('20.01.2010', 'dd.mm.yyyy') - level / 86400 FROM dual CONNECT BY level <= 1000000 / ALTER TABLE t_signal ADD CONSTRAINT pk_signal_id PRIMARY KEY (id) / CREATE INDEX ix_signal_ts_id ON t_signal (ts, id) /
This table contains 1,000,000 records filled with random states of random signals once per second.
Each record describes a state of a certain signal at a given moment of time. To find out the state of another signal at that moment of time we need to know the state held by the latest record for that signal.
To find out that state we can employ analytical function,
With a default
RANGE (that is
BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW), this function just gives the value of the expression held by the current row, and, therefore, is quite useless: it gives the same result as a plain expression would give, without any functions.
col is unique and defines the order of the rows, then
LAST_VALUE(expression) OVER (ORDER BY col) is just a quite expensive synonym for
However, this function's behavior can be changed by adding
IGNORE NULLS clause. This clause makes the function return the last value so far which is not a
Now, returning the cumulative value of any signal becomes quite simple. We should just just make two expressions which substitute
NULL instead of the signal status for the
wrong signals. The
LAST_VALUE (IGNORE NULLS) over these expressions will persist until rewritten by the new states of their corresponding signals.
Let's check it:
SELECT * FROM ( SELECT /*+ FIRST_ROWS */ s.*, DECODE(signal, 'SS1', status, NULL) AS exp1, DECODE(signal, 'SS2', status, NULL) AS exp2, LAST_VALUE(DECODE(signal, 'SS1', status, NULL) IGNORE NULLS) OVER (ORDER BY ts, id ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS ss1, LAST_VALUE(DECODE(signal, 'SS2', status, NULL) IGNORE NULLS) OVER (ORDER BY ts, id ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS ss2 FROM t_signal s ORDER BY ts, id ) s2 WHERE rownum <= 15
|15 rows fetched in 0.0012s (0.0007s)|
SELECT STATEMENT COUNT STOPKEY VIEW WINDOW NOSORT TABLE ACCESS BY INDEX ROWID, 20100120_signal.T_SIGNAL INDEX FULL SCAN, 20100120_signal.IX_SIGNAL_TS_ID
In the resultset above,
EXP2 show the changes in the signal states of
NULL's if the current record does not describe a change in the appropriate signals.
LAST_VALUE (IGNORE NULLS) over these expressions show their cumulative values. These values can be used to calculate the resulting signal state.
Note that using
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW made Oracle to use
WINDOW NOSORT and this query is instant, since the
LAST_VALUE can be buffered.
To calculate the resulting signal, we should just return the
ss2 (returned by
SELECT SUM(GREATEST(ss1, ss2)) FROM ( SELECT s.*, DECODE(signal, 'SS1', status, NULL) AS exp1, DECODE(signal, 'SS2', status, NULL) AS exp2, LAST_VALUE(DECODE(signal, 'SS1', status, NULL) IGNORE NULLS) OVER (ORDER BY ts, id ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS ss1, LAST_VALUE(DECODE(signal, 'SS2', status, NULL) IGNORE NULLS) OVER (ORDER BY ts, id ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS ss2 FROM t_signal s ORDER BY ts, id ) s2
|1 row fetched in 0.0003s (3.0781s)|
SELECT STATEMENT SORT AGGREGATE VIEW WINDOW SORT TABLE ACCESS FULL, 20100120_signal.T_SIGNAL
For the sake of brevity, we select the
SUM of the resulting signal states which gives us the number of records with resulting signal up. As it should be, this number roughly amounts to 75% of the total number of records.
The query uses a single sort (which in case of the whole table is faster than traversing the index), buffers the results and completes in 3 seconds which is almost as fast as a plain query with
ORDER BY over the same dataset would complete.