Matching 3 of 4
Comments enabled. I *really* need your comment
From Stack Overflow:
Say I have a query like this:
SELECT * FROM my_table WHERE name = "john doe" AND phone = "8183321234" AND email = "johndoe@yahoo.com" AND address = "330 some lane"But say I only need 3 conditions out of the 4 to match.
I know I can write a very long query with several
OR
's, but I was wondering if there was a feature for this?
A nice and elegant solution proposed by Tomalak says:
SELECT * FROM my_table WHERE CASE WHEN name = "john doe" THEN 1 ELSE 0 END + CASE WHEN phone = "8183321234" THEN 1 ELSE 0 END + CASE WHEN email = "johndoe@yahoo.com" THEN 1 ELSE 0 END + CASE WHEN address = "330 some lane" THEN 1 ELSE 0 END >= 3
, but, as noted by the author, it does not use the indexes.
Since the values seem to be quite selective, it will be better to use indexes on them.
Let's create sample tables:
Table creation details
CREATE TABLE filler ( id INT NOT NULL PRIMARY KEY AUTO_INCREMENT ) ENGINE=Memory; CREATE TABLE t_info ( id INT NOT NULL PRIMARY KEY AUTO_INCREMENT, name VARCHAR(200) NOT NULL, phone VARCHAR(20) NOT NULL, email VARCHAR(200) NOT NULL, address VARCHAR(200) NOT NULL, KEY ix_info_name (name), KEY ix_info_phone (phone), KEY ix_info_email (email), KEY ix_info_address (address) ) ENGINE=InnoDB DEFAULT CHARSET=utf8; DELIMITER $$ CREATE PROCEDURE prc_filler(cnt INT) BEGIN DECLARE _cnt INT; SET _cnt = 1; WHILE _cnt <= cnt DO INSERT INTO filler SELECT _cnt; SET _cnt = _cnt + 1; END WHILE; END $$ DELIMITER ; START TRANSACTION; CALL prc_filler(1000000); COMMIT; INSERT INTO t_info (id, name, phone, email, address) SELECT id, CONCAT( CASE FLOOR(RAND(20090417) * 10) WHEN 0 THEN 'Alice' WHEN 1 THEN 'Bob' WHEN 2 THEN 'Chris' WHEN 3 THEN 'Dave' WHEN 4 THEN 'Eve' WHEN 5 THEN 'Frank' WHEN 6 THEN 'Gill' WHEN 7 THEN 'Hugh' WHEN 8 THEN 'Ian' WHEN 9 THEN 'Jon' END, ' ', CASE FLOOR(RAND(20090417 << 1) * 10) WHEN 0 THEN 'Angela' WHEN 1 THEN 'Bracco' WHEN 2 THEN 'Chianese' WHEN 3 THEN 'de Matteo' WHEN 4 THEN 'Enzo' WHEN 5 THEN 'Falco' WHEN 6 THEN 'Gandolfini' WHEN 7 THEN 'Hadria' WHEN 8 THEN 'Imperioli' WHEN 9 THEN 'James' END ), CONCAT('+155580000', 10 + FLOOR(RAND(20090417 << 2) * 90)), CONCAT(FLOOR(RAND(20090417 << 3) * 100), '@example.com'), CONCAT(FLOOR(RAND(20090417 << 4) * 100), ' North Lane') FROM filler; [/sourcecode] </div> Here we have <strong>100</strong> distinct values on each of the fields (except phone, which has <strong>90</strong>). The straightforward query: SELECT * FROM t_info t WHERE CASE WHEN name = 'Eve Chianese' THEN 1 ELSE 0 END + CASE WHEN phone = '+15558000042' THEN 1 ELSE 0 END + CASE WHEN email = '42@example.com' THEN 1 ELSE 0 END + CASE WHEN address = '42 North Lane' THEN 1 ELSE 0 END >= 3
id | name | phone | address | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
176846 | Eve Chianese | +15558000066 | 42@example.com | 42 North Lane | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
445548 | Eve Chianese | +15558000099 | 42@example.com | 42 North Lane | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
988383 | Alice de Matteo | +15558000042 | 42@example.com | 42 North Lane | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
3 rows fetched in 0.0002s (3.0312s) |
uses full table scan:
EXPLAIN EXTENDED SELECT * FROM t_info t WHERE CASE WHEN name = 'Eve Chianese' THEN 1 ELSE 0 END + CASE WHEN phone = '+15558000042' THEN 1 ELSE 0 END + CASE WHEN email = '42@example.com' THEN 1 ELSE 0 END + CASE WHEN address = '42 North Lane' THEN 1 ELSE 0 END >= 3
id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | t | ALL | 1000204 | 100.00 | Using where | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
1 row fetched in 0.0003s (0.0030s) |
and runs for more than 3 seconds.
To enable indexes we need:
- Make a
range scan
of each of the conditions UNION ALL
the resultsets (it'sUNION ALL
the resultsets, notUNION
all the resultsets!)COUNT(*)
the matches grouped byid
- Filter out those
id
's withCOUNT(*) < 3
- Join with the original table
Here's the query:
SELECT i.* FROM ( SELECT id, COUNT(*) FROM ( SELECT id FROM t_info t WHERE name = 'Eve Chianese' UNION ALL SELECT id FROM t_info t WHERE phone = '+15558000042' UNION ALL SELECT id FROM t_info t WHERE email = '42@example.com' UNION ALL SELECT id FROM t_info t WHERE address = '42 North Lane' ) q GROUP BY id HAVING COUNT(*) >= 3 ) dq JOIN t_info i ON i.id = dq.id
id | name | phone | address | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
176846 | Eve Chianese | +15558000066 | 42@example.com | 42 North Lane | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
445548 | Eve Chianese | +15558000099 | 42@example.com | 42 North Lane | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
988383 | Alice de Matteo | +15558000042 | 42@example.com | 42 North Lane | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
3 rows fetched in 0.0002s (0.1472s) |
This runs much faster (0.15 seconds) and of course uses all four indexes:
EXPLAIN EXTENDED SELECT i.* FROM ( SELECT id, COUNT(*) FROM ( SELECT id FROM t_info t WHERE name = 'Eve Chianese' UNION ALL SELECT id FROM t_info t WHERE phone = '+15558000042' UNION ALL SELECT id FROM t_info t WHERE email = '42@example.com' UNION ALL SELECT id FROM t_info t WHERE address = '42 North Lane' ) q GROUP BY id HAVING COUNT(*) >= 3 ) dq JOIN t_info i ON i.id = dq.id
id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | PRIMARY | <derived2> | ALL | 3 | 100.00 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
1 | PRIMARY | i | eq_ref | PRIMARY | PRIMARY | 4 | dq.id | 1 | 100.00 | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
2 | DERIVED | <derived3> | ALL | 40977 | 100.00 | Using temporary; Using filesort | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
3 | DERIVED | t | ref | ix_info_name | ix_info_name | 602 | 12450 | 100.00 | Using where; Using index | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
4 | UNION | t | ref | ix_info_phone | ix_info_phone | 62 | 8330 | 100.00 | Using where; Using index | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
5 | UNION | t | ref | ix_info_email | ix_info_email | 602 | 10048 | 100.00 | Using where; Using index | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
6 | UNION | t | ref | ix_info_address | ix_info_address | 602 | 9616 | 100.00 | Using where; Using index | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
UNION RESULT | <union3,4,5,6> | ALL | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
8 rows fetched in 0.0008s (0.1483s) |
As was proposed by Brent Baisley, it's actually possible to make this query more simple by using INDEX MERGE
:
EXPLAIN EXTENDED SELECT * FROM t_info t WHERE ( name = 'Eve Chianese' OR phone = '+15558000042' OR email = '42@example.com' OR address = '42 North Lane' ) AND CASE WHEN name = 'Eve Chianese' THEN 1 ELSE 0 END + CASE WHEN phone = '+15558000042' THEN 1 ELSE 0 END + CASE WHEN email = '42@example.com' THEN 1 ELSE 0 END + CASE WHEN address = '42 North Lane' THEN 1 ELSE 0 END >= 3
id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | SIMPLE | t | index_merge | ix_info_name,ix_info_phone,ix_info_email,ix_info_address | ix_info_name,ix_info_phone,ix_info_email,ix_info_address | 602,62,602,602 | 40444 | 100.00 | Using union(ix_info_name,ix_info_phone,ix_info_email,ix_info_address); Using where | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
1 row fetched in 0.0004s (0.0032s) |
Unfortunately, as I already mentioned in one of the previos articles, INDEX MERGE
implementation is not very efficient in MySQL, that's why this query is more slow than that with UNION ALL
:
SELECT * FROM t_info t WHERE ( name = 'Eve Chianese' OR phone = '+15558000042' OR email = '42@example.com' OR address = '42 North Lane' ) AND CASE WHEN name = 'Eve Chianese' THEN 1 ELSE 0 END + CASE WHEN phone = '+15558000042' THEN 1 ELSE 0 END + CASE WHEN email = '42@example.com' THEN 1 ELSE 0 END + CASE WHEN address = '42 North Lane' THEN 1 ELSE 0 END >= 3
id | name | phone | address | ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
176846 | Eve Chianese | +15558000066 | 42@example.com | 42 North Lane | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
445548 | Eve Chianese | +15558000099 | 42@example.com | 42 North Lane | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
988383 | Alice de Matteo | +15558000042 | 42@example.com | 42 North Lane | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
3 rows fetched in 0.0002s (0.3427s) |
Summary: for a query that requries some of the indexed values to match, it's better to use UNION ALL
of single matches and filter by COUNT(*)
'ing the matched records.
Written by Quassnoi
April 17th, 2009 at 11:00 pm
Posted in MySQL
Leave a Reply
Subscribe
Subscribe by email
Contacts
Should I?
Yes. Feel free to ask questions and write me. An interesting question is a pleasure to answer, and I really enjoy receiving feedback
Recent articles
- Happy New Year: Diffusion Model image generator in about 700 lines of pure SQL
- Happy New Year: GPT in 500 lines of SQL
- Happy New Year: solving the Rubik’s Cube in SQL
- A good first word for Wordle
- Happy New Year: quantum computer emulator in SQL
Calendar
Archives
- December 2024
- December 2023
- December 2022
- January 2022
- December 2021
- December 2020
- December 2019
- December 2018
- December 2017
- December 2016
- December 2015
- December 2014
- July 2014
- December 2013
- October 2013
- September 2013
- August 2013
- July 2013
- June 2013
- May 2013
- March 2013
- January 2013
- December 2012
- December 2011
- June 2011
- April 2011
- March 2011
- February 2011
- December 2010
- November 2010
- October 2010
- September 2010
- August 2010
- July 2010
- June 2010
- May 2010
- April 2010
- March 2010
- February 2010
- January 2010
- December 2009
- November 2009
- October 2009
- September 2009
- August 2009
- July 2009
- June 2009
- May 2009
- April 2009
- March 2009
- February 2009