EXPLAIN EXTENDED

How to create fast database queries

Presentation layer

Comments enabled. I *really* need your comment

A question frequently asked by many web application developers:

Where do I build a presentation layer for my data?

Do I do it on SQL side or on PHP / ASP /whatever side?

This is a very interesting question.

A famous Chinese proverb (which like the majority of famous Chinese proverbs is invented by a Western journalist) says A picture is worth a thousand words.

A picture is worth a thousand words

A picture is worth a thousand words.
Image by zedzap

How do you think, why it is? What's so good in a picture?

Seems to be a strange question. A picture just tells more. If you look at a picture, you get an overall impression, which can be hardly expressed using words. Words just don't pull the right strings in your mind, those that a picture does. And the words are harder to read, they are more numerous. And a dozen of other reasons.

However, the main reason seems to be this: you cannot give a complete description of a picture using words.

But can it be so that a word is worth a thousand pictures?

Sure it can. And any book fan who had seen the screen version of his favorite work can confirm it.

From Douglas Adams's masterpiece The Hitchiker's Guide to the Galaxy:

A hole had just appeared in the Galaxy. It was exactly a nothingth of a second long, a nothingth of an inch wide, and quite a lot of million light years from end to end.

Or this one:

Five wild Event Maelstroms swirled in vicious storms of unreason and spewed up a pavement. On the pavement lay Ford Prefect and Arthur Dent gulping like half-spent fish.

Adams knew how to arrange his words so that they draw a perfect, vivid picture in the reader's mind.

The problem is that everyone gets his very own picture.

What do you imagine when reading of a storm of unreason? Can you show a nothingth of an inch wide on your fingers? Can you sketch some Vogon poetry to show how really bad it is?


Prostetnic Vogon Jeltz reads his poem. Scary, isn't it?

Prostetnic Vogon Jeltz reads his poem. Scary, isn't it?

Draw your own illustration of a Vogon's poem with a pencil, show it to your friends without telling them what did you draw and see if they can name it. Color it and see if it helps. Film it and get the Garth Jennings's pitiful excuse for a movie.

There is nothing you can do but to cite the author's message expressed in words. The picture won't help.

When describing a picture with words (or words with a picture), you recode. It's a one-way lossy recoding. You can neither describe a flower over the phone, nor tell a joke to a person who doesn't share a common language with you.

It's very hard to make a good screen version of Douglas Adams' works, it's perfectly plausible with, say, The Lord of the Rings. People just love to see the battles and the Nazguls and these kinds of stuff on a big screen with decent sound.

The difference is that a nothingth of an inch wide has no recognizable visual pattern, while Nazguls and battles do. Of course, Peter Jackson's version of a Nazgul differs much from the image you have built in your mind when first read the Tolkien's book, but you still can easily tell it by mere looking at it. This is also a lossy recoding, but at least it's two-way.

Peter Jackson can read a Tolkien's word and make a movie so that you can look at the movie and name this word.

OK. What does this all have to do with SQL and PHP?

The art of visual data presentation has very much in common with painting, acting and film making. You need to express information though the visual media. SQL databases operate with sets, human mind operates with visual images, and you need to make the latter out of the former.

Visual data is usually built using browsers these days, and browsers operate with HTML, which is a series of strings. So it's safe to say that to show people data in nice visual format, you just need to make a good series of strings. Just put your <table>'s, <img>'s and <div>'s in the right order, and you get a nice picture in you browser.

Sometimes you just need to make compromises. You cannot show millions of records at once and hope they will be understood. You need to aggregate your data and build monthly reports with detalization loaded on demand. You cannot fit hundred fields on one screen. You need to decide what will be shown in bold, what in red and what will be scrolled away. If you lost your database you wouldn't be able to restore it from the screenshot. A one-way lossy recoding.

Sometimes it's perfectly possible to show everything. Just like with Adams and Tolkien.

One thing, however, remains the same.

If your data are stored in a relational database, inevitably it is but a number of sets. To make a good visual data representation, you need to transform these sets into an appropriate series of strings, all these <table>'s, <img>'s and <div>'s.

To do this you need good tools.

  • When an actor acts, he reads the words LOVE, LUST or ENVY in the script, does poses and faces and makes you see the love, the lust or the envy. His face muscles are the tools he uses to make the transformation.
  • A sculptor uses a hammer and a chisel
  • A movie director's tools are cameramen, actors, a folding chair, a trumpet and a baseball cap

To transform the sets into the strings, you also have tools: SQL and PHP.

Take a chisel and remove everything that doesn't look like David

Take a chisel and remove everything that doesn't look like David
Image by fujibatocom

Let's see which language is good for which task.

SQL is a set-oriented language. It abstracts sets and operates on sets. A set in, a set out. There is almost nothing SQL cannot do with a set, if relational algebra allows that at all.

PHP can abstract a set too, as long as it is representable in form of hashes, arrays, scalar variables or their combinations. It can read sets, but it sucks at transforming them. A simple SQL one-liner that joins two sets is a whole page of PHP code, which is less readable, less flexible and less efficient.

SQL, on the other hand, sucks at string processing.

CamelCase a phrase in SQL Server? Oh, this is hard. You need to write a UDF, a loop in a UDF, then take the string, find all spaces, and no, there are no regular expressions, so you need too… well, enough, you get the idea.

Or just call PHP. CamelCase? PHP eats CamelCase for breakfast.

You should see it by now. Make SQL operate on sets and PHP operate on strings, arrays and hashes.

Now, some examples.

Example 1

I have John, Jim and Mary in my database. On my website, I need to output them as John, Jim and Mary

You have a set of John, Jim and Mary.

You need to get a string: <span class="names">John, Jim <em> and Mary</span> (you do use HTML and CSS, don't you?)

So you perform the following query on SQL side:

SELECT  name
FROM    names

, put the output as an array or hash, and implode the strings in PHP.

Example 2

I have 10,000 Johns, 5,000 Jims and 3,000 Marys in my database. On my website, I need to output them as 10,000 Johns, 5,000 Jims and 3,000 Marys

Converting 18,000 rows to three is a set operation. PHP would do it a single-threaded, interpreted, automatic-memory-managed way, which has to be coded manually. SQL does it in a blink of eye with a one-line query.

So you just make a query:

SELECT  name, COUNT(*)
FROM    names
GROUP BY
        name

, get an array of hashes, and let PHP cope with it.

Example 3

I have a name and a surname as table fields in my database. Name says John and surname says Smith. I need to output them as John Smith

Despite the fact all databases support string concatenation, you still output two fields and concatenate them in PHP.

Why? This is a very simple task. SQL and PHP can do it equally efficiently.

The data flows one way. SQLPHPHTMLscreen. And each act of data recoding adds some entropy.

Like in movies. One-way transformation, remember?

A script says: John gives Mary a thumb-up. Mary smiles and bows jokingly.

An actor playing John makes a fist, extends his thumb and displays the resulting finger figure to Mary. Mary smiles: she sees that John shows her a sign of approval, and she bends her head towards John to show him she likes that.

Now, the same action on Thai stage. A thumb up is an invitation to sex intercouse in Thai culture. Mary, being a decent girl, should express embarassement and indignation. A bow, on the other hand, means obedience and submissiveness.

The whole act fails.

Is it really an approval? Who knows!

Is it really an approval? Who knows!
Image by acrylica

It would be much better if the script would say something like: John displays approval of Mary's actions, Mary returns him a sign of gratitude, so that the actors and the director could themselves decide a best way to show.

Cultural differences apply to data presentation just as well as to common gestures. English speakers use a period as a decimal separator, while Russians use a comma. Germans put a given name before the surname, while Hungarians do the opposite. Italians append a question mark to the question, while Spanish speakers both append one and prepend an inverted one, etc.

Everything listed above is doable in SQL. However, it cannot be undone easily: an amount of entropy could have been added.

Who is John David Smith? Is he Smith, John Davis or Davis Smith, John?

This information is lost, just like in the script above.

Did the author prepare his play for American stage, so that John approved Mary's actions and she liked it? Or maybe John wanted to show Mary a full moon over the cornfield and wanted to make it clear for Thai people? The script says nothing of it, it just says a thumb-up and a bow. Everything else is a guess — an educated guess, probably, but still a guess.

Same thing, of course, applies to PHP to HTML conversion: you don't hardcode tags and styles, you rely on classes and CSS instead to add as few entropy as possible on the earlier stages of data transformation.

Let's summarize everything said above.

  • Use the right tool for right operation. Do set-based operations on SQL side, string transformations on PHP side.
  • When in doubt, use the tool which is closer to the end of the transformation path. Each transformation adds entropy. If in doubt, use PHP instead of SQL, unless you're absolutely sure PHP is a wrong tool for that.

Just pick a right tool and always delay things you can't undo.

Written by Quassnoi

August 16th, 2009 at 11:00 pm

Posted in Miscellaneous

Leave a Reply