For years you have asked it questions. For years it has answered, faster than seemed reasonable. Today, we are pleased to introduce a name that better reflects the relationship: ClickHows — formerly ClickHouse — your handy, columnar, in-pocket how-to guide for matters of data.
a. A column-oriented OLAP database, written in C++, capable of answering questions about a billion rows in roughly the time it takes to blink twice.
b. The most polite way to mispronounce "ClickHouse" within earshot of a partner, in-law, or board member who has stopped listening at "Click."
c. Cf. ClickHaus (architectural), ClickHaiku (poetic), ClickHowdy (Texan), ClickHowsoever (legal).
It began, as most useful things do, with someone needing to count something quickly. The year was 2009. The thing being counted was clicks. The place was a search engine in Moscow that no longer wishes to be named. The deadline, as usual, was Tuesday.
I wrote, at first, a small program. It read columns instead of rows, because rows seemed to me wasteful in the way a person who reads only the first sentence of every paragraph still has to turn every page. The program was fast. People asked if it could be faster. I said yes, and then I was obliged to make it so.
Sixteen years later, the program is still fast. It is, by some measures, between one hundred and one thousand times faster than the things it sits next to in benchmark tables. I take no special pride in this; I am simply unwilling to make it slower, having gone to the trouble.
People keep mispronouncing the name. I have come to find this charming. "ClickHouse" was always a slightly awkward construction — a verb wedded to a building — and we knew, in our hearts, that it was destined for misadventure on the lips of partners, parents, and customers in a hurry.
So here we are. ClickHows. An almanac, of sorts, for a database that is, when you really think about it, an almanac of sorts. Same engine. Same C++. Same merge tree. Same kettle. New cover.
If you would like to know how something is done in your data — how many, how often, how quickly, how come — we will be at the desk. The kettle is on.
My dashboards take eleven seconds to load. My CTO has begun sighing audibly. Have I done something wrong?
Eleven seconds is a long time to ask a colleague to wait. The good news: in this house, we measure in milliseconds, and we generally have time left over for a biscuit. Drop your row-store, take up a column-store, and your CTO will find something else to sigh about — possibly the weather, more likely the roadmap. With warm regards, The Editors.
A column-store reads only the columns it needs.
Your row-store reads the whole row, every time. Imagine reaching into the pantry for salt and being asked to take down the entire shelf, including the flour, the lentils, and a small jar of capers nobody remembers buying. That is a row-store. We are not that.
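For readers who prefer their pantry in code, the difference can be sketched in a few lines of Python. This is a toy model, not the engine: the table, its column names, and both helper functions are invented for illustration, and "values read" stands in loosely for the work a real store does on disk.

```python
# Toy comparison of a row-oriented and a column-oriented layout.
# Every name here (ROWS, COLUMNS, sum_clicks_*) is hypothetical.

ROWS = [
    {"user_id": 1, "page": "/home", "clicks": 3, "country": "NL"},
    {"user_id": 2, "page": "/docs", "clicks": 5, "country": "DE"},
    {"user_id": 3, "page": "/home", "clicks": 2, "country": "US"},
]

# The same data pivoted into one list per column.
COLUMNS = {name: [row[name] for row in ROWS] for name in ROWS[0]}


def sum_clicks_row_store(rows):
    """Sum `clicks` while touching every field of every row."""
    total = 0
    values_read = 0
    for row in rows:
        for name, value in row.items():
            values_read += 1  # the whole shelf comes down
            if name == "clicks":
                total += value
    return total, values_read


def sum_clicks_column_store(columns):
    """Sum `clicks` while touching only the `clicks` column."""
    clicks = columns["clicks"]
    return sum(clicks), len(clicks)


row_total, row_reads = sum_clicks_row_store(ROWS)
col_total, col_reads = sum_clicks_column_store(COLUMNS)

print(f"row layout:    total={row_total}, values read={row_reads}")
print(f"column layout: total={col_total}, values read={col_reads}")
```

Both layouts arrive at the same total. The row layout reads every field to get there, while the column layout reads only the salt. The gap widens with every column the query does not ask for.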
My investors keep asking what database we use. "ClickHouse" sounds like a furniture store. Will "ClickHows" sound any better in a board meeting?
It will sound considerably better, because it implies you know how things are done. "How" is the most board-meeting-friendly word in the English language. Pair it with a verb of your choosing — "ClickHows scales," "ClickHows queries," "ClickHows answers" — and you will find decks practically write themselves. Yours in advisory tone, The Editors.
"How" outperforms "House" in 14 of 16 board-deck word-clouds.
Source: a study we just made up while writing this column. Methodology available upon request, though we will not be answering any.
"We didn't change the engine. We just changed the verb."
| Place | Entrant | Class | Rows / sec | Time to finish |
|---|---|---|---|---|
| i. | ClickHows | Column-store, vectorized | 1,043,000,000 | 0.094 s |
| ii. | A Well-Regarded Warehouse | Cloud, columnar | 81,200,000 | 1.21 s |
| iii. | A Patient Postgres | Row-store, transactional | 9,800,000 | 10.2 s |
| iv. | A Spreadsheet, Bravely | Open in another tab | ~ 240,000 | 7m 04s ⚠ |
| v. | A Loop, Hand-Written | Python, sincere | 3,400 | — see Tuesday |
All times measured on the same modest server, sober, single kettle, room temperature. Your mileage may, of course, vary; ours rarely does.
Light ingest from the south-east, moderate scans through midday. Compression high. Possibility of a brief replication shower around tea-time. ZooKeeper has retired; Keeper takes the watch.
Serves: One analyst & their CTO.
Time: Less than the kettle.
Tip: a pinch of PRIMARY KEY never hurts. A pinch of OFFSET, however, often does.
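A note on the tip, for the curious. The usual complaint about OFFSET is that the skipped rows still have to be walked past before they are thrown away, whereas a lookup on a sorted key can jump straight to where the last page ended. The Python sketch below models that general idea only. It assumes a hypothetical table whose keys are already sorted, uses a plain binary search as a stand-in for an index, and makes no claim about how any particular engine implements either approach.

```python
from bisect import bisect_right

# Hypothetical table: 500 rows, sorted by an even-numbered primary key.
KEYS = list(range(0, 1000, 2))


def page_by_offset(keys, offset, limit):
    """OFFSET-style paging: walk past `offset` rows, then keep `limit` rows."""
    rows_walked = 0
    page = []
    for key in keys:
        rows_walked += 1
        if rows_walked <= offset:
            continue  # read, then discarded
        page.append(key)
        if len(page) == limit:
            break
    return page, rows_walked


def page_by_key(keys, last_seen_key, limit):
    """Keyset-style paging: binary-search to the first key after the last one seen."""
    start = bisect_right(keys, last_seen_key)
    page = keys[start:start + limit]
    return page, len(page)


offset_page, offset_walked = page_by_offset(KEYS, offset=400, limit=5)
key_page, key_walked = page_by_key(KEYS, last_seen_key=798, limit=5)

print(f"OFFSET paging: {offset_page} after walking {offset_walked} rows")
print(f"key paging:    {key_page} after walking {key_walked} rows")
```

Both calls return the same five rows. The OFFSET version walks 405 rows to find them; the keyed version walks five. In this toy, at least, the pinch of OFFSET costs more the deeper into the table one goes.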
Survey of 1,200 partners. Margin of error: one eye-roll either way.
All anagrams of "ClickHouse." None of them survived the meeting.
Same hostnames, same ports, same drivers. Your DSN does not need to know anything has changed; in fact, do not tell it.
Begin with what you'd like to know. Most queries, like most letters, are improved by knowing what they're for before you start.
During which time, you may stretch, sip tea, or remember an unrelated thing you meant to do on Wednesday. We will be done first.
Returned in tidy rows, exactly the size you asked for. Suitable for dashboards, for memos, and for ending a meeting fifteen minutes early.