Hey, Ash, where are we?

17 September 2008 21:50

I'm currently fighting with MySQL. The following takes too long:

mysql> SELECT COUNT(id) FROM q_archive;
+-----------+
| COUNT(id) |
+-----------+
|   2738048 |
+-----------+
1 row in set (17.95 sec)

I would like it to take significantly less time, even with memcached being in use it gets hit too often. I've added an index to the table - but I didn't expect that to help, and I wasn't disappointed.

Ho hum.

Maybe another case where flat-files are best. Sure counting them would take a while, but once I've counted them I can just `cat sum`.

This is probably a case where tweaking memory of MySQL would help. But I'm fairly certain if I start messing with that I'll get into trouble with other parts of my site.

ObFilm: The Evil Dead

Tags: mysql, random | 16 comments

Comments on this entry

Chris Gregors at 18:58 on 17 September 2008

Something is horribly wrong with your instance. Here is a comparable select from a database of mine.

mysql> select count(id) from r_links;
 -----------
| count(id) |
 -----------
|   2984572 |
 -----------
1 row in set (0.00 sec)

Table Type == MyISAM

mysql> desc r_links;
 ---------- ------------ ------ ----- --------- ----------------
| Field    | Type       | Null | Key | Default | Extra          |
 ---------- ------------ ------ ----- --------- ----------------
| id       | bigint(20) | NO   | PRI | NULL    | auto_increment |
| hn       | text       | NO   | MUL |         |                |
| sa       | text       | NO   | MUL |         |                |
| la       | text       | NO   | MUL |         |                |
| lp       | text       | NO   |     |         |                |
| ra       | text       | NO   | MUL |         |                |
| rp       | text       | NO   |     |         |                |
| state    | text       | NO   |     |         |                |
| datetime | text       | NO   |     |         |                |
 ---------- ------------ ------ ----- --------- ----------------
9 rows in set (0.00 sec)

Steve Kemp at 19:13 on 17 September 2008

You could be right Chris.

The system is a little heavily loaded, but even so I'd like it to be faster. I'm just a little unsure where to start..

Scott Ashcroft at 19:51 on 17 September 2008

If it's a frequent operation why not add some triggers to keep track of the number of rows.

David Ulevitch at 19:53 on 17 September 2008

Something is terribly wrong, the row count is stored in the meta data and the mysql query parser knows that.

Ronaldo at 19:56 on 17 September 2008

Did you try count(1). I dont't remember exactly now, but I think it is supposed to be faster than count(*).
I am not sure that would solve your problem, since you are using count(id)

Steve Kemp at 19:58 on 17 September 2008

A collection of triggers might be a good solution. Thanks.

I'm going to tweak the memory settings a little first to see if that helps improve performance generally - it seems that many operations are taking longer than they should, but this is a pathological case.

Obviously something is screwed up but I'm not too familiar with these kind of problems...

Steve Kemp at 19:59 on 17 September 2008

Scott: COUNT(1) is about the same, and what I'm doing. I know that is faster than COUNT(*).

David - I didn't realize the row-count was generally available in the meta-data, though I guess I might have guessed, that suggests something more serious is wrong with my database setup.

Giuseppe Sacco at 20:05 on 17 September 2008

If I remember right count(field) is slower compared to count(*) since the first one have to check that field is not null (so it will probably access the index and data) while the second one could avoid to access data. You should also "analyze table q_archive".
Would you try if anything changes in your case?

Steve Kemp at 20:09 on 17 September 2008

Giuseppe - I'm happy to try both ways. Once I've finished running " optimize table q_archive;" I'll post what the difference is.

Steve Kemp at 20:20 on 17 September 2008

mysql> select COUNT(id) FROM q_archive;
 ----------- 
| COUNT(id) |
 ----------- 
|   2752313 | 
 ----------- 
1 row in set (10.00 sec)


mysql> select COUNT(*) FROM q_archive;
 ---------- 
| COUNT(*) |
 ---------- 
|  2752598 | 
 ---------- 
1 row in set (7.73 sec)

I'm not sure if the speedup there is real though - it seems to be more a caching artifact.

Gustavo Noronha at 20:28 on 17 September 2008

Something is indeed seriously screwed up for a count to take that long.
Regarding your comment on using `cat sum', though, notice that you _can_ do something like that with a database; you only need to have a "static view", in more robust databases, or a real table that is updated from time to time, or by a trigger on every insert/delete, so no real gain in dropping the database for flat files there.

niol at 21:12 on 17 September 2008

Already said, but more discussion in there : http://www.mysqlperformanceblog.com/2007/04/10/count-vs-countcol/

Rob Wilderspin at 07:42 on 18 September 2008

Hi Steve, I'm inferring that it's an InnoDB table, which means that MySQL has to scan the whole table to do the row count, whereas if it were MyISAM that meta-information is stored at all times. If this is important enough then ALTER the table to make it use the MyISAM storage engine, bearing in mind the trade-offs that that brings. Some extra detail: http://www.mysqlperformanceblog.com/2006/12/01/count-for-innodb-tables/ Rob

Steve Kemp at 20:56 on 18 September 2008

Rob: I've converted from InnoDB to MyISAM and the performance has just improved considerably.

I will have to read more about the pros/cons of the two table types, but thanks very much for giving me the prod.

(The table in question gets about 50,000 inserts 4 times an hour. But otherwise is queried N times a second - so read performance, and the COUNT(id) query is more important to me than then insert performance.)

Alex at 20:01 on 19 September 2008

Just bear in mind that whilst you INSERT with MyISAM your SELECT queries will block. Not the same with InnoDB. Various tweaks can be applied to improve that such as low_priority_updates and concurrent_insert.
Sounds like MyISAM is more your cup of tea though.

Andrew Pollock at 20:42 on 28 September 2008

I went to a MySQL performance-hack related tech talk at Google once, and one thing I took away from it was that EXPLAIN SELECT COUNT(*) FROM foo; was significantly faster than SELECT COUNT(*) FROM foo;
I can't remember if it was as accurate though. Try it and see.

Hey, Ash, where are we?

Comments on this entry

Recent Posts