About Archive Tags RSS Feed

 

Hey, Ash, where are we?

17 September 2008 21:50

I'm currently fighting with MySQL. The following takes too long:

mysql> SELECT COUNT(id) FROM q_archive;
+-----------+
| COUNT(id) |
+-----------+
|   2738048 |
+-----------+
1 row in set (17.95 sec)

I would like it to take significantly less time, even with memcached being in use it gets hit too often. I've added an index to the table - but I didn't expect that to help, and I wasn't disappointed.

Ho hum.

Maybe another case where flat-files are best. Sure counting them would take a while, but once I've counted them I can just `cat sum`.

This is probably a case where tweaking memory of MySQL would help. But I'm fairly certain if I start messing with that I'll get into trouble with other parts of my site.

ObFilm: The Evil Dead

| 16 comments

 

Comments on this entry

icon Steve Kemp at 20:20 on 17 September 2008
mysql> select COUNT(id) FROM q_archive;
 ----------- 
| COUNT(id) |
 ----------- 
|   2752313 | 
 ----------- 
1 row in set (10.00 sec)
mysql> select COUNT(*) FROM q_archive; ---------- | COUNT(*) | ---------- | 2752598 | ---------- 1 row in set (7.73 sec)

I'm not sure if the speedup there is real though - it seems to be more a caching artifact.

icon Andrew Pollock at 20:42 on 28 September 2008
I went to a MySQL performance-hack related tech talk at Google once, and one thing I took away from it was that EXPLAIN SELECT COUNT(*) FROM foo; was significantly faster than SELECT COUNT(*) FROM foo;
I can't remember if it was as accurate though. Try it and see.
icon Alex at 20:01 on 19 September 2008
Just bear in mind that whilst you INSERT with MyISAM your SELECT queries will block. Not the same with InnoDB. Various tweaks can be applied to improve that such as low_priority_updates and concurrent_insert.
Sounds like MyISAM is more your cup of tea though.
icon Steve Kemp at 20:56 on 18 September 2008

Rob: I've converted from InnoDB to MyISAM and the performance has just improved considerably.

I will have to read more about the pros/cons of the two table types, but thanks very much for giving me the prod.

(The table in question gets about 50,000 inserts 4 times an hour. But otherwise is queried N times a second - so read performance, and the COUNT(id) query is more important to me than then insert performance.)

icon Rob Wilderspin at 07:42 on 18 September 2008
Hi Steve, I'm inferring that it's an InnoDB table, which means that MySQL has to scan the whole table to do the row count, whereas if it were MyISAM that meta-information is stored at all times. If this is important enough then ALTER the table to make it use the MyISAM storage engine, bearing in mind the trade-offs that that brings. Some extra detail: http://www.mysqlperformanceblog.com/2006/12/01/count-for-innodb-tables/ Rob
icon niol at 21:12 on 17 September 2008
Already said, but more discussion in there : http://www.mysqlperformanceblog.com/2007/04/10/count-vs-countcol/
icon Gustavo Noronha at 20:28 on 17 September 2008
Something is indeed seriously screwed up for a count to take that long.
Regarding your comment on using `cat sum', though, notice that you _can_ do something like that with a database; you only need to have a "static view", in more robust databases, or a real table that is updated from time to time, or by a trigger on every insert/delete, so no real gain in dropping the database for flat files there.
icon Chris Gregors at 18:58 on 17 September 2008

Something is horribly wrong with your instance. Here is a comparable select from a database of mine.

mysql> select count(id) from r_links;
 -----------
| count(id) |
 -----------
|   2984572 |
 -----------
1 row in set (0.00 sec)

Table Type == MyISAM

mysql> desc r_links;
 ---------- ------------ ------ ----- --------- ----------------
| Field    | Type       | Null | Key | Default | Extra          |
 ---------- ------------ ------ ----- --------- ----------------
| id       | bigint(20) | NO   | PRI | NULL    | auto_increment |
| hn       | text       | NO   | MUL |         |                |
| sa       | text       | NO   | MUL |         |                |
| la       | text       | NO   | MUL |         |                |
| lp       | text       | NO   |     |         |                |
| ra       | text       | NO   | MUL |         |                |
| rp       | text       | NO   |     |         |                |
| state    | text       | NO   |     |         |                |
| datetime | text       | NO   |     |         |                |
 ---------- ------------ ------ ----- --------- ----------------
9 rows in set (0.00 sec)
icon Steve Kemp at 19:13 on 17 September 2008

You could be right Chris.

The system is a little heavily loaded, but even so I'd like it to be faster. I'm just a little unsure where to start..

icon Scott Ashcroft at 19:51 on 17 September 2008
If it's a frequent operation why not add some triggers to keep track of the number of rows.
icon David Ulevitch at 19:53 on 17 September 2008
Something is terribly wrong, the row count is stored in the meta data and the mysql query parser knows that.
icon Ronaldo at 19:56 on 17 September 2008
Did you try count(1). I dont't remember exactly now, but I think it is supposed to be faster than count(*).
I am not sure that would solve your problem, since you are using count(id)
icon Steve Kemp at 19:58 on 17 September 2008

A collection of triggers might be a good solution. Thanks.

I'm going to tweak the memory settings a little first to see if that helps improve performance generally - it seems that many operations are taking longer than they should, but this is a pathological case.

Obviously something is screwed up but I'm not too familiar with these kind of problems...

icon Steve Kemp at 19:59 on 17 September 2008

Scott: COUNT(1) is about the same, and what I'm doing. I know that is faster than COUNT(*).

David - I didn't realize the row-count was generally available in the meta-data, though I guess I might have guessed, that suggests something more serious is wrong with my database setup.

icon Giuseppe Sacco at 20:05 on 17 September 2008
If I remember right count(field) is slower compared to count(*) since the first one have to check that field is not null (so it will probably access the index and data) while the second one could avoid to access data. You should also "analyze table q_archive".
Would you try if anything changes in your case?
icon Steve Kemp at 20:09 on 17 September 2008

Giuseppe - I'm happy to try both ways. Once I've finished running " optimize table q_archive;" I'll post what the difference is.