If this goes well I have a new blog engine

17 September 2014 21:50

Assuming this post shows up then I'll have successfully migrated from Chronicle to a temporary replacement.

Chronicle is awesome, and despite a lack of activity recently it is not dead. (No activity because it continued to do everything I needed for my blog.)

Unfortunately chronicle does have a problem: its performance has gradually become more and more vexing as the number of entries I have has grown.

When chronicle runs it:

  • Reads each post into a complex data-structure.
  • Walks this structure multiple times.
  • Finally outputs a whole bunch of posts.

In the general case you rebuild a blog because you've made a new entry, or received a new comment. There is some code which tries to use memcached for caching, but in general chronicle just isn't fast, and it is certainly memory-bound if you have a couple of thousand entries.

Currently my test data-set contains 2000 entries and to rebuild that from a clean start takes around 4 minutes, which is pretty horrific.

So what is the alternative? What if you could parse each post once, add it to an SQLite database, and then use that for writing your output pages? Instead of the complex in-RAM data-structure and the need to parse a zillion files, you'd have a simple, standard SQL structure you could use to build a tag-cloud, an archive, etc. If you store the contents of the parsed post along with the mtime of the source file, you can update it if the entry changes in the future. That matters to me, as I sometimes make typos which I only spot once I've run make steve on my blog sources.
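As a rough illustration of the idea, here is a minimal Python sketch. It is not the code from the new engine: the table layout, the toy "Title:"/"Tags:" header format, and the function names open_db, parse_post, import_post and tag_cloud are all assumptions made up for the example.

```python
import os
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) schema if needed."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        " file TEXT PRIMARY KEY, mtime REAL, title TEXT, body TEXT)"
    )
    db.execute("CREATE TABLE IF NOT EXISTS tags (file TEXT, tag TEXT)")
    return db


def parse_post(path):
    """Toy parser (an assumption): header lines, a blank line, then the body."""
    with open(path, encoding="utf-8") as fh:
        head, _, body = fh.read().partition("\n\n")
    meta = {}
    for line in head.splitlines():
        key, _, value = line.partition(":")
        meta[key.strip().lower()] = value.strip()
    tags = [t.strip() for t in meta.get("tags", "").split(",") if t.strip()]
    return meta.get("title", ""), tags, body


def import_post(db, path):
    """Parse and store a post only if it is new or its mtime has changed.

    Returns True if the file was (re)parsed, False if it was skipped.
    """
    mtime = os.path.getmtime(path)
    row = db.execute("SELECT mtime FROM posts WHERE file = ?", (path,)).fetchone()
    if row is not None and row[0] == mtime:
        return False  # unchanged since the last import, nothing to do
    title, tags, body = parse_post(path)
    db.execute(
        "INSERT OR REPLACE INTO posts VALUES (?, ?, ?, ?)",
        (path, mtime, title, body),
    )
    # Replace the post's tags wholesale so removed tags do not linger.
    db.execute("DELETE FROM tags WHERE file = ?", (path,))
    db.executemany("INSERT INTO tags VALUES (?, ?)", [(path, t) for t in tags])
    db.commit()
    return True


def tag_cloud(db):
    """Return (tag, count) pairs, most used first."""
    return db.execute(
        "SELECT tag, COUNT(*) AS n FROM tags GROUP BY tag ORDER BY n DESC, tag"
    ).fetchall()
```

Calling import_post() for every source file is cheap on the second run, because unchanged files are skipped after a single stat and one indexed lookup. The tag-cloud then becomes a single GROUP BY query rather than a walk over every parsed post.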

Not surprisingly the newer code is significantly faster if you have 2000+ posts. If you've imported the posts into SQLite the most recent entries are updated in 3 seconds. If you're starting cold, parsing each entry, inserting it into SQLite, and then generating the blog from scratch the build time is still less than 10 seconds.

The downside is that I've removed features, though obviously nothing that I use myself. Most notably the calendar view is gone, as is the ability to use date-based URLs. Less seriously there is only a single theme, which is the one used on this site.

In conclusion, I wrote something last night which is a stepping stone between the current chronicle and chronicle2, which will appear in due course.

PS. This entry was written in markdown, just because I wanted to be sure it worked.

9 comments

 

Comments on this entry

Steve Kemp at 17:23 on 17 September 2014
http://steve.org.uk/.

This is a test comment.

Inigo at 17:47 on 17 September 2014
http://inigo.me

For a time I used my own blog engine, written in bash, as a hobby exercise.

I hit the same issue after adding elements (functions to generate breadcrumb navigation, tags, etc.): too many re-walks and re-computations.

As I was limited to bash, I did not consider SQLite, and took a different approach from yours: a file with the sums of the source files.

If there is no change in the source, it's not recompiled.

Only a few elements were recompiled when a file did change (edit, deletion or new article). That is, the file itself, the archives (by date, and by tag) and the sitemap.xml.

Well, and the needed sums, in the sums file.

To force a compilation of the full site, I just deleted the sums file.
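The sums-file approach described above might look roughly like the following. This is a hedged sketch in Python rather than the original bash, and the JSON sums file, the use of SHA-256, and the names file_sum and changed_sources are assumptions for illustration only.

```python
import hashlib
import json
import os


def file_sum(path):
    """Return the SHA-256 hex digest of a file's contents."""
    with open(path, "rb") as fh:
        return hashlib.sha256(fh.read()).hexdigest()


def changed_sources(sources, sums_file):
    """Return the sources whose checksum differs from the recorded one.

    A missing sums file means every source counts as changed, so
    deleting that file forces a full rebuild.
    """
    old = {}
    if os.path.exists(sums_file):
        with open(sums_file, encoding="utf-8") as fh:
            old = json.load(fh)
    new = {path: file_sum(path) for path in sources}
    # Record the fresh sums for the next run.
    with open(sums_file, "w", encoding="utf-8") as fh:
        json.dump(new, fh, indent=2)
    return [path for path in sources if old.get(path) != new[path]]
```

Only the files returned by changed_sources() need recompiling, plus the shared pages (archives, sitemap.xml) whenever that list is non-empty. The sketch writes the new sums immediately; a real build would probably write them only after the rebuild succeeds. It also does not notice sources that have been deleted.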

I took my site offline not so long ago, but the improvements from this approach were high, though I never tested with two thousand entries. I only tested with a few faked entries to check the yearly/monthly archives.

Have fun with chronicle2. SQLite rocks.

Steve Kemp at 17:56 on 17 September 2014
http://steve.org.uk/.

Yes, SQLite is awesome. I've used it in the past as a temporary store for other data, in asql, my Apache logfile utility, and it worked well there.

Avoiding the need to rebuild pages that shouldn't have changed is an obvious optimization and with this new code it is working incredibly well.

There is a potential concern that tags on a new entry might not get included, but beyond that I think I've covered all the cases except for deletion of a blog entry. (In the case where a post has been published and then later deleted I think it is reasonable to insist the user runs "rm -rf output/" and rebuilds from a clean slate.)

Anonymous at 18:35 on 17 September 2014

Why not just use ikiwiki?

Steve Kemp at 18:37 on 17 September 2014
http://steve.org.uk/.

Although ikiwiki could do the job, and is themable, I still feel that a dedicated blog engine is better than adapting something else.

Nanoc, jekyll, etc, all cater to this niche, and don't seem to struggle, for example.

Anonymous at 01:39 on 18 September 2014

> dedicated blog engine

I'm curious what you mean by this, given that blogging was one of the main use cases for which ikiwiki was designed. It is a blog engine.

I'm not trying to push you in another direction; I'm genuinely asking what features ikiwiki doesn't have that you need.

Steve Kemp at 06:26 on 18 September 2014
http://steve.org.uk/.

It isn't a matter of features, it's a matter of purpose.

I suspect that you're right: ikiwiki can probably do anything I want. But what I want is not a wiki that can be used in a flexible fashion, but something that can generate a blog:

  • Simple to theme.
  • Simple to update.
  • Simple to allow users to comment on.

ikiwiki is a wiki compiler. Almost every blog I've seen using it looks more like a wiki than a blog. Even nicely themed sites still have a non-blog feel.

I suspect I'm not being terribly convincing here, but I see a considerable difference between something that is capable of doing blogs (+ more) and something that solely does blogs, and builds the "standard links" that people would expect (tags, archive, etc).

unuseless at 08:52 on 22 September 2014

Have you seen http://github.com/jgm/yst from pandoc fame? Also a static site generator using sqlite.

Steve Kemp at 10:04 on 23 September 2014
http://steve.org.uk/.

I've not seen yst before, but it does look interesting.

The main reason for not switching to anything else was that I've obviously got lots of entries written, and comments stored, so I wanted something compatible with that existing system.

Happily it only took a couple of days to go from prototype to replacement, so I'm content to leave it at that for now.