International character sets and encodings are hard.

26 July 2013 21:50

Today I've made the 0.15 release of lumail, which has several fixups and cleanups.

The previous release included a rewrite of the scrolling code, courtesy of kain88-de. This release fixes a few corner cases in that update which caused empty messages/Maildirs to be highlighted - operating on such ghost-entries would cause a segfault. Oops.

I've received several more great contributions from 7histle and trou, and I'm very happy with the state of the code and the usefulness of the application.

The biggest outstanding issue is RFC 2047 header decoding, i.e. converting encoded Subject/To/From fields such as this one into readable text:

Subject: =?utf-8?Q?Blipfoto=20=2D=20Introducing=20the=20all=2Dnew=20Bli
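As a rough illustration of what that decoding involves, here is a minimal sketch in Python (not lumail's C++, and not vmime or mimetic), using the standard library's `email.header` module. The Subject shown above is cut off before its closing `?=`, so the sample values below are complete, made-up encoded-words of the same shape; `decode_field` is a hypothetical helper name.

```python
from email.header import decode_header, make_header


def decode_field(raw: str) -> str:
    """Turn RFC 2047 encoded-words in a header value into readable text.

    decode_header() splits the value into (bytes, charset) chunks and
    make_header() joins them back into a single Unicode string.
    """
    return str(make_header(decode_header(raw)))


# Hypothetical, complete encoded-words (assumed for illustration only).
q_encoded = "=?utf-8?Q?Blipfoto=20=2D=20Hello?="   # "Q" = quoted-printable style
b_encoded = "=?utf-8?B?w6lsw6h2ZQ==?="             # "B" = base64

print(decode_field(q_encoded))   # Blipfoto - Hello
print(decode_field(b_encoded))   # élève
```

A value with no encoded-words at all passes through unchanged, so the same call can be applied to every header field before display.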

This is annoying because I'm using mimetic for handling all MIME-related code, and this doesn't seem to offer the facilities that I need.

The current plan is to use the RFC-2047 handling from vmime, but I've fought with that library unsuccessfully for two days now - and a further complication is that the library is included in Squeeze/Sid, but not the stable release of Debian.

In conclusion I still regard the client as complete, because I'm using it exclusively and I rarely get "foreign" mails. But there is one more push required to fix all the outstanding bugs, which generally boil down to:

  • Decode headers properly.
  • Ensure all our input/output is in UTF-8.

Randomly I'm wondering if I can call out to Lua to do the header decoding. Add "on_header_field()" and display the results. So today I'll be looking at how sensible that is, probably not very.
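To make the idea concrete, here is a small Python sketch of the hook pattern. It is not lumail code: in lumail the callback would be a Lua function invoked from C++, and the `hooks` table and `display_header` function are hypothetical names. Only `on_header_field` comes from the idea above. The client asks the hook for the text to display and falls back to the raw value if no hook is defined or the hook fails.

```python
from email.header import decode_header, make_header
from typing import Callable, Dict

# Hypothetical hook table; stands in for functions defined in the user's script.
hooks: Dict[str, Callable[[str, str], str]] = {}


def display_header(name: str, raw_value: str) -> str:
    """Return the text to show for a header, letting a user hook rewrite it."""
    hook = hooks.get("on_header_field")
    if hook is None:
        return raw_value
    try:
        return hook(name, raw_value)
    except Exception:
        # A broken user script should not take the client down.
        return raw_value


def on_header_field(name: str, value: str) -> str:
    """Example user hook: decode RFC 2047 encoded-words."""
    return str(make_header(decode_header(value)))


hooks["on_header_field"] = on_header_field

print(display_header("Subject", "=?utf-8?Q?Hello=20world?="))   # Hello world
```

The trade-off is that every displayed header crosses the scripting boundary, which is part of why this may not turn out to be very sensible.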


Comments on this entry

Michael Stapelberg at 16:29 on 26 July 2013

I see the issues with emails are starting to trickle in. In my experience, this will only get worse. I was involved with sup-mail, very briefly its successor heliotrope, somewhat more with alot, and I am glad I don’t have to deal with all these issues any more :).

Don’t say nobody warned you, but best of luck with your mail client.

Steve Kemp at 16:42 on 26 July 2013
http://www.steve.org.uk/

I'm optimistic that, with the right framework for MIME and these kinds of encoding issues, most of the hard problems should already have been solved for me.

That might be naive, but so long as the client is "powerful" or "useful" enough for the differences to outweigh the drawbacks, I'm content.

Nux at 11:09 on 27 July 2013
http://www.nux.ro

Thanks for the updates!
RPMs for RHEL 6 and clones in my repo, e.g.
http://li.nux.ro/download/nux/dextop/el6/x86_64/lumail-0.15-1.el6.nux.x86_64.rpm

Marcos Dione at 07:26 on 29 July 2013
http://www.grulic.org.ar/~mdione/glob/

Well, actually you have to make sure that all your output is in UTF-8, but for the input you have to check its encoding. Most of the time you will find it in the headers (e.g. Content-Type: text/plain; charset=ISO-8859-1), but sometimes you will receive a mail from a badly implemented webmail and will have to guess it; there are libraries for that.