node.js is kicking me

10 September 2013 21:50

Today I started hacking on a re-implementation of my BlogSpam service - which tests whether incoming comments are SPAM or HAM - in node.js (blogspam.js).

The current API uses XML::RPC and a Perl server, along with a list of plugins, to do the work.

Having had some fun and success with the HTTP+JSON mstore toy I figured I'd have a stab at making BlogSpam more modern:

  • Receive a JSON body via HTTP-POST.
  • Deserialize it.
  • Run the body through a series of JavaScript plugins.
  • Return the result back to the caller via HTTP status-code + text.
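
The four steps above might be sketched roughly as follows. This is a minimal sketch, not the actual blogspam.js code: `makeHandler`, `classify` and the shape of `verdict` are hypothetical names, and the mapping of SPAM to 403 and of unparseable JSON to 400 is an assumption.

```javascript
var http = require("http");

// classify(submission, callback) is a hypothetical stand-in for the plugin
// chain. It must call callback(err, verdict), where verdict.spam is a
// boolean and verdict.reason is a string.
function makeHandler(classify) {
  return function (req, res) {
    function reply(status, text) {
      res.writeHead(status, { "Content-Type": "text/plain" });
      res.end(text);
    }

    // Step 1: only accept HTTP POST.
    if (req.method !== "POST") {
      return reply(405, "ERR: POST only");
    }

    var body = "";
    req.setEncoding("utf8");
    req.on("data", function (chunk) { body += chunk; });
    req.on("end", function () {
      // Step 2: deserialize the JSON body.
      var submission;
      try {
        submission = JSON.parse(body);
      } catch (e) {
        return reply(400, "ERR: invalid JSON"); // assumed status code
      }

      // Step 3: hand the submission to the plugins.
      classify(submission, function (err, verdict) {
        if (err) { return reply(500, "ERR: " + err.message); }
        // Step 4: status code + text. 403 for spam is an assumption.
        if (verdict.spam) { return reply(403, "SPAM: " + verdict.reason); }
        reply(200, "OK: " + verdict.reason);
      });
    });
  };
}

// Build the server; the caller decides where it listens.
function createServer(classify) {
  return http.createServer(makeHandler(classify));
}
```

Calling `createServer(classify).listen(port)` with a port of your choosing would start it. The reply is only sent from inside the `classify` callback, which is the part that matters once plugins become asynchronous.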

In theory this is easy: I've hacked up a couple of plugins, and a Perl client to make a submission. But sadly the async stuff is causing me .. pain.

This is my current status:

shelob ~/git/blogspam.js $ node blogspam.js
Loaded plugin: ./plugins/10-example.js
Loaded plugin: ./plugins/20-ip.js
Loaded plugin: ./plugins/80-sfs.js
Loaded plugin: ./plugins/99-last.js
Received submission: {"body":"

This is my body ..

","ip":"109.194.111.184","name":"Steve Kemp"}
plugin 10-example.js said next :next
plugin 20-ip.js said next :next
plugin 99-last.js said spam SPAM: Listed in StopForumSpam.com

So we've loaded plugins, and each has been called. But the end result was "SPAM: Listed .." and yet the caller didn't get that result. Instead the caller got this:

shelob ~/git/blogspam.js $ ./client.pl
200 OK 99-last.js

The specific issue is that I iterate over every loaded plugin and wait for them to complete. Because they complete asynchronously, the plugin which should run last and just return "OK" has executed before the 80-sfs.js plugin (which makes an outgoing HTTP request).
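
One common way around this ordering problem is to start each plugin only from inside the previous plugin's callback, so a slow plugin holds up everything behind it. The sketch below is an illustration under assumptions, not the fix that ended up in blogspam.js: `runPlugins`, `fakePlugin`, the `testJSON(submission, done)` signature and the verdict strings are all hypothetical, and the outgoing HTTP request is simulated with a timer.

```javascript
// Assumed plugin shape: { name, testJSON(submission, done) }, where the
// plugin calls done(err, verdict) exactly once and verdict is either
// "next" (no opinion) or a final answer such as "OK" or "SPAM: reason".
function runPlugins(plugins, submission, finished) {
  var index = 0;

  function step() {
    if (index >= plugins.length) {
      // Every plugin said "next": nothing objected.
      return finished(null, { result: "OK", plugin: null });
    }
    var plugin = plugins[index++];
    plugin.testJSON(submission, function (err, verdict) {
      if (err) { return finished(err); }
      if (verdict === "next") { return step(); } // only now start the next one
      finished(null, { result: verdict, plugin: plugin.name });
    });
  }

  step();
}

// Stand-in plugins: each answers after delayMs, like a slow lookup would.
function fakePlugin(name, delayMs, verdict) {
  return {
    name: name,
    testJSON: function (submission, done) {
      setTimeout(function () { done(null, verdict); }, delayMs);
    }
  };
}

var demoPlugins = [
  fakePlugin("10-example.js", 0, "next"),
  fakePlugin("80-sfs.js", 50, "SPAM: Listed in StopForumSpam.com"),
  fakePlugin("99-last.js", 0, "OK")
];

runPlugins(demoPlugins, { ip: "109.194.111.184" }, function (err, outcome) {
  if (err) { throw err; }
  console.log(outcome.plugin + " " + outcome.result);
  // prints "80-sfs.js SPAM: Listed in StopForumSpam.com"
});
```

Because 99-last.js is never started until 80-sfs.js has answered, the slow plugin's verdict wins even though it takes the longest. The price is that plugins run strictly one after another.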

I've looked at async, I've looked at promises, but right now I can't get anything working.

Meh.

Surprise me with a pull request ;)


Comments on this entry

Jérémy Lal at 19:03 on 10 September 2013

I'm pretty sure what you do with "complete" variable is flawed...

Steve Kemp at 19:28 on 10 September 2013
http://www.steve.org.uk/

Yes, that's a symptom of the same problem.

The testJSON method can return before the actual testing has completed - so setting that variable, which should skip further plugins, just doesn't work.
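
To illustrate why a flag like this cannot stop later plugins, here is a guess at the general shape of the problem. It is a hypothetical reconstruction, not the real blogspam.js code: `flawedRun`, `asyncPlugin` and the callback signature are made up for the example.

```javascript
// The loop checks `complete` before starting each plugin, but the flag is
// only set inside a callback that runs after the whole loop has finished.
function flawedRun(plugins, submission, started) {
  var complete = false;
  plugins.forEach(function (plugin) {
    if (complete) { return; }        // still false on every iteration
    started.push(plugin.name);
    plugin.testJSON(submission, function (verdict) {
      if (verdict !== "next") { complete = true; } // too late to skip anything
    });
  });
}

// A plugin that answers asynchronously, on a later turn of the event loop.
function asyncPlugin(name, verdict) {
  return {
    name: name,
    testJSON: function (submission, done) {
      setTimeout(function () { done(verdict); }, 0);
    }
  };
}

var started = [];
flawedRun(
  [asyncPlugin("a.js", "spam"), asyncPlugin("b.js", "next"), asyncPlugin("c.js", "ok")],
  {},
  started
);
console.log(started.length); // prints 3: nothing was skipped
```

All three plugins have been started by the time the first one reports spam, so the flag never gets a chance to short-circuit the loop.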


Steven C. at 19:59 on 10 September 2013

This seems a slight misuse of HTTP status codes. If you deny someone access to your service, you'd want to return 403 Forbidden. If you processed their POST successfully, I think you should always return 200 OK. The body can then indicate the status with a human-readable description. Something more structured would also allow you to add more fields in the future (e.g. confidence score).

But then text/plain doesn't seem ideal. Since this is node.js I would think JSON is the obvious choice:

{ "code": 0, "comment": "plugin 10-example.js said..." }

If you can make promises about the output formatting (e.g. status code will always be the first field, and on the first line), the parser can be quite dumb and not have to understand JSON; something like:

# List context plus a capture group, so $code receives the digits themselves.
($code) = $response->decoded_content =~ m/(\d+),/;

You're still able to add more fields, or structure it more, e.g. maybe plugin output could be an array of strings.

Steve Kemp at 20:14 on 10 September 2013
http://www.steve.org.uk/

Good points, both of them.

The main XML::RPC API at the moment just returns a string of the form:

[OK|SPAM|ERR]:Text ..

I liked the idea of using the HTTP-status code because it is trivial to test against, and clearly I should return 500 on error, or 405 on invalid submission method.

I guess I should return JSON, as you suggest:

{"result":"SPAM|OK","reason":"Missing field 'foo'","version":"0.1" }

I will make the appropriate changes. I'm just getting the plugins written to see how hard it can be, then I'll fire incoming comments at both servers in parallel and compare CPU/RAM usage and the validity of the results.
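
A reply in the JSON shape proposed above could be assembled along these lines. This is only a sketch: `buildReply` and `VERSION` are hypothetical names, the field names follow the example in this comment, and the status mapping (500 for ERR, 200 for anything that was processed) is an assumption drawn from the discussion.

```javascript
var VERSION = "0.1";

// result is expected to be "OK", "SPAM" or "ERR"; reason is free text.
function buildReply(result, reason) {
  return {
    // Assumed mapping: a processed submission is 200 whether spam or not.
    status: result === "ERR" ? 500 : 200,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ result: result, reason: reason, version: VERSION })
  };
}

var reply = buildReply("SPAM", "Listed in StopForumSpam.com");
console.log(reply.status + " " + reply.body);
// prints 200 {"result":"SPAM","reason":"Listed in StopForumSpam.com","version":"0.1"}
```

A client can then test `result` after parsing the body, and further fields can be added later without breaking existing callers.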

Steven C. at 17:39 on 11 September 2013

Just curious if changing the output format inadvertently fixed whatever the problem was?

Steve Kemp at 17:45 on 11 September 2013
http://www.steve.org.uk/

No, the initial problem with async stuff was kindly fixed by Vincent Meurisse - he submitted a pleasant surprise!