Detecting fraudulent signups?

Monday, 21 November 2016

I run a couple of different sites that allow users to sign up and use various services. In each of these sites I have some minimal rules in place to detect bad signups, but these are a little ad hoc, because the nature of "badness" varies on a per-site basis.

I've worked in a couple of places where there are in-house tests of bad signups, and these usually boil down to some naive, and overly-broad, rules:

  • Does the phone number's (international) prefix match the country of the user?
  • Does the postal address supplied even exist?

Some places penalise users based upon location too:

  • Does the IP address the user submitted from come from TOR?
  • Does the geo-IP country match the user's stated location?
  • Is the email address provided by a "free" provider?

At the moment I've got a simple HTTP-server which receives a JSON post of a new user's details, and returns "200 OK" or "403 Forbidden" based on some very simple criteria. This is modeled on the spam-detection service for blog comments that I use - something that is itself becoming less useful over time. (Perhaps time to kill that? A decision for another day.)

Unfortunately this whole approach is very reactive, as it takes human eyeballs to detect new classes of problems. Code can't guess in advance that it should block usernames which could collide with official ones, for example by refusing a username of "admin", "help", or "support".
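A minimal sketch of how such a rule-checker might be organised, in Python. The field names, the rule names, and the contents of both lists are assumptions made for illustration; they are not the rules I actually run. Each rule is a small function, and a signup is refused with a 403 if any rule fires:

```python
from dataclasses import dataclass

# Hypothetical rule data: illustrative values only.
RESERVED_USERNAMES = {"admin", "help", "support"}
FREE_EMAIL_DOMAINS = {"freemail.example"}


@dataclass
class Signup:
    username: str
    email: str
    stated_country: str
    geoip_country: str


def reserved_username(signup):
    """True if the username collides with an official-looking name."""
    return signup.username.strip().lower() in RESERVED_USERNAMES


def free_email(signup):
    """True if the address belongs to a listed "free" provider."""
    domain = signup.email.rpartition("@")[2].lower()
    return domain in FREE_EMAIL_DOMAINS


def country_mismatch(signup):
    """True if the geo-IP country differs from the stated country."""
    return signup.stated_country.upper() != signup.geoip_country.upper()


# Adding a newly discovered class of badness means appending one entry here.
RULES = [
    ("reserved-username", reserved_username),
    ("free-email", free_email),
    ("country-mismatch", country_mismatch),
]


def judge(signup):
    """Return (HTTP status, names of the rules that fired)."""
    failed = [name for name, rule in RULES if rule(signup)]
    return (403 if failed else 200), failed
```

Returning the names of the rules that fired, rather than a bare status, makes it easier to see later which rules are too broad.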

I'm certain that these systems have been written a thousand times, as I've seen at least five such systems, and they're all very similar. The biggest flaw in all these systems is that they try to classify users in advance of them doing anything. We're trying to say "Block users who will use stolen credit cards", or "Block users who'll submit spam", by correlating that behaviour with other things. In an ideal world you'd judge users only by the actions they take, not how they signed up. And yet .. it is better than nothing.

For the moment I'm continuing to try to make the best of things; at least by centralising the rules for myself I cut down on duplicate code. I'll pretend I'm being cool, modern, and sexy, and call this a micro-service! (Ignore the lack of containers for the moment!)

| No comments

 

If your code accepts URIs as input..

Monday, 12 September 2016

There are many online sites that accept reading input from remote locations. For example a site might try to extract all the text from a webpage, or show you the HTTP-headers a given server sends back in response to a request.

If you run such a site you must make sure you validate the scheme you're given - and remember to do the same for any HTTP-redirects you're sent.
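A minimal sketch of that check in Python, using only the standard library. The allow-list and the function names are assumptions for illustration; the point is that the scheme is checked once on the URL the user submits, and again on every redirect target:

```python
import urllib.request
from urllib.parse import urlsplit

# Assumption: this site only ever needs to fetch web pages.
ALLOWED_SCHEMES = {"http", "https"}


def check_scheme(url):
    """Return the URL unchanged, or raise ValueError for a disallowed scheme."""
    scheme = urlsplit(url).scheme.lower()
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"refusing to fetch {scheme or 'scheme-less'} URL")
    return url


class CheckedRedirects(urllib.request.HTTPRedirectHandler):
    """Re-validate the scheme of every redirect target before following it."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        check_scheme(newurl)
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def fetch(url, timeout=10):
    """Fetch a user-supplied URL, refusing file:// and friends."""
    check_scheme(url)
    opener = urllib.request.build_opener(CheckedRedirects)
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

With this in place `fetch("file:///etc/passwd")` raises a ValueError before anything is opened.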

Really the issue here is a confusion between URL & URI.

The only time I ever communicated with Aaron Swartz was unfortunately after his death, because I didn't make the connection. I randomly stumbled upon the html2text software he put together, which had an online demo containing a form for entering a location. I tried the obvious input:

file:///etc/passwd

The software was vulnerable, read the file, and showed it to me.

The site gives errors on all inputs now, so it cannot be used to demonstrate the problem, but on Friday I saw another site on Hacker News with the very same input-issue, and it reminded me that there's a very real class of security problems here.

The site in question was http://fuckyeahmarkdown.com/, which allows you to enter a URL to convert to markdown - I found this via the Hacker News submission.

The following link shows the contents of /etc/hosts, and demonstrates the problem:

http://fuckyeahmarkdown.example.com/go/?u=file:///etc/hosts&read=1&preview=1&showframe=0&submit=go

The output looked like this:

..
127.0.0.1 localhost
255.255.255.255 broadcasthost
::1 localhost
fe80::1%lo0 localhost
127.0.0.1 stage
127.0.0.1 files
127.0.0.1 brettt..
..

In the actual output of '/etc/passwd' all newlines had been stripped. (Which I now recognize as being an artifact of the markdown processing.)

UPDATE: The problem is fixed now.

| 9 comments.

 

Using the compiler to help you debug segfaults

Friday, 5 August 2016

Recently somebody reported that my console-based mail-client was segfaulting when opening an IMAP folder, and then when they tried with a local Maildir-hierarchy the same fault was observed.

I couldn't reproduce the problem at all, as neither my development host (read "my personal desktop"), nor my mail-host had been crashing at all, both being in use to read my email for several months.

Debugging crashes with no backtrace, or real hint of where to start, is a challenge. Even when downloading the same Maildir samples I couldn't see a problem. It was only when I decided to see if I could add some more diagnostics to my code that I came across a solution.

My intention was to make it easier to receive a backtrace, by adding more compiler options:

  -fsanitize=address -fno-omit-frame-pointer

I added those options and my mail-client immediately started to segfault on my own machine(s), almost as soon as it started. Ultimately I found three pieces of code where I was allocating C++ objects and passing them to the Lua stack, a pretty fundamental part of the code, which were buggy. Once I'd tracked down the areas of code that were broken and fixed them the user was happy, and I was happy too.

It's interesting that I've been running for over a year with these bogus things in place, which "just happened" to not crash for me or anybody else. In the future I'll be adding these options to more of my C-based projects, as there seems to be virtually no downside.

In related news my console editor has now achieved almost everything I want it to, having gained:

  • Syntax highlighting via Lua + LPEG
  • Support for TAB completion of Lua-code and filenames.
  • Bookmark support.
  • Support for setting the mark and copying/cutting regions.

The only outstanding feature, which is a biggy, is support for Undo which I need to add.

Happily no segfaults here, so far..

| 2 comments.

 

A final post about the lua-editor.

Saturday, 23 July 2016

I recently mentioned that I'd forked Antirez's editor and added lua to it.

I've been working on it, on and off, for the past week or two now. It's finally reached a point where I'm content:

  • The undo-support is improved.
  • It has buffers, such that you can open multiple files and switch between them.
    • This allows "kilua *.txt" to work, for example.
  • The syntax-highlighting is improved.
    • We can now change the size of TAB-characters.
    • We can now enable/disable highlighting of trailing whitespace.
  • The default configuration-file is now embedded in the body of the editor, so you can run it portably.
  • The keyboard input is better, allowing multi-character bindings.
    • The following are possible, for example ^C, M-!, ^X^C, etc.

Most of the obvious things I use in Emacs are present, such as the ability to customize the status-bar (right now it shows the cursor position, the number of characters, the number of words, etc, etc).

Anyway I'll stop talking about it now :)

| No comments

 

Adding lua to all the things!

Thursday, 14 July 2016

Recently Antirez made a post documenting a simple editor in 1k lines of pure C. The post was interesting in itself, and the editor is a cute toy because it doesn't use curses - instead using escape sequences.

The github project became very popular and much interesting discussion took place on Hacker News.

My interest was piqued because I've obviously spent a few months working on my own console based program, and so I had to read the code, see what I could learn, and generally have some fun.

As expected Salvatore's code is refreshingly simple, neat in some areas, terse in others, but always a pleasure to read.

Also, as expected, a number of forks appeared adding various features. I figured I could do the same, so I did the obvious thing in adding Lua scripting support to the project. In my fork the core of the editor is mostly left alone, instead code was moved out of it into an external lua script.

The highlight of my lua code is this magic:

  --
  -- Keymap of bound keys
  --
  local keymap = {}

  --
  --  Default bindings
  --
  keymap['^A']        = sol
  keymap['^D']        = function() insert( os.date() ) end
  keymap['^E']        = eol
  keymap['^H']        = delete
  keymap['^L']        = eval
  keymap['^M']        = function() insert("\n") end

I wrote a function invoked on every key-press, and use that to look up key-bindings. By adding a bunch of primitives to export/manipulate the core of the editor from Lua I simplified the editor's core logic, and allowed interesting facilities:

  • Interactive evaluation of lua.
  • The ability to remap keys on the fly.
  • The ability to insert command output into the buffer.
  • The implementation of copy/paste entirely in Lua.

All in all I had fun, and I continue to think a Lua-scripted editor would be a neat project - I'm just not sure there's a "market" for another editor.

View my fork here, and see the sample kilo.lua config file.

| No comments

 

I've been moving and updating websites.

Friday, 8 July 2016

I've spent the past days updating several of my websites to be "responsive". Mostly that means I open the site in Firefox then press Ctrl-alt-m to switch to mobile-view. Once I have the mobile-view I then fix the site to look good in a small space.

Because my general design skills are poor I've been fixing most sites by moving to bootstrap, and ensuring that I don't use headers/footers that are fixed-position.

Beyond the fixes to appearances I've also started rationalizing the domains, migrating content across to new homes. I've got a provisional theme setup at steve.fi, and I've moved my blog over there too.

The plan for blog-migration went well:

  • Set up a redirect from https://blog.steve.org.uk to https://blog.steve.fi/
  • Replace the old feed with a CGI script which outputs one post a day, telling visitors to update their feed.
    • This just generates one post, but the UUID of the post has the current date in it. That means it will always be fresh, and always be visible.
  • Updated the template/layout on the new site to use bootstrap.

The plan was originally to set up an HTTP-redirect, but I realized that this would mean I'd need to keep the redirect in-place forever, as visitors would have no incentive to fix their links, or update their feeds.

By adding the fake-RSS-feed, pointing to the new location, I am able to assume that eventually people will update, and I can drop the DNS record for blog.steve.org.uk entirely. Google already seems to have updated its spidering, and searching shows the new domain.
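A minimal sketch of such a "this feed has moved" CGI script, in Python. The titles, the wording, and the exact identifier format are assumptions for illustration; the one idea that matters is that the item's identifier contains today's date, so feed readers treat it as a new post each day:

```python
import datetime
from xml.sax.saxutils import escape

NEW_SITE = "https://blog.steve.fi/"


def moved_feed(today=None):
    """Build an RSS feed holding one item whose guid changes every day."""
    today = today or datetime.date.today()
    # Hypothetical guid format: any string containing the date would do.
    guid = f"feed-moved-{today.isoformat()}"
    site = escape(NEW_SITE)
    return f"""<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>This feed has moved</title>
    <link>{site}</link>
    <description>Please update your subscription.</description>
    <item>
      <title>This feed has moved</title>
      <link>{site}</link>
      <guid isPermaLink="false">{escape(guid)}</guid>
      <description>Please subscribe via {site} instead.</description>
    </item>
  </channel>
</rss>
"""


if __name__ == "__main__":
    # A CGI response is headers, a blank line, then the body.
    print("Content-Type: application/rss+xml; charset=utf-8")
    print()
    print(moved_feed(), end="")
```

Because the feed only ever contains that single item, subscribers see exactly one reminder per day rather than a growing backlog.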

| 2 comments.

 

So I've been busy.

Thursday, 30 June 2016

The past few days I've been working on my mail client which has resulted in a lot of improvements to drawing, display and correctness.

Since then I've been working on adding GPG-support. My naive attempt was to extract the signature and the appropriate body-part from the message, write them both to disk, and then validate via:

gpg --verify msg.sig msg

However that failed, and it took me a long time to work out why. I downloaded the source to mutt, which can correctly verify an attached-signature, then hacked lib.c to neuter the mutt_unlink function. That left me with a bunch of files inside $TEMPFILE one of which provided the epiphany.

A message which is to be validated is indeed written out to disk, just as I would have done, as is the signature. Ignoring the signature the message is interesting:

Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On Mon, 27 Jun 2016 08:08:14 +0200

...

--=20
Bob Smith

The reason I'd failed to validate my message-body was that I'd already decoded the text of the MIME-part, and I'd also lost the two leading header lines, "Content-Type: .." and "Content-Transfer-Encoding: ..". I'm currently trying to work out if it is possible to get access to the raw MIME-part text in GMime.
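The difference is easy to see in a few lines of Python. This is only an illustration with a made-up part modeled on the one above, not code from the mail client: the signature was made over the raw part (headers plus the still-encoded body), so the decoded text is simply a different sequence of bytes:

```python
import hashlib
import quopri

# Illustrative raw MIME part, as it would sit inside the signed message.
raw_part = (
    b"Content-Type: text/plain; charset=UTF-8\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"Hello\r\n"
    b"\r\n"
    b"--=20\r\n"
    b"Bob Smith\r\n"
)

# What my naive attempt wrote to disk: the body only, with the
# quoted-printable encoding already undone ("=20" becomes a space).
_headers, _, encoded_body = raw_part.partition(b"\r\n\r\n")
decoded_body = quopri.decodestring(encoded_body)


def digest(data):
    """Hex SHA-256 of the bytes, standing in for what gets signed."""
    return hashlib.sha256(data).hexdigest()


if __name__ == "__main__":
    print("raw part    :", digest(raw_part))
    print("decoded body:", digest(decoded_body))
```

The two digests differ, which is why a signature made over the first can never validate against the second.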

Anyway that learning aside I've made a sleazy hack which just shells out to mimegpg, and this allows me to validate GPG signatures! That's not the solution I'd prefer, but that said it does work, and it works with inline-signed messages as well as messages with application/pgp-signature MIME-parts.

Changing the subject now. I wonder how many people read to the end anyway?

I've been in Finland for almost a year now. Recently I was looking over websites and I saw that the domain steve.fi was going to expire in a few weeks. So I started obsessively watching it. Today I claimed it.

So I'll be slowly moving things from beneath steve.org.uk to use the new home steve.fi.

I also set up a mini-portfolio/reference site at http://steve.kemp.fi/ - which was a domain I registered while I was unsure if I could get steve.fi.

Finally now is a good time to share more interesting news:

  • I've been reinstated as a Debian developer.
  • We're having a baby.
    • Interesting times.

| 7 comments.

 

So I should document the purple server a little more

Wednesday, 15 June 2016

I should probably document the purple server I hacked together in Perl and mentioned in my last post. In short it allows you to centralise notifications. Send "alerts" to it, and when they are triggered they will be routed from that central location. There is only a primitive notifier included, which sends data to the console, but there are sample stubs for sending by email/pushover, and escalation.

In brief you create alerts by sending a JSON object via HTTP-POST. These objects contain a bunch of fields, but the two most important are:

  • id
    • A human-readable name for the alert. e.g. "disk-space", "heartbeat", or "unread-mail".
  • raise
    • When to raise the alert. e.g. "now", "+5m", "1466006086".

When an update is received any existing alert has its values updated, which makes heartbeat alerts trivial. Send a message with:

{ "id": "heartbeat", "raise": "+5m", .. }

The existing alert will be updated each time such a new event is submitted, which means that the time at which that alert will raise will be pushed back by five minutes. If you send this every 60 seconds then you'll get informed of an outage five minutes after your server explodes (because the "+5m" will have been turned into an absolute time, and that time will eventually be in the past - triggering a notification).

Alerts are keyed on the source IP which sent the submission and the id field, meaning you can send the same update from multiple hosts without causing any problems.
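A minimal sketch of that bookkeeping, in Python rather than the Perl of the real server. The function names and the in-memory dictionary are assumptions for illustration, and the "s" and "h" suffixes are my guess at what besides "+5m" might be accepted:

```python
import time

# In-memory store: (source_ip, id) -> alert fields plus "raise_at".
alerts = {}

# Assumption: only "+5m" appears above; the other suffixes are guesses.
UNITS = {"s": 1, "m": 60, "h": 3600}


def parse_raise(value, now):
    """Turn "now", "+5m" or an epoch string into an absolute timestamp."""
    if value == "now":
        return now
    if value.startswith("+"):
        return now + int(value[1:-1]) * UNITS[value[-1]]
    return int(value)


def submit(source_ip, alert, now=None):
    """Create or update the alert keyed on the sender's IP and the id."""
    now = int(time.time()) if now is None else now
    entry = alerts.setdefault((source_ip, alert["id"]), {})
    entry.update(alert)
    # Re-submitting replaces the absolute time, pushing the alert back.
    entry["raise_at"] = parse_raise(alert["raise"], now)
    return entry


def due(now=None):
    """Keys of every alert whose raise time is now in the past."""
    now = int(time.time()) if now is None else now
    return sorted(key for key, a in alerts.items() if a["raise_at"] <= now)
```

A heartbeat is then just `submit(ip, {"id": "heartbeat", "raise": "+5m"})` called once a minute; the key only shows up in `due()` once the submissions stop for five minutes.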

Notifications can be viewed in a reasonably pretty Web UI, so you can clear raised-alerts, see the pending ones, and suppress further notifications on something that has been raised. (By default notifications are issued every sixty seconds, until the alert is cleared. There is support for only raising an alert once, which is useful when you deliver events via a service such as pushover, which repeats notifications itself.)

Anyway this is a fun project, which is a significantly simplified and less scalable version of a project which is open-sourced already and used at Bytemark.

| 4 comments.

 

A mixed weekend

Monday, 30 May 2016

These past seven days have been a little mixed:

  • I updated documentation on my simple object store.
  • I created a simplified alerting system.
    • Heavily inspired by something we use at work.
    • My version is much much simpler, but still useful enough to alert me of outages (via heartbeats) and unread email. (Both of which are sent via pushover notifications.)
  • I bought a pair of cheap USB "game controllers"
    • And have spent several hours playing SNES games such as Bomberman 2, and Super Mario Brothers 3.
    • I'm using mednafen, as it supports cheats, fullscreen, sound, and is pretty easy to drive.

Finally I spent the tail end of the weekend being a little red, sore, and itchy. I figured this was a surprising outbreak of Dyshidrosis on my hands, and eczema on my body. Instead I received a diagnosis of Scarlet Fever. So now I feel somewhat Dickensian!

Apparently this infection is on the rise!

| 2 comments.

 

Accidental data-store .. is go!

Thursday, 19 May 2016

A couple of days ago I wrote:

The code is perl-based, because Perl is good, and available here on github:

..

TODO: Rewrite the thing in #golang to be cool.

I might not be cool, but I did indeed rewrite it in golang. It was quite simple, and a simple benchmark of uploading two million files, balanced across 4 nodes worked perfectly.

https://github.com/skx/sos/

| 2 comments.