Mail-Scanning Blog

A slightly cleaner experience

31st May 2009

Towards the end of this month I've been working on improving the navigation of the mail-scanning website.

This work is now complete, and the website should be easier to navigate.

There is always room for improvement though, and over the coming days & weeks I'll be looking at improving the quarantine area.

Tags: user interface.

 

A new record on SPAM rejection

20th April 2009

Over the past few days our rejection rates have reached a new high. We're blocking more junk mail, virus mail, and spam mail than ever before.

Over the previous 30 days we've rejected just short of 7 million emails, saving our users time, effort, and money.

The current rejection rates break down like this:

Time Period   Rejected Mail Count
30 Days       6944660
1 Day         231488
1 Hour        9645
1 Minute      160

In brief, we've rejected an average of 160 messages a minute, non-stop, for the past 30 days. That's quite an impressive feat, as I'm sure you'd agree if you were receiving that amount of unwanted email!
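The per-day, per-hour, and per-minute figures follow directly from the 30-day total. A minimal sketch of the arithmetic, assuming the averages are rounded down (the function name is made up for illustration):

```python
# Average rejection rates derived from the 30-day total quoted above.
TOTAL_REJECTED_30_DAYS = 6944660


def average_rates(total, days=30):
    """Return average rejections per day, hour, and minute, rounded down."""
    per_day = total // days
    per_hour = total // (days * 24)
    per_minute = total // (days * 24 * 60)
    return per_day, per_hour, per_minute


if __name__ == "__main__":
    per_day, per_hour, per_minute = average_rates(TOTAL_REJECTED_30_DAYS)
    print(per_day, per_hour, per_minute)  # 231488 9645 160
```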

Tags: quarantine, statistics.

 

Improved throughput due to lack of repeated IO

21st March 2009

Recently I noticed that one of our MX machines was struggling to perform as well as it had done in the past, and after a lot of debugging I determined it was IO-bound.

The ultimate cause of the slowdown was repeated disk access which didn't need to be made, and this has now been fixed.

Each message arriving at our server has a number of tests applied to it. Some of these tests are very lightweight, for example looking at the IP address which made the connection, whilst others require access to the complete incoming message body and are more resource-intensive.

The tests which require access to the complete incoming message include things like virus scanning and Bayesian testing.

What should happen when a test requires access to the message body is:

  1. Message is spooled to a temporary file on disk.
  2. Message is tested.

Then the next test which requires access to the body of the message can use the file which is already present on the disk (this spooled file is removed when the remote server disconnects).

Unfortunately one of the tests I was performing was making a second copy of the message on disk, rather than using the pre-existing copy.

The extra overhead of spooling to disk, having the command read it, and then deleting the output and the copy of the message was just enough to slow the system down.

I've now updated things so that all the tests which require access to the message body share a single temporary copy on disk, and that copy is only created if the lightweight tests all pass and the message looks good.

(When filtering for spam the general practice is to perform the lightweight tests first - if they flag a message as spam then the heavier tests don't even need to be executed.)
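The "spool once, reuse the copy" idea can be sketched roughly as follows. This is an illustration only, not the service's actual code: the IncomingMessage class, the is_spam function, and the shape of the test callables are all assumptions made for the example.

```python
import os
import tempfile


class IncomingMessage:
    """Hypothetical wrapper around one incoming message."""

    def __init__(self, client_ip, body):
        self.client_ip = client_ip
        self.body = body          # raw message bytes
        self._spool_path = None   # set the first time a test needs the body
        self.spool_count = 0      # how many times the body was written to disk

    def spool_path(self):
        """Write the body to disk once; later callers reuse the same file."""
        if self._spool_path is None:
            fd, path = tempfile.mkstemp(prefix="msg-")
            with os.fdopen(fd, "wb") as handle:
                handle.write(self.body)
            self._spool_path = path
            self.spool_count += 1
        return self._spool_path

    def cleanup(self):
        """Remove the spooled copy, e.g. when the remote server disconnects."""
        if self._spool_path is not None:
            os.remove(self._spool_path)
            self._spool_path = None


def is_spam(message, light_tests, heavy_tests):
    """Run the cheap tests first; only touch the disk if they all pass."""
    # Lightweight tests receive the message object (e.g. to check the IP).
    if any(test(message) for test in light_tests):
        return True
    # Heavy tests receive the path of the single shared spool file.
    return any(test(message.spool_path()) for test in heavy_tests)
```

Because spool_path() remembers the file it created, every heavy test reads the same copy, and a message rejected by a lightweight test never reaches the disk at all.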

Tags: performance.

 

Sending bounces is now optional

4th March 2009

Our service receives incoming email and does one of two things to it:

  • Decides that it is valid mail, and forwards it to the intended recipient.
  • Decides that the mail is spam, and rejects it.

When mail is rejected it is stored in the quarantine area for the relevant domain and a bounce is generated. The intention here is two-fold:

  • The sender receives a bounce, so they know their message was not received. (This allows them to raise a support ticket with us, and we can then force the delivery of the message.)
  • Because the "rejected" message is also available in the quarantine, the domain owner can view the message and choose to deliver it themselves.

However, there are cases where sending bounces could cause confusion and problems, so there is now the facility to reject messages silently if you wish.

This means a rejected message will still be stored in the quarantine, but the sender will not receive a bounce message. Essentially the message will be silently quarantined and then discarded.

In general domain owners will probably elect to send out bounces, and this is the default behaviour. But for some users, and some domains, this new feature will be useful.
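The behaviour described above amounts to one per-domain switch. A minimal sketch, assuming invented names throughout (DomainSettings, send_bounces, and handle_rejected are not the service's real identifiers):

```python
from dataclasses import dataclass


@dataclass
class DomainSettings:
    """Hypothetical per-domain settings; bounces stay on by default."""
    send_bounces: bool = True


@dataclass
class RejectionResult:
    quarantined: bool
    bounce_sent: bool


def handle_rejected(message_id, settings, quarantine, outbox):
    """Always quarantine a rejected message; bounce only if enabled."""
    quarantine.append(message_id)
    if settings.send_bounces:
        outbox.append(message_id)
    return RejectionResult(quarantined=True, bounce_sent=settings.send_bounces)
```

Either way the message ends up in the quarantine; the setting only controls whether the sender hears about it.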

This change was suggested by one of our users, and site/facility suggestions are always appreciated.

Tags: bounces, new-features, quarantine.

 

You may now whitelist based upon subjects

29th January 2009

I've overhauled the whitelisting user interface and facilities, so that it is now possible to whitelist emails based upon their subject.

If you visit your domain control panel you'll see the whitelisting options are now broken down into separate tabs, thanks to jQuery, and there is a "subject" section.

If, as an example, you were to enter:

steve-kemp
allow-me

Any incoming email addressed to your domain with the string "steve-kemp" or the string "allow-me" in its subject will be allowed through, even if it wouldn't have been previously. (Assuming it was addressed to a valid recipient.)

(The values should be entered one per line, and are actually treated as case-insensitive regular expressions.)
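To show what "one case-insensitive regular expression per line" means in practice, here is a small sketch using Python's re module. The function names are made up, and matching anywhere in the subject (rather than only at the start) is an assumption based on the description above.

```python
import re


def compile_whitelist(text):
    """Compile one case-insensitive pattern per non-empty line."""
    return [
        re.compile(line.strip(), re.IGNORECASE)
        for line in text.splitlines()
        if line.strip()
    ]


def subject_is_whitelisted(subject, patterns):
    """True if any pattern matches anywhere in the subject."""
    return any(pattern.search(subject) for pattern in patterns)


if __name__ == "__main__":
    patterns = compile_whitelist("steve-kemp\nallow-me\n")
    print(subject_is_whitelisted("Re: ALLOW-ME please", patterns))
    print(subject_is_whitelisted("Cheap pills", patterns))
```

Since the entries are regular expressions, characters such as "." or "*" have their special meaning and need escaping if you want them matched literally.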

I hope this makes a nice addition to the existing whitelisting options.

Tags: new-features, user interface, whitelisting.

 

An additional MX machine is being introduced

27th January 2009

We've rented and configured an additional MX machine, to cope with the increased volume of mail we're processing.

If you're running any kind of MX security restrictions then please update your ACLs to allow connections from this new machine (which is listed on our MX list):

Hostname                      IP Address
incoming3.mail-scanning.com   91.121.26.217

With the introduction of this machine we now have MX boxes located in:

  • Chicago
  • Manchester
  • The Netherlands
  • France

Isn't redundancy wonderful?!

Tags: mx.

 

Testing blog and forum comments for spam

7th January 2009

Although this isn't strictly related to the mail-scanning business I've been interested in collecting, analysing, and handling different types of spam for a while.

A couple of websites that I maintain recently started succumbing to an increasing amount of spam comments, so I decided to set up a centralised location to test and reject them.

This is the idea behind blogspam.net - a simple service which will allow you to test forum, blog, and other types of comments, for spam characteristics.

Currently the service is a little basic, but there are plugins available for a couple of packages:

  • Drupal Plugin
  • Wordpress Plugin

The statistics I've collected are pretty horrific. In one 24-hour period I've seen:

40/681 94.45% SPAM

That is 40 "good" comments and 681 spam comments, or 94% spam. I had no idea it was that bad.
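The percentage is simply the spam count divided by the total number of comments seen. As a quick check of the figure (the function name is invented for the example):

```python
def spam_percentage(good, spam):
    """Share of all comments that were spam, as a percentage."""
    return 100.0 * spam / (good + spam)


if __name__ == "__main__":
    # 40 good comments and 681 spam comments, as quoted above.
    print(f"{spam_percentage(40, 681):.2f}%")  # 94.45%
```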

Tags: blog, blogspam, general.

 

An unexpected outage

25th December 2008

Today our service suffered an outage lasting from approximately 01:00 to 19:05.

The cause of the outage was the corruption of our quarantine storage indexes - which were ultimately broken by a mistake that I made.

Unfortunately shortly after causing this corruption I realised that the backups I had maintained were not 100% complete. (Now that was a scary moment!)

I restored the available database contents from backups, but these backups were missing the index of all quarantined mail. The quarantine index had to be rebuilt from scratch using the actual quarantined emails which were stored offsite. (Our master machine, and backup MXs only store mails locally for five days. Rejected emails for the previous thirty days are stored offsite.)

The downtime was so protracted because the recovery process of transferring all the emails to the master machine, parsing them all, and then building a fresh index took approximately 15 hours to complete.
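Rebuilding an index from the stored messages boils down to walking the files, parsing each one, and recording a row per message. The sketch below shows the general shape using Python's standard email package. The directory layout, the function name, and the fields kept in each row are assumptions for illustration; the real index and its schema are not described here.

```python
import os
from email import policy
from email.parser import BytesParser


def rebuild_index(quarantine_dir):
    """Walk the stored messages and build one index row per file."""
    parser = BytesParser(policy=policy.default)
    index = []
    for root, dirs, files in os.walk(quarantine_dir):
        dirs.sort()  # visit sub-directories in a stable order
        for name in sorted(files):
            path = os.path.join(root, name)
            with open(path, "rb") as handle:
                # Only the headers are needed for the rows built below.
                msg = parser.parse(handle, headersonly=True)
            index.append({
                "path": path,
                "sender": str(msg["From"] or ""),
                "recipient": str(msg["To"] or ""),
                "subject": str(msg["Subject"] or ""),
            })
    return index
```

Every message has to be opened and parsed, so a rebuild like this is bound by how fast the files can be transferred and read, which fits with a recovery that takes hours for thirty days of mail.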

The Good News

For the duration of the outage all mail continued to be processed correctly:

  • SPAM was still rejected, as expected.
  • Good mail was still delivered.

The outage was entirely restricted to:

  • The storage and display of quarantined emails.
  • The per-domain statistics.
  • The global statistics.

In short, I think most users wouldn't have noticed, although our status page did reflect the unavailability.

The Bad Points

Apart from spending too much of my Christmas Day picking up the pieces and rebuilding the indexes, there was really only one bad point to note:

  • Our database backups were inadequate.

At some point a few months ago it was realised that dumping the database to disk once an hour was causing too much locking - so we switched to a lazier backup strategy:

  • Perform a full dump of the database every day at 02:02. (Time chosen at random!)
  • Perform a dump of everything except the quarantine data once an hour.

Unfortunately my co-maintainer had not spotted that the hourly dumps excluded the quarantine data, and had disabled the full dump, seeing that it ran only once a day while other backups happened on an hourly basis. This meant that all the indexing information was completely lost.

After discussing this very carefully, we're going to continue to dump in stages, but we will avoid a repeat of this problem by having a well-defined internal backup policy:

  • A full database dump twice a day.
  • Hourly dumps of all tables except the quarantine indexes.
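A policy like the one above can be written down as a couple of small helpers that build the mysqldump command lines. This is a sketch under stated assumptions: the database name, table names, output paths, and the hours chosen for the full dumps are all hypothetical. Only the mysqldump options --result-file and --ignore-table are real.

```python
# Hypothetical hours for the two daily full dumps.
FULL_DUMP_HOURS = (2, 14)


def full_dump_command(database, outfile):
    """Command line for a full dump of every table."""
    return ["mysqldump", f"--result-file={outfile}", database]


def hourly_dump_command(database, outfile, skip_tables):
    """Command line for a dump that leaves out the named tables."""
    cmd = ["mysqldump", f"--result-file={outfile}"]
    cmd += [f"--ignore-table={database}.{table}" for table in skip_tables]
    cmd.append(database)
    return cmd


def dumps_for_hour(hour):
    """Which dumps should run at this hour of the day (0-23)."""
    jobs = ["hourly"]
    if hour in FULL_DUMP_HOURS:
        jobs.append("full")
    return jobs
```

Keeping the schedule in one place makes it harder for one maintainer to disable a job without noticing that nothing else covers the same tables.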

Tags: downtime, mysql, quarantine.

 

Advertising & Referrals

21st December 2008

Our service continues to work very nicely for our users, and I'm continuing to let things tick over as they have done for the past few months, making only minimal changes to improve the capture rates as opportunities present themselves.

A recent request for a referral program did get me thinking about advertising again though.

There are two reasons for my current "non-advertising" practice:

  • I'm not 100% sure how and where I should advertise this service.
  • I don't think I really need to.

The latter point is the more interesting. I've seen current users recommend our service, without any real prompting, on a couple of websites I frequent. I think that there's a lot to be said for letting things work on word of mouth, because it is a very real form of advertising.

If people genuinely feel willing to say nice things about you, your product, or your service without you bribing them then you're doing well.

If you have to resort to massive expenditure, slogans, and marketing rather than actually making your product work there has to be something wrong there.

Perhaps I'm not really cut out for big business, but I like the way things are going. I like the users we've got, and I'm sure I'd like their friends too!

If there is any interest in a referral program I'd be happy to hear it though - say 50% off the cost of hosting for a domain, for each new user domain signed up.

Drop us a line, or leave a comment if interested.

Tags: advertising, referrals.

 

A new limit on incoming message sizes

31st October 2008

I was recently asked about the maximum message size that our service could cope with.

Initially I believed there to be no limit in place, because I'd never explicitly configured one. However, under testing it became apparent that a size limit of 50MB had been in place since our launch.

I've increased this limit from 50MB to 300MB - although I hope few people will send messages of that size. Best practice for distributing large files is to use HTTP, or FTP, or something other than email.

There are several services that will allow you to email links to files and host them for you such as these two:

  • YouSendIt
  • Send Space

When you send a large message there are things to be aware of - the problems you're most likely to experience are:

  • Your mailserver has a limit on the size of outgoing messages.
  • The recipient's mailserver has a limit on the size of incoming messages.

Although our service has a 300MB limit in place, which you're unlikely to ever hit, your sending machine and the mailserver of your recipient may not be so generous.
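Put another way, the largest message that will get through is set by the smallest limit along the path. A tiny sketch of that reasoning, where only the 300MB figure comes from this post and the other two limits are invented examples:

```python
MB = 1024 * 1024


def effective_limit(*limits):
    """Smallest known limit along the delivery path, or None if none is known."""
    known = [limit for limit in limits if limit is not None]
    return min(known) if known else None


def can_deliver(size, *limits):
    """True if a message of this size fits within every known limit."""
    limit = effective_limit(*limits)
    return limit is None or size <= limit


if __name__ == "__main__":
    # Hypothetical: sender allows 25MB, our service 300MB, recipient 10MB.
    print(effective_limit(25 * MB, 300 * MB, 10 * MB) // MB)
    print(can_deliver(50 * MB, 25 * MB, 300 * MB, 10 * MB))
```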

In short, you can send a large file by email most of the time, but it probably isn't the distribution method you should be choosing.

Tags: general, message sizes.

 

© 2007-2009 Mail-Scanning.com