Starting your own “black hole” email server

In this article, I’ll describe a customizable yet simple “black hole” email server that you can deploy to your personal site (as long as you have superuser/administrator access to the server backing your @domain site). The phrase “black hole” captures the server’s ability to asynchronously read hundreds of inbound emails per second, all without being able to send outbound email.

I’m going to start off by describing exactly how email works. To many people (myself included), email feels like ad-hoc, point-to-point communication:

However, it’s better represented as a service:

Note the clear distinction here: IMAP and POP3 are used by an email client to read emails from a server, while SMTP is used to send emails to a server (this includes both client-to-server and server-to-server communication).

Let’s work through a quick toy scenario. For this example, I’ll assume that you’ve already set up Microsoft Outlook on your local machine as the client, with Gmail as the target email service. When you refresh your inbox, Outlook fetches your email from imap.gmail.com (using IMAP) or pop.gmail.com (using POP3), depending on the settings that you’ve selected. When you compose a reply to one of your unread emails and hit send, Outlook communicates the body of the email and the appropriate headers via SMTP to smtp.gmail.com. Gmail’s SMTP servers then look up the destination email server associated with the recipient’s email address and forward the email to it, again via SMTP.
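The two halves of that scenario can be sketched with Python’s standard library. This is only an illustration, assuming Python 3: the function names, the example addresses, and the choice of port 587 with STARTTLS are assumptions made for the sketch, not a description of what Outlook does internally. Neither network function is actually called here.

```python
import imaplib
import smtplib
from email.message import EmailMessage


def build_reply(sender, recipient, subject, body):
    """Assemble the headers and body that a client hands to its SMTP server."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def fetch_unread(user, password, host="imap.gmail.com"):
    """Read half: pull unread messages from the server over IMAP."""
    raw_messages = []
    with imaplib.IMAP4_SSL(host) as imap:
        imap.login(user, password)
        imap.select("INBOX")
        _, data = imap.search(None, "UNSEEN")
        for num in data[0].split():
            _, parts = imap.fetch(num, "(RFC822)")
            raw_messages.append(parts[0][1])  # raw message bytes
    return raw_messages


def send(msg, user, password, host="smtp.gmail.com", port=587):
    """Send half: hand the message to the outgoing server over SMTP."""
    with smtplib.SMTP(host, port) as smtp:
        smtp.starttls()  # assumed: the server accepts STARTTLS on this port
        smtp.login(user, password)
        smtp.send_message(msg)


# Building the message needs no network access (addresses are placeholders).
reply = build_reply("me@example.com", "you@example.com", "Re: hello", "Thanks!")
```

The read and send paths use different protocols and different hosts, which is exactly the split described above.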

If you wanted to, you could specify a different outgoing mail server. Local SMTP servers, such as Sendmail and Postfix, can be used to send emails directly from your @domain machine, as long as port 25 has been opened by your ISP. Python has its own SMTP server implementation in the smtpd module (part of the standard library up to Python 3.11; it was removed in 3.12). A quick skim over the docs shows that it’s laughably simple to use:

# smtp_example.py

import asyncore
from smtpd import SMTPServer

class SMTPExample(SMTPServer, object):

    def __init__(self, *args, **kwargs):
        super(SMTPExample, self).__init__(*args, **kwargs)

    def process_message(self, peer, mailfrom, rcpttos, data):
        print(data)

if __name__ == "__main__":
    SMTPExample(("0.0.0.0", 25), None)
    asyncore.loop()

The process_message function is called every time the server intercepts an inbound or outbound message. In this case, the entire intercepted message is simply dumped to stdout – no other actions are taken. Needless to say, this is relatively useless; let’s improve the code so that it does something a bit more useful:


from email import message_from_string

....

    def process_message(self, peer, mailfrom, rcpttos, data):
        headers = message_from_string(data)
        body = headers.get_payload(decode=True).strip()
        print(body)
....

Now, instead of just printing the entire message (which includes headers), the code grabs the body of the email and prints it to stdout. For more information on how exactly this works, check out the email module in the Python docs.
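To see what message_from_string hands back, here’s a small sketch that parses a hand-written message. The raw text and addresses are made up for illustration, and the sketch assumes Python 3, where get_payload(decode=True) returns bytes rather than str.

```python
from email import message_from_string

# A minimal hand-written message: headers, a blank line, then the body.
RAW = (
    "From: alice@example.com\n"
    "To: courier@example.com\n"
    "Subject: Hello\n"
    "\n"
    "This is the body.\n"
)

msg = message_from_string(RAW)

sender = msg["from"]        # header lookup is case-insensitive
subject = msg["subject"]
body = msg.get_payload(decode=True).strip()  # bytes on Python 3

print(sender, subject, body)
```

The object returned is the whole message, not just its headers, which is why the same object serves both the header lookups and get_payload.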

Armed with a better understanding of smtpd, we can greatly expand the code. I’ve created a very simple callback-based SMTP server implementation using the smtpd module, which I’ve dubbed Courier:

#!/usr/bin/env python

import asyncore
from email import message_from_string
from email.iterators import typed_subpart_iterator
from email.utils import formataddr
import logging
from smtpd import SMTPServer
import sys


class Courier(SMTPServer, object):
    """
        Courier class, built on top of Python's SMTPServer.
    """

    def __init__(self, cb_fn, cb_data, *args, **kwargs):
        super(Courier, self).__init__(*args, **kwargs)
        self._cb_fn = cb_fn
        self._cb_data = cb_data

    def process_message(self, peer, mailfrom, rcpttos, data):
        """
            Process a single inbound message.
        """
        headers = message_from_string(data)
        body = ""

        # multipart
        if headers.is_multipart():
            for part in typed_subpart_iterator(headers, "text", "plain"):
                body += part.get_payload(decode=True).strip()

        # message body
        else:
            body += headers.get_payload(decode=True).strip()

        # compile the data
        mail_data = {
            "sender": headers["from"],
            "recipient": headers["to"],
            "subject": headers["subject"],
            "body": body
        }
        logging.debug(mail_data)
        
        return self._cb_fn(mail_data, self._cb_data)

if __name__ == "__main__":
    # parser and start_courier are not shown in this excerpt; see the full Gist.
    args = parser.parse_args()
    start_courier(args)

To use it, simply define a callback function which processes the message described by mail_data. This callback can be set up to perform a wide array of different functions, such as saving the email to a MySQL database or text file, analyzing its contents with NLP, or just dumping its contents to stdout/stderr. Note that this implementation is unable to process email attachments; I’ll leave that as an exercise for you. Hint: you’ll probably have to make use of base64 encoding, and Python just happens to have its own built-in library to help with that.
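As a rough sketch of what such a callback might look like, the function below appends each message to a text file as one JSON object per line. The name save_to_file, the use of cb_data as a file path, and the sample message are assumptions for illustration and are not taken from the Gist. The sketch assumes Python 3, where the body may arrive as bytes.

```python
import json
import os
import tempfile


def save_to_file(mail_data, cb_data):
    """Append one message as a JSON line to the file at path cb_data."""
    record = dict(mail_data)
    # On Python 3 the decoded body may be bytes, which JSON cannot store.
    if isinstance(record.get("body"), bytes):
        record["body"] = record["body"].decode("utf-8", "replace")
    with open(cb_data, "a") as fh:
        fh.write(json.dumps(record) + "\n")
    # smtpd treats a None return from process_message as "accept the message".
    return None


# Exercise the callback against a temporary file with a made-up message.
sample = {
    "sender": "alice@example.com",
    "recipient": "courier@example.com",
    "subject": "Hi",
    "body": b"Hello there",
}
path = os.path.join(tempfile.mkdtemp(), "inbox.jsonl")
save_to_file(sample, path)

with open(path) as fh:
    saved = json.loads(fh.readline())
```

Given Courier’s constructor signature, wiring this in would presumably look like Courier(save_to_file, "inbox.jsonl", ("0.0.0.0", 25), None) followed by asyncore.loop().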

Of course, there’s one huge catch to all this: smtpd doesn’t have a built-in mail relay (also known as a mail transfer agent, or MTA), which is required to look up destination IP addresses and send email. This means that “outbound” emails sent from your @domain server don’t actually get sent to the recipient; they simply get intercepted by Courier and processed by whatever callback you set. Creating a fully functional and customizable SMTP server in Python is a much bigger project. slimta and TwistedMail are strong resources for better understanding how a fully functional local SMTP server is implemented.

For simplicity, I’ve pasted all of the Courier code with some example callbacks into a GitHub Gist. Since my machine is only accessible through SSH, I’ve set the server to run continuously in a screen session:

screen -dR courier
sudo python courier.py -c bounce

If you take a closer look at the code, you’ll see that the “bounce” command attaches a nice blurb to the beginning of the email sent to it. Try it yourself! Send an email to courier [at] frankzliu [dt] com to receive an automated reply from my experimental email address [1]. YMMV: I’ve already disabled CAPTCHA and enabled less secure apps on that Google account, so using Gmail as an outgoing server works fine for me. Most public mail services such as Google and Yahoo put restrictions on the sender email address, which prevents me from using custom email addresses.
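For a feel of what a bounce-style callback has to do, here’s a hedged sketch (Python 3) that builds the reply. It is not the code from the Gist: the blurb wording, the function names, the reply address, and the use of port 465 over SSL are all hypothetical choices made for illustration.

```python
import smtplib
from email.message import EmailMessage

# Hypothetical blurb; the real wording lives in the Gist.
BLURB = "Courier received your message. Here is what you sent:\n\n"


def build_bounce(mail_data, reply_from):
    """Build a reply that puts BLURB in front of the original body."""
    body = mail_data["body"]
    if isinstance(body, bytes):  # decoded payloads are bytes on Python 3
        body = body.decode("utf-8", "replace")
    msg = EmailMessage()
    msg["From"] = reply_from
    msg["To"] = mail_data["sender"]
    msg["Subject"] = "Re: " + (mail_data["subject"] or "")
    msg.set_content(BLURB + body)
    return msg


def send_via_gmail(msg, user, password):
    """Relay the reply through Gmail's outgoing server (not called here)."""
    with smtplib.SMTP_SSL("smtp.gmail.com", 465) as smtp:
        smtp.login(user, password)
        smtp.send_message(msg)


# Build a reply for a made-up inbound message; nothing is sent.
demo = build_bounce(
    {
        "sender": "alice@example.com",
        "recipient": "courier@example.com",
        "subject": "Hello",
        "body": "Original text",
    },
    "bot@example.com",
)
```

Because the reply goes out through a public mail service, the From address ends up being that account’s own address, which is the restriction mentioned above.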

[1] It’s likely that it’ll get put into your spam folder, so be sure to check that!