That really is the simplest answer.
On Stack Overflow, xRobot asked for guidance on setting up a system which would send 100,000 e-mails every week to a variety of addresses. This is actually quite tricky, as was demonstrated in Piskvor’s rather awesome answer. Here it is:
Short answer: While it’s technically possible to send 100k e-mails each week yourself, the simplest, easiest and cheapest solution is to outsource this to one of the companies that specialize in it (I did say “cheapest”: there’s no limit to the amount of development time (and therefore money) that you can sink into this when trying to DIY).
Long answer: If you decide that you absolutely want to do this yourself, prepare for a world of hurt (after all, this is e-mail/e-fail we’re talking about). You’ll need:
- e-mail content that is not spam (otherwise you’ll run into additional major roadblocks on every step, even legal repercussions);
- in addition, your content should be easy to distinguish from spam — that may be a bit hard to do in some cases (I heard that a certain pharmaceutical company had to all but abandon e-mail, as their brand names are quite common in spam mailings);
- a configurable SMTP server of your own — one which won’t buckle when you dump 100k e-mails onto it (your ISP’s upstream server won’t be sufficient here and you’ll make the ISP violently unhappy; we used two dedicated boxes);
- some mail wrapper (e.g. PHPMailer if PHP’s your poison of choice; using PHP’s mail() is horrible enough by itself);
- your own sender function to run in a loop, create the mails and pass them to the wrapper (note that you may run into PHP’s memory limits if your app has a memory leak; you may need to recycle the sending process periodically, or even better, decouple the “creating e-mails” and “sending e-mails” altogether).
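To make the “decouple creating from sending” idea concrete, here is a minimal sketch in Python rather than PHP. Everything in it is an assumption for illustration: the function names, the sender address, the subject line and the `deliver` callback (which stands in for whatever mail wrapper you actually use). The bounded queue is the point: the producer can never run far ahead of the sender, so memory use stays flat however long the list is.

```python
import queue
import threading
from email.message import EmailMessage


def build_messages(recipients, sender, subject, body):
    """Producer: lazily create one message per recipient."""
    for address in recipients:
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = address
        msg["Subject"] = subject
        msg.set_content(body)
        yield msg


def send_worker(outbox, deliver):
    """Consumer: take messages off the queue and pass them to `deliver`."""
    while True:
        msg = outbox.get()
        if msg is None:  # sentinel: the producer has finished
            break
        deliver(msg)


def run(recipients, deliver, max_pending=100):
    """Wire producer and consumer together through a bounded queue."""
    outbox = queue.Queue(maxsize=max_pending)  # put() blocks when full
    worker = threading.Thread(target=send_worker, args=(outbox, deliver))
    worker.start()
    # Hypothetical sender, subject and body, for illustration only.
    for msg in build_messages(recipients, "news@example.com",
                              "Weekly update", "Hello!"):
        outbox.put(msg)
    outbox.put(None)
    worker.join()
```

In a real system the queue would more likely be a database table or a message broker, so that the sending process can be restarted without losing its place.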
Surprisingly, that was the easy part. The hard part is actually sending it:
- Some servers will ban you when you send too many mails close together, so you need to shuffle and watch your queue (e.g. send one mail to someone@example.com, then three to other domains, only then another to someone-else@example.com).
- You need to have correct PTR, SPF, DKIM records.
- You need to handle remote server timeouts, misconfigured DNS records, and other network pleasantries.
- You need to handle invalid e-mails (and no, regex is the wrong tool for that).
- You need to handle unsubscriptions (many legitimate newsletters have been reclassified as spam due to many frustrated users who couldn’t unsubscribe in one step and instead chose to “mark as spam” — the spam filters do learn, especially with large e-mail providers).
- You need to handle bounces and rejects (“no such mailbox firstname.lastname@example.org”; “mailbox email@example.com full”).
- You need to handle blacklisting and removal from blacklists. (Sure, you’re not sending spam. Some recipients won’t be so sure: with such a large list, it will happen sometimes, no matter what precautions you take. Some people (e.g., your not-so-scrupulous competitors) might even go as far as to falsely report your mailings as spam; it does happen. On average, it takes weeks to get yourself removed from a blacklist.)
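The queue shuffling from the first item in that list can be sketched as a round-robin over recipient domains. This is only one possible approach, and the function names are hypothetical. It spreads same-domain addresses as far apart as the list allows, but it does nothing about timing, so a real sender would still need per-domain rate limits on top.

```python
from collections import defaultdict, deque


def domain_of(address):
    """Return the lower-cased part after the last '@'."""
    return address.rsplit("@", 1)[-1].lower()


def interleave_by_domain(addresses):
    """Order addresses round-robin across domains.

    Consecutive mails go to different domains for as long as more than
    one domain still has addresses waiting.
    """
    buckets = defaultdict(deque)
    for address in addresses:
        buckets[domain_of(address)].append(address)

    ordered = []
    pending = deque(buckets.values())
    while pending:
        bucket = pending.popleft()
        ordered.append(bucket.popleft())
        if bucket:  # this domain still has recipients: back of the line
            pending.append(bucket)
    return ordered
```

Once only a single domain is left in the queue, its remaining addresses necessarily come out back to back, which is where a delay between sends has to take over.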
And to top it off, you’ll have to manage the legal part of it (various federal, state, and local laws; and even different tangles of laws once you send outside the U.S. (note: you have no way of finding out whether firstname.lastname@example.org lives in Southwest Elbonia, the country with the world’s most draconian antispam laws)).
I’m pretty sure I missed a few heads of this hydra — are you still sure you want to do this yourself? If so, there’ll be another wave, this time merely the annoying problems inherent in sending an e-mail. (You see, SMTP is a store-and-forward protocol, which means that your e-mail will be shuffled across many SMTP servers around the Internet, in the hope that the next one is a bit closer to the final recipient. Basically, the e-mail is sent to an SMTP server, which puts it into its forward queue; when time comes, it will forward it further to a different SMTP server, until it reaches the SMTP server for the given domain. This forward could happen immediately, or in a few minutes, or hours, or days, or never.) Thus, you’ll see the following issues — most of which could happen en route as well as at the destination:
- The remote SMTP servers don’t want to talk to your SMTP server.
- Your mails are getting marked as spam (<blink> is not your friend here, nor is …).
- Your mails are delivered days, even weeks late (contrary to popular opinion, SMTP is designed to make a best effort to deliver the message sometime in the future — not to deliver it now).
- Your mails are not delivered at all (already sent from e-mail server on hop #4, not sent yet from server on hop #5, the server that currently holds the message crashes, data is lost).
- Your mails are mangled by some poorly designed server en route (this one is somewhat solvable with base64 encoding, but then the size goes up and the e-mail looks more suspicious).
- Your mails are delivered and the recipients seem not to want them (“I’m sure I didn’t sign up for this, I remember exactly what I did a year ago” (of course you do, sir)).
- There are problems with users with various versions of Microsoft Outlook and its unique handling of Internet mail.
- You hit wizard’s apprentice mode (a self-reinforcing positive feedback loop — in other words, automated e-mails as replies to automated e-mails as replies to…; you really don’t want to be the one to set this off, as you’d anger half the internet at yourself).
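For the last item in that list, the usual defence is to label automated mail as automated and never auto-reply to anything so labelled. The sketch below assumes the `Auto-Submitted` header from RFC 3834 plus the informal `Precedence` and `List-Id` conventions. The function names are hypothetical, and a production responder would also want checks this sketch leaves out, such as rate limiting per sender.

```python
from email.message import EmailMessage


def should_auto_reply(msg):
    """Return False when the incoming message looks automated."""
    # RFC 3834: any value other than "no" marks an automatic message.
    auto_submitted = (msg.get("Auto-Submitted") or "no").strip().lower()
    if auto_submitted != "no":
        return False
    # Informal but widely used markers for bulk and mailing-list traffic.
    precedence = (msg.get("Precedence") or "").strip().lower()
    if precedence in {"bulk", "list", "junk"}:
        return False
    if msg.get("List-Id") is not None:
        return False
    return True


def mark_as_automated(msg):
    """Label an outgoing mass mailing so responders can ignore it."""
    msg["Auto-Submitted"] = "auto-generated"
    return msg
```

Marking your own mailings this way protects you only from well-behaved responders; the badly behaved ones are why you also watch the reply mailbox.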
And it’ll be your job to troubleshoot and solve this (hint: you can’t, mostly). The people who run legit mass-mailing businesses know that in the end you can’t solve it, and that they can’t solve it either. They have the reasons well researched, documented and outlined (maybe even as a PowerPoint presentation, complete with sounds and cool transitions, that your bosses can understand), as they’ve had to explain this a million times before. Plus, for the problems that are actually solvable, they know very well how to solve them.
If, after all this, you are not discouraged and still want to do this, go right ahead: it’s even possible that you’ll find a better way to do this. Just know that the road ahead won’t be easy — sending e-mail is trivial, getting it delivered is hard.
I’ve rewritten that slightly to tweak the grammar and to avoid a couple of unnecessary and potentially triggering metaphors. As good as it is, it’s not the last word on the subject. Here’s more advice, from splattne, on how not to be marked as a spammer:
Be sure that your e-mails don’t look like typical spam e-mails: don’t insert only a large image; check that the character-set is set correctly; don’t insert “IP-address only” links. Write your communication as you would write a normal e-mail. Make it really easy to unsubscribe or opt-out. Otherwise, your users will unsubscribe by pressing the “spam” button, and that will affect your reputation.
On the technical side: if you can choose your SMTP server, be sure it is a “clean” SMTP server. IP addresses of spamming SMTP servers are often blacklisted by other providers. If you don’t know your SMTP servers in advance, it’s a good practice to provide configuration options in your application for controlling batch sizes and delay between batches. Some mail servers don’t accept large sending batches or continuous activity.
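As a rough illustration of configurable batch sizes and delays, here is a small Python sketch. The names and default values are made up for the example, and `deliver` stands in for whatever actually hands a message to the SMTP server. The right numbers depend entirely on the servers you are talking to, which is exactly why they should be configuration options rather than constants.

```python
import time
from itertools import islice


def batches(items, batch_size):
    """Yield successive lists of at most `batch_size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def send_in_batches(messages, deliver, batch_size=50,
                    delay_seconds=30.0, sleep=time.sleep):
    """Deliver messages batch by batch, pausing between batches.

    `sleep` is injectable so the pacing can be tested without waiting.
    Returns the number of messages handed to `deliver`.
    """
    sent = 0
    first = True
    for batch in batches(messages, batch_size):
        if not first:
            sleep(delay_seconds)  # pause before every batch but the first
        first = False
        for message in batch:
            deliver(message)
            sent += 1
    return sent
```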
Use e-mail authentication methods, such as SPF and DomainKeys, to prove that your emails and your domain name belong together. The nice side-effect is that you help prevent your email domain from being spoofed. Also check your reverse DNS to make sure the IP address of your mail server points to the domain name that you use for sending mail.
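The reverse DNS check can be scripted. The sketch below is one way to do it with Python’s standard library, under some assumptions: the function name is hypothetical, it handles IPv4 only (that is all `socket.gethostbyname_ex` resolves), and it treats any lookup failure as “does not match”. It checks that the PTR record for the IP names a host inside your sending domain, and that the host resolves back to the same IP.

```python
import socket


def reverse_dns_matches(ip_address, sending_domain,
                        reverse_lookup=socket.gethostbyaddr,
                        forward_lookup=socket.gethostbyname_ex):
    """Check that PTR and A records agree for a mail server's IP.

    The lookup functions are parameters so the check can be exercised
    without touching the network.
    """
    try:
        hostname, _aliases, _addrs = reverse_lookup(ip_address)
        _name, _aliases, forward_ips = forward_lookup(hostname)
    except OSError:  # covers socket.herror and socket.gaierror
        return False

    hostname = hostname.rstrip(".").lower()
    domain = sending_domain.rstrip(".").lower()
    in_domain = hostname == domain or hostname.endswith("." + domain)
    return in_domain and ip_address in forward_ips
```

Called as `reverse_dns_matches("203.0.113.10", "example.com")`, it performs real DNS lookups; the address here is a documentation placeholder, not a real server.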
Make sure that the reply-to address of your emails is a valid, existing address. Use the full, real name of the addressee in the To field, not just the email address (e.g. "John Doe" <email@example.com>), and monitor your abuse accounts, such as abuse@example.com and postmaster@example.com.
Of course, on the other end, it’s also important to protect against spam coming in.
The copyright for the two essays quoted above rests with their original authors. They were originally published on Stack Overflow and Super User, respectively. Both of those essays, and this entire blog post, are under the license CC BY-SA 3.0. Feel free to repost elsewhere.