Campaigns App: Why did my email test go to my SPAM folder?
Introduction
Before a large send, it is common to test the message by sending it to a small list of recipients, to make sure that the message looks the way you want and that everything is set up correctly. These smaller test messages often land in the SPAM folder, which can be a cause for concern. However, receiving mail servers handle an incoming message differently depending on whether it is going to a limited number of recipients or to hundreds, thousands, or millions of recipients. The two cases are evaluated with different algorithmic profiles to determine placement.
Definitions
First, it's important to understand the difference between anti-spam filtering and Greylist filtering. You can think of anti-spam filters as bouncers at a night club. They primarily check the back-end mechanics of the message, ensuring that all of the records needed to designate a sender as not SPAM are in place and correct. These include SPF records, which allow a domain to specify which servers are allowed to send on its behalf, and DKIM records, a more complex form of verification in which the sending server signs each message with a private key and the receiving server verifies the signature against a public key published in the signing domain's DNS. Together, these records allow a receiving server to determine whether the message passes DMARC, which checks that the domains verified by SPF and DKIM align with the domain in the From address. Anti-spam filters may also check the message content at a high level. They look for links to websites on a known bad or dangerous list, suspicious images or attachments, and any major red flags that would identify the message as SPAM. Once these checks pass, the message is allowed into the recipient's mailbox. However, that is just half of the fight. Once a message is allowed into the mailbox, a form of soft filtration, known generally as Greylisting, takes over.
Greylist filtering is a different mechanism that many large providers now use. Greylisting is an algorithmic set of rules based on different criteria. It tells the hosting server, Gmail for example, where to place the message in the user's mailbox once the anti-spam filter has cleared it. This can mean the message is placed in the inbox, in a Social or Promotions tab, or even in the Junk or Bulk folder. For small-scale sends, meaning normal email messages, Greylisting is weighted heavily toward the individual recipient's patterns. The receiving server builds a profile of the recipient and places the mail in the folder that the profile suggests is correct. This profile updates over time: marking messages as not spam, or otherwise interacting with them, tells the receiving server where to place similar messages in the future.
Greylist filtering for mass mailings or large-scale sends, on the other hand, relies much less, if at all, on the individual recipient. Instead, it develops a profile based on previous interactions with similar messages across the entire server, comparing subject, content, and most importantly the sending domain. If Gmail users, for example, have historically tended to interact with messages that come from our shared domain, then Gmail assumes that users will also want to receive future messages from that domain in their inbox. This is one of the big benefits of the shared sending domain. By banding together senders whose volumes are too low to establish a proper profile with servers like Google's, we are able to ensure a high rate of deliverability without the added cost of configuring custom domains, feedback loops, and DNS records, or the monitoring that comes with a custom domain.
The problem you are facing is that when you test with a small number of recipients, the receiving servers don't treat the message as a mass mailing. Instead, they see it as an email with few recipients, and therefore use the perceived profile of each individual recipient to determine placement. This is why small-scale tests usually aren't very accurate at predicting mailbox delivery: small sends and large sends are treated under different rules, so the test results are often skewed.
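If you want to confirm that a test message which landed in SPAM at least passed the authentication checks described under Definitions, one option is to open the raw message source in your mail client and look at its Authentication-Results header. The following is a minimal Python sketch of that check, not a tool provided by the Campaigns App. The sample message, its domains, and the function name auth_verdicts are hypothetical, and the exact header layout varies by provider.

```python
import re
from email import message_from_string

# Hypothetical raw message source; all header values are invented for illustration.
RAW_MESSAGE = """\
Authentication-Results: mx.example.net;
 spf=pass smtp.mailfrom=bounce.example.com;
 dkim=pass header.d=example.com;
 dmarc=pass header.from=example.com
From: Sender <news@example.com>
To: tester@example.net
Subject: Test send

Hello
"""


def auth_verdicts(raw):
    """Return the SPF, DKIM, and DMARC results found in Authentication-Results headers."""
    msg = message_from_string(raw)
    verdicts = {}
    # A message can carry more than one Authentication-Results header.
    for header in msg.get_all("Authentication-Results", []):
        # Match "spf=pass", "dkim=fail", and so on; the word boundary keeps
        # properties such as "header.d=" from being mistaken for a method.
        for method, result in re.findall(
            r"\b(spf|dkim|dmarc)=(\w+)", header, re.IGNORECASE
        ):
            verdicts[method.lower()] = result.lower()
    return verdicts


print(auth_verdicts(RAW_MESSAGE))
```

If all three results read "pass" and the test still went to SPAM, the placement most likely came from the recipient-profile filtering described above rather than from a problem with the sending setup.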