Skip to main content

Trusted Email Connection Signing (rev 0.2)

IMPORTANT: This blog post deprecates my previous posting on this subject. The blog post Proposal for connection signing reputation system for email is deprecated.


Sign the medium, not the message



The motivation behind TECS (Trusted Email Connection Signing) is that what managers of MX servers on the public Internet really care about is the ability to distinguish a good connection (coming from a legitimate sender and which will be used to send wanted email) from a bad connection (coming from a spammer). If you can identify a bad connection (today, you do that using an RBL or other reputation service based on the IP address of the sender) you can tarpit or drop it, or subject the mails sent on the connection to extra scrutiny. If you can identify a good connection it can bypass spam checks and help reduce the overall false positive rate.

If you are a legitimate bulk mailer (an email marketer, for example) then you care deeply that you reputation being recognizable and that mail sent from you be delivered. Currently, you have to carefully tend your IP addresses to make sure that they don't appear on blacklists, and you have to ensure that new IP addresses are clean.

If you are running a large email service (e.g. Yahoo! Mail) then you are currently trying to build white and blacklists of IP addresses, when what you really want are white and black lists of entities.

Currently, the options used to identify a bad connection are rather limited (RBLs, paid reputation services and grey listing), and good connections are hard to manage (whitelists on a per-recipient basis, or pay-per-mail services). What's needed is a different approach.

The idea is to identify and determine the reputation of the entity connecting to a mail server in real-time without resorting to a blacklist or whitelist. This is done by signing the connection itself. With the signature on a per-connection basis a mail server is able to determine who is responsible for the connection, and then look up that entity's reputation in a database.

Current reputation databases are based on IP addresses. This is a very inflexible system: IP addresses must be added to blacklists very fast as spammers churn through zombie machines, and any legitimate emailer needs to make sure their mail servers are whitelisting with multiple email providers (e.g. Yahoo!, Gmail, Brightmail, ...) to ensure delivery. And if a legitimate mailer wants to bring on line new servers, with new IP addresses they have to run through the entire whitelisting process again.

This is inefficient. The mapping between IP address and entities (e.g. knowing that Google's Gmail services uses a specific set of IP addresses) is unwieldy to manage and the wrong level of granularity. Google should be free to add and remove email servers at will, while carrying their good reputation with them.

That's what TECS gives you.

Connection Signing

TECS is an extension to the existing SMTP AUTH mechanism (see RFC 2554) and implements an authentication mechanism that I'll refer to as TECS-1 (the 1 here acts as a version number on the protocol). TECS-1 would need to be registered as a SASL (see RFC 2222) authentication mechanism.

When a mail sender connects to an SMTP server wishing to sign its connection it issues the EHLO command and if that SMTP server is capable of handling AUTH the mail sender then signs the connection using the AUTH command, with the TECS-1 mechanism followed by an initial response (which contains the TECS signature) as defined in RFC 2554.

Here's an example session:

S: 220 smtp.example.com ESMTP server ready
C: EHLO jgc.example.com
S: 250-smtp.example.com
S: 250 AUTH CRAM-MD5 DIGEST-MD5 TECS-1
C: AUTH TECS-1 PENCeUxFREJoU0Nnbmh1YWVlMDljNDBhZjJiODRhMGMyYjNiYmFlNzg2ZQ==
S: 235 Authentication successful.

The 'initial response' section of the of the AUTH command is a base-64 encoded string containing the following structure (this is deliberately similar to the DKIM fields):

a=rsa-sha256; q=dns; d=jgc.org; b=oU0Nnbmh1YWVlMDljNDBhZjJiO==

a= is the cryptographic method used (default would be RSA/SHA-256 with suitable padding as described in PKCS#1 version 1.5 RFC 3447).

d= is the name of the domain signing the connection. In the example above I am showing a connection that is being signed (and hence claimed by) jgc.org.

q= is a query type with the default being the use of a DNS TXT record. This query method is used to obtain the public key associated with the signing domain. The public key would be obtained by looking up _tecs.jgc.org and getting the associated TXT record.

b= is the binary signature for the connection generated using the method in a= by the d= domain.

The connecting server signs the tuple consisting of ( destination IP/port, source IP/port and epoch ); that way they sign the current connection and verify that they are responsible for the mail sent across it.

Each entity has an RSA key public/private key pair. When signing a connection the entity generates a SHA-256 hash of the tuple. The destination IP/port pair is the IP address and port on the mail server that the mail sender is currently connected to; similarly the source IP/port pair is the IP address and port of the connection being used by the
mail sender. The epoch is the standard Unix epoch rounded to the nearest 30 seconds.

The entity making the connection then encrypts the hash with their private key.

Update (February 8, 2007): A number of people have suggested getting the public key from the _domainkey (i.e. DKIM) label of the d= domain. This seems like a good idea since there's no need to reinvent the wheel.

Update (February 9, 2007). A few people pointed me in the direction of the MARID CSV proposal (see CSV). I've addressed this below.

I don't want to sign my outbound SMTP connections
Well don't then. If TECS were implemented you'd probably find you'd have trouble getting your mail delivered as non-signing (and hence not-taking-responsibility) would be looked upon very poorly.

I'm big ISP and I don't want to sign as myself, can I sign as a customer?
Sure, for example, jgc.org's mail is actually handled by lists.herald.co.uk. When that server was sending mail for jgc.org it could sign as jgc.org as long as it has access to jgc.org's private key. It would simply specify d=jgc.org in the TECS data.

Why don't you just use STARTTLS with certificates?
Because that's a very heavyweight system, designed for something else. A SASL extension using SMTP AUTH is simple and clean.

Why do you think connection signing is useful?
Because SMTP server resources are precious. Being able to make a decision about a connection before any mail is delivered is very useful. An SMTP server owner could use reputation data from a public or private source to decide whether to accept or reject a connection, slow down a connection, apply little or extra scrunity to a connection, etc. Being able to do this before receiving a ton of mail and tying up a server is very valuable.

Won't spammers just sign their connections?
No doubt, but that's hardly a worry, being able to identify the good senders fast is the most important goal.

Why don't you just signed the destination IP/port pair? That's known before the connection is made and avoids problems with NAT
TECS could just sign the tuple ( destination IP, port, epoch ) but I think it's a bit weaker than my proposal. Since the destination IP and port are fixed for a given MTA the signature is really a signature on the time. An eavesdropper could reply the signature within 30 seconds (or other timeone on the epoch value) and get an authenticated connection from any source IP address.

What about the MARID CSV proposal?
The CSV proposal is a lightweight (DNS-based), non-cryptographic method of estabilishing whether a host claiming a certain domain name in HELO/EHLO is authorized to be an SMTP client. Clearly, CSV aims to provide a simple method of determining whether a connecting SMTP client is authorized to be an SMTP client with the claimed name. This seems like a useful extension, but is very different from TECS. TECS operates at the level of a specific connection, and with an entity that is distinct from the domain of the SMTP client. This is valuable for two reasons: it allows the identity to be moved from SMTP service provider to provider, and it means that shared SMTP servers can operate claiming different 'responsible parties' for each connection. This latter point is important for ISPs that provide SMTP services to email marketers where the same SMTP server may be shared across many clients. This can result in a clean emailer being blacklisted because the IP of the shared server was blacklisted because of some other unrelated misbehaviour.

Whilst CSV is a useful extension which would help with the zombie problem, it does not address the needs at the connection level where I believe the problem needs to be addressed.

CSV also provides specific services for checking domain names against accreditation services. That is outside the scope of TECS, although the assumption is that such services would exist for TECS signed connections against the domain name claiming responsbility. The bottom line is that TECS deals with the party responsible for a connection, CSV the party responsible for the server.

What about mailing lists that forward mail?
By signing their connections they take responsibility for the mails they are sending. So mailing lists would need to have appropriate email policies in place for unsubscriptions, and deal themselves with spam to the list. Since the connection is signed any concern about munging of From: addresses for VERP handling, or adding headers/footers to email are irrelevant.

Is this compatible with SPF, Sender-ID, DomainKeys?
They are orthogonal. There's no direct interaction. Although, it might be sensible to use the _domainkey record from DKIM to obtain a public key thus sharing the same key between DKIM and TECS.

Will this reduce spam?
I'm not going to make any predictions. The goal would be to build a database that makes it easier to recognize someone who is legitimate, and scrutinize those who abuse the system or who choose not to sign.

What about anonymity?
Anoymous remailers are unaffected. They could sign their outbound connections with the system but that would not affect any changes they make to anonymize messages since its the conneciton, not the message content that's signed.

What if I change the mail servers or IP addresses I am using?
There's no effect. Keep signing the connections and you can take responsibility for any IP address you want to.

I think you are wrong, right, stupid, a genius.
Please comment here, or write to me directly.


Many thanks to all members of the REDACTED discussion forum, and to Toby DiPasquale.

Comments

Justin Mason said…
I dunno John -- I would say the massive deployment of SMTP+TLS is still a major plus in its favour, over a new scheme. getting changes made to SMTP infrastructure is *extremely* slow.

Also, the fact that the connection winds up encrypted with SMTP+TLS is, if anything, a bonus!
I don't agree about the encryption part. That's a major minus from my perspective because of the cost of encrypted connections.

John.
Unknown said…
Sorry I didn't get to reply to your questions about my original comment.
I was initially thinking you could have the string added here so that it could be ignored or used without any new states/commands in the SMTP conversation:

220 mx.google.com ESMTP 18si4862540nzo
EHLO example.com TECS:*TECS-string-here*
250-mx.google.com at your service, [w.x.y.z]
250-SIZE 20971520
250-8BITMIME
250 ENHANCEDSTATUSCODES
QUIT
221 2.0.0 mx.google.com closing connection 18si4862540nzo

But using existing standards seems better probably.

The things I'd like to see are:

* Use the exact same key as DK/DKIM that way you have the bond between the two sigs; connection level and individual email level and you maybe can do just one dns lookup. In order to easily use the DK dns records, maybe have the selector string be a part of TECS AUTH.
Just use s:_blah.example.com instead of the d:example.com field.
That way you have a huge amount of flexibility in specifying the signing key and room for the standard to easily evolve.

* For the signed fields, have them all be things that the client knows prior to making the connection (so that the TECS can be built by an async process just prior to initating the connection), maybe: your server IP, the time, (not sure what else). The question here is do the fields that are signed beyond those two really matter? What value is added by signing certain fields vs others? The client doesn't know the ports beforehand, but those can be/are tracked by the server. You (the server) have all those fields src IP/port, dest IP/port, something signed by the client, the time of the connection - what spoofing is prevented or record keeping is enhanced with the additional fields encrypted? By signing only the dest IP and time, it would allow for TECS strings to be cached in a SMTP connection cache, since it could be reused for 30 or whatever seconds.

About the outgoing IP address - People will argue a "perfect world" scenario in regards to this - but do mail servers really always know what their external outgoing IP is? I say no with all the load balancing, NATing, proxying that goes on, but...

Other than those two things, I think its a great idea.

Why is the discussion forum a secret?
Unknown said…
What exactly is the cost? 5% CPU? Are you 100% maxed out on CPU already?
Unknown said…
"The idea is to identify and determine the reputation of the entity.... This is done by signing the connection itself."

Signing the connection does not give you reputation. It gives you an entity whose reputation can be measured.

These seem to be highly conflicting statements:
-[problem]...determine the reputation of the entity connecting to a mail server in real-time without resorting to a blacklist or whitelist.
-[solution]...determine who is responsible for the connection, and then look up that entity's reputation in a database.

White and Blacklists *are* reputations stored in a database. They are generally not very granular though.

It seems to me your root problem is that you want better ways to measure reputation than the blackbox algorithms RBLs/accreditation entities use. But, this protocol just changes the entity that gets measured.

Assume everyone in the world changes their MTAs to do the AUTH TECS-1 command today. Why won't RBLs begin to generate domain blacklists using the same mechanisms to determine the list as today?

I do think that domain is a better entity to measure than IP. But, that's exactly why the industry has been working on sender authentication mechanisms for several years. With DKIM in IESG last call, why not just use that domain to do your reputation lookup?
Unknown said…
A novel approach, and I like the simplicity of it. Have you tried to quantify the impact from the added step of a DNS lookup request? You're asking the MTA to rely on another process for a timely response (one that's more than likely on a completely different computer, seperate network segment, etc).

On a similar note, if you want to use the DKIM format for the TXT records then it's important to note some flavours of DNS don't support the leading underscore the DKIM record requires. BIND seems okay with it, but the MS implementation seems to silently ignore it (which makes it devillishy difficult to diagnose if you don't know what to look for).
Unknown said…
@Michael
"but the MS implementation seems to silently ignore it"
which makes this pretty ironic:
dig _spf-a.microsoft.com TXT
...
;; ANSWER SECTION:
_spf-a.microsoft.com. 3600 IN TXT "v=spf1 ip4:213.199.128.139 ip4:213.199.128.145 ip4:207.46.50.72 ip4:207.46.50.82 a:delivery.pens.microsoft.com a:mh.microsoft.m0.net ~all"
Unknown said…
Well, the MS server that replied to my query for the SPF record was fingerprinted as:
fingerprint (207.68.160.190, 207.68.160.190): Microsoft Windows 2003
Unknown said…
How *does* this address the problems for MTAs behind a NAT? The MTA has no knowledge of what the public IP will be.

Popular posts from this blog

Your last name contains invalid characters

My last name is "Graham-Cumming". But here's a typical form response when I enter it:


Does the web site have any idea how rude it is to claim that my last name contains invalid characters? Clearly not. What they actually meant is: our web site will not accept that hyphen in your last name. But do they say that? No, of course not. They decide to shove in my face the claim that there's something wrong with my name.

There's nothing wrong with my name, just as there's nothing wrong with someone whose first name is Jean-Marie, or someone whose last name is O'Reilly.

What is wrong is that way this is being handled. If the system can't cope with non-letters and spaces it needs to say that. How about the following error message:

Our system is unable to process last names that contain non-letters, please replace them with spaces.

Don't blame me for having a last name that your system doesn't like, whose fault is that? Saying "Your last name …

All the symmetrical watch faces (and code to generate them)

If you ever look at pictures of clocks and watches in advertising they are set to roughly 10:10 which is meant to be the most attractive (smiling!) position for the hands. They are actually set to 10:09.14 if the hands are truly symmetrical. CC BY 2.0image by Shinji
I wanted to know what all the possible symmetrical watch faces are and so I wrote some code using Processing. Here's the output (there's one watch face missing, 00:00 or 12:00, because it's very boring):



The key to writing this is to figure out the relationship between the hour and minute hands when the watch face is symmetrical. In an hour the minute hand moves through 360° and the hour hand moves through 30° (12 hours are shown on the watch face and 360/12 = 30).
The core loop inside the program is this:   for (int h = 0; h <= 12; h++) {
    float m = (360-30*float(h))*2/13;
    int s = round(60*(m-floor(m)));
    int col = h%6;
    int row = floor(h/6);
    draw_clock((r+f)*(2*col+1), (r+f)*(row*2+1), r, h, floor(m…

Importing an existing SSL key/certificate pair into a Java keystore

I'm writing this blog post in case anyone else has to Google that. In Java 6 keytool has been improved so that it now becomes possible to import an existing key and certificate (say one you generated outside of the Java world) into a keystore.

You need: Java 6 and openssl.

1. Suppose you have a certificate and key in PEM format. The key is named host.key and the certificate host.crt.

2. The first step is to convert them into a single PKCS12 file using the command: openssl pkcs12 -export -in host.crt -inkey host.key > host.p12. You will be asked for various passwords (the password to access the key (if set) and then the password for the PKCS12 file being created).

3. Then import the PKCS12 file into a keystore using the command: keytool -importkeystore -srckeystore host.p12 -destkeystore host.jks -srcstoretype pkcs12. You now have a keystore named host.jks containing the certificate/key you need.

For the sake of completeness here's the output of a full session I performe…