Developer Samwho has built a 21-question quiz that tests your knowledge of what is and is not allowed in an email address. It starts easy but gets hard fast. Familiarity with RFC 822, RFC 2822, RFC 5322, and RFC 6532 is useful, but you’ll still be surprised (or annoyed) by some of the answers, such as @.
16/21. However, I checked, and all of the claimed address-literal examples (“[…]” domain part) are categorically wrong, which irritated me because I reviewed RFC 5321, where they’re specified, before it was published (I was astonished to find that John Klensin was kind enough to acknowledge me in the document for some of my comments about implicit MX logic on IPv6). As an example, IPv6 address literals always use “ipv6:”, so “magic@[::1]” should actually be “magic@[ipv6:::1]”. I imagine his testing was based purely on whether a validation library accepted the address in a generic way, but mail to it certainly would not be delivered as written.
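To make the address-literal point concrete, here is a minimal Python sketch of the tagged form described above. The function name `classify_address_literal` is made up for illustration, it leans on the standard `ipaddress` module rather than the RFC 5321 grammar itself, and it ignores the RFC's general (non-IP) address literals entirely, so treat it as an approximation rather than a reference check.

```python
import ipaddress


def classify_address_literal(domain: str):
    """Return 'ipv4', 'ipv6', or None for a bracketed domain part.

    Hypothetical helper: approximates RFC 5321 address literals and
    skips the general-address-literal form.
    """
    if not (domain.startswith("[") and domain.endswith("]")):
        return None
    inner = domain[1:-1]

    # IPv6 literals must carry the "IPv6:" tag (compared case-insensitively).
    if inner[:5].lower() == "ipv6:":
        rest = inner[5:]
        # ipaddress accepts zone IDs like "%eth0" on newer Pythons; reject them here.
        if "%" in rest:
            return None
        try:
            ipaddress.IPv6Address(rest)
        except ValueError:
            return None
        return "ipv6"

    # Untagged content is only acceptable if it is a plain IPv4 address.
    try:
        ipaddress.IPv4Address(inner)
    except ValueError:
        return None
    return "ipv4"


for candidate in ("[::1]", "[ipv6:::1]", "[127.0.0.1]"):
    print(candidate, classify_address_literal(candidate))
```

Under these assumptions the untagged “[::1]” is rejected, while “[ipv6:::1]” and “[127.0.0.1]” are accepted.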
But, still. I was gratified to get a lot of the trickier 822/2822/5322 syntax right. I’m surprised he didn’t torment us all with group syntax (“group:one@example.com,two@example.net;”, or, simply, “undisclosed-recipients:;”) which is, incredibly, still used by Apple Mail when sending to distribution lists, though a lot of modern software just chokes on it in unhelpful ways. Or, there’s the good old RFC 821 (original SMTP spec) source routing: “@central.hub,@connected.gateway:final-recipient@leaf.site”, which was very handy for working around IP blocks even in the early 00s, but inevitably, it faded into obsolescence as the Internet and direct MX routing became the only sensible way to route mail.
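For anyone curious how group syntax looks to software that does understand it, here is a small Python sketch using the standard library's `email.headerregistry`. The helper name `parse_groups` is invented for this example, and it only shows how one parser reads the two group examples above; it says nothing about how any particular mail client or server behaves.

```python
from email.headerregistry import HeaderRegistry

# The registry turns (header name, raw value) pairs into parsed header objects.
make_header = HeaderRegistry()


def parse_groups(value: str):
    """Parse an address header value and list each group with its members.

    Hypothetical helper: returns (group name, [addr-spec, ...]) tuples.
    """
    header = make_header("To", value)
    return [
        (group.display_name, [addr.addr_spec for addr in group.addresses])
        for group in header.groups
    ]


print(parse_groups("group:one@example.com,two@example.net;"))
print(parse_groups("undisclosed-recipients:;"))
```

The second call is the interesting one: the group has a name but no members, which is exactly the case that tends to trip up software expecting every recipient to contain an “@”.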
Anyway, great fun, thanks for sharing. Email is a wee obsession of mine, as I’m sure you’ll have guessed by now.
Okay, this is going to be bad.
…
I got 13/n, which is about what I’d expect.
I once tried to Google up a regex to validate email addresses, and the one that I finally found that claimed to be fully compliant wrapped around to like six lines.
The HTML5 spec has this to say about its own email validation rule: This requirement is a willful violation of RFC 5322, which defines a syntax for email addresses that is simultaneously too strict (before the “@” character), too vague (after the “@” character), and too lax (allowing comments, whitespace characters, and quoted strings in manners unfamiliar to most users) to be of practical use here.
It seems like HTML5 overruled the specs, and it’s widespread enough that, practically speaking, it might be a better “real” definition of a valid email address.
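For reference, here is a Python sketch of the HTML5 definition. The pattern is transcribed from memory of the HTML Standard's “valid email address” definition, so check it against the spec before relying on it; the names `HTML5_EMAIL` and `is_html5_email` are made up for this example.

```python
import re

# Transcription (an assumption, verify against the HTML Standard) of the
# pattern behind <input type="email">. The anchors are dropped because
# re.fullmatch is used below.
HTML5_EMAIL = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+"                          # local part
    r"@"
    r"[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"             # first label
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"      # further labels
)


def is_html5_email(address: str) -> bool:
    """Return True if the address matches the HTML5-style pattern above."""
    return HTML5_EMAIL.fullmatch(address) is not None


for candidate in ("user@example.com", "magic@[::1]", '"quoted string"@example.com', "a@b"):
    print(candidate, is_html5_email(candidate))
```

Note how much of the quiz this rules out: no quoted strings, no comments, and no address literals, yet a dotless domain like “a@b” still passes.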
I love the fact that the HTML 5 spec defines a “willful violation.”
See, that’s the issue. What definition of an allowable email address is the server using? Clearly different than what the RFCs outline. If a tree falls in the…if an email address can’t actually be used, is it actually “valid”?
Enough with the back-and-forth. Yes, people have quibbles with some of the examples and answers, but the quiz is just meant to be a fun illustration of how wonky things can get while still theoretically adhering to a spec.
Ironically, since this is all about what’s valid and invalid, my ISP (Astound) is blocking the domain e-mail.wtf, or maybe the whole TLD, and I had to whitelist it.