From Issue #158
June 2007
The Internet Engineering Task Force (IETF) document, RFC 3696, “Application
Techniques for Checking and Transformation of Names” by John Klensin,
gives several valid e-mail addresses that are rejected by many PHP
validation routines. The addresses:
Abc\@def@example.com,
customer/department=shipping@example.com and
!def!xyz%abc@example.com
are all valid. One of the more popular regular expressions found in the
literature rejects all of them:
"^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$"
This regular expression allows only the underscore (_) and hyphen
(-) characters, numbers and lowercase alphabetic characters. Even
assuming a preprocessing step that converts uppercase alphabetic
characters to lowercase, the expression rejects addresses with
legitimate characters, such as the slash (/), equal sign (=), exclamation
point (!) and percent (%). The expression also requires that the
top-level domain component has only two or three characters, thus
rejecting legitimate domains, such as .museum.
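The rejections described above can be checked directly. The following is a minimal sketch in Python, used here only as a convenient way to run the pattern; it assumes the final group is meant to read [a-z]{2,3}, and the helper name accepts and the addresses curator@example.museum and user@example.com are illustrative, not taken from RFC 3696.

```python
import re

# First pattern from the text, assuming the last group reads [a-z]{2,3}.
PATTERN = re.compile(
    r"^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$"
)

def accepts(address):
    """Return True if the pattern matches the lowercased address."""
    # Lowercasing stands in for the preprocessing step the text assumes.
    return PATTERN.match(address.lower()) is not None

# The three valid addresses quoted from RFC 3696.
RFC_EXAMPLES = [
    r"Abc\@def@example.com",
    "customer/department=shipping@example.com",
    "!def!xyz%abc@example.com",
]

# Two illustrative extras: a long top-level domain and a plain address.
for address in RFC_EXAMPLES + ["curator@example.museum", "user@example.com"]:
    print(address, "accepted" if accepts(address) else "rejected")
```

Only the plain address passes. The three RFC examples fail on the backslash, slash and exclamation point, and the .museum address fails on the two-or-three-letter limit.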
Another favorite regular expression solution is the following:
"^[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+.[a-zA-Z0-9-.]+$"
This regular expression rejects all the valid examples in the preceding paragraph.
It does have the grace to allow uppercase alphabetic characters, and
it doesn't make the error of assuming a top-level domain name has only
two or three characters. However, it allows invalid domain names, such as
example..com.
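The same kind of check works for this second expression. The sketch below, again in Python for illustration only, copies the pattern as printed; the helper name accepts_2 and the addresses outside the RFC 3696 examples are assumptions made for the demonstration.

```python
import re

# Second pattern, copied as printed in the text.
PATTERN_2 = re.compile(r"^[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+.[a-zA-Z0-9-.]+$")

def accepts_2(address):
    """Return True if the second pattern matches the address."""
    return PATTERN_2.match(address) is not None

# The three valid addresses quoted from RFC 3696.
RFC_EXAMPLES_2 = [
    r"Abc\@def@example.com",
    "customer/department=shipping@example.com",
    "!def!xyz%abc@example.com",
]

# Illustrative extras: mixed case, a long top-level domain,
# and a domain with an empty label between two dots.
EXTRAS = ["User@Example.COM", "curator@example.museum", "user@example..com"]

for address in RFC_EXAMPLES_2 + EXTRAS:
    print(address, "accepted" if accepts_2(address) else "rejected")
```

The three RFC examples are still rejected, while all three extras are accepted, including the invalid user@example..com.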
Listing 1 shows an example from PHP Dev Shed (www.devshed.com/c/a/PHP/Email-Address-Verification-with-PHP/2).
The code contains (at least) three errors. First, it fails to recognize
many valid e-mail address characters, such as percent (%). Second, it
splits the e-mail address into user name and domain parts at the at sign
(@). E-mail addresses that contain a quoted at sign, such as
Abc\@def@example.com, will break this code. Third, it fails to check
for host address DNS records. Hosts with a type A DNS entry will accept
e-mail and may not necessarily publish a type MX entry. I'm not
picking on the author at PHP Dev Shed. More than 100 reviewers gave
this a four-out-of-five-star rating.
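The second error, splitting at the at sign, is easy to reproduce. The sketch below is not the Dev Shed code; it is a small Python illustration of one way to avoid the problem, relying on the fact that a domain name cannot contain an at sign, so the last one in the address must be the separator. The function name split_address is hypothetical.

```python
def split_address(address):
    """Split an address at its last at sign; return (local_part, domain)."""
    local_part, at_sign, domain = address.rpartition("@")
    if not at_sign:
        raise ValueError("address contains no at sign")
    return local_part, domain

address = r"Abc\@def@example.com"

# A naive split on every at sign yields three pieces instead of two.
naive_parts = address.split("@")
print(naive_parts)

# Splitting at the last at sign keeps the quoted one in the local part.
print(split_address(address))
```

Splitting this way only separates the two parts; each part still has to be validated on its own.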