From Issue #158,
June 2007
The Internet Engineering Task Force (IETF) document,
RFC 3696, “Application
Techniques for Checking and Transformation of Names” by John
Klensin,
gives several valid e-mail addresses that are rejected by many PHP
validation routines. The addresses:
Abc\@def@example.com,
customer/department=shipping@example.com and
!def!xyz%abc@example.com
are all valid. One of the more popular regular expressions found in the
literature rejects all of them:
"^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$"
This regular expression allows only the underscore (_) and hyphen
(-) characters, numbers and lowercase alphabetic characters. Even
assuming a preprocessing step that converts uppercase alphabetic
characters to lowercase, the expression rejects addresses with
valid characters, such as the slash (/), equal sign (=), exclamation
point (!) and percent (%). The expression also requires that the
top-level domain component has only two or three characters, thus
rejecting valid domains, such as .museum.
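The rejections described above can be reproduced with a short test. This is a minimal sketch, not the code the pattern originally shipped with: it assumes the expression is run through preg_match() with "/" delimiters, and the helper name matchesFirstPattern and the sample address curator@example.museum are made up for illustration.

```php
<?php
// Hypothetical helper: applies the first pattern from the text.
// Assumption: preg_match() with "/" delimiters stands in for whatever
// regex function the pattern was originally written for.
function matchesFirstPattern(string $address): bool
{
    $pattern = '/^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})$/';

    // Preprocessing step mentioned in the text: fold uppercase to lowercase.
    $address = strtolower($address);

    return preg_match($pattern, $address) === 1;
}

$samples = [
    'Abc\@def@example.com',                     // quoted at sign in the local part
    'customer/department=shipping@example.com', // slash and equal sign
    '!def!xyz%abc@example.com',                 // exclamation point and percent
    'curator@example.museum',                   // six-letter top-level domain
    'user@example.com',                         // plain address
];

foreach ($samples as $address) {
    echo $address, ' => ', matchesFirstPattern($address) ? 'accepted' : 'rejected', "\n";
}
```

Only the last sample is accepted. The first three fail on characters outside the allowed set, and the fourth fails because the final group permits at most three letters.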
Another favorite regular expression solution is the following:
"^[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+.[a-zA-Z0-9-.]+$"
This regular expression rejects all the valid examples in the preceding paragraph.
It does have the grace to allow uppercase alphabetic characters, and
it doesn't make the error of assuming a top-level domain name has only
two or three characters. However, it allows invalid domain names, such as
example..com.
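The same kind of check illustrates both sides of this second expression. Again, this is only a sketch under assumptions: the pattern is copied as printed above and run through preg_match(), and the helper name matchesSecondPattern and the sample addresses are hypothetical.

```php
<?php
// Hypothetical helper: applies the second pattern from the text, as printed.
// Assumption: preg_match() with "/" delimiters is used to evaluate it.
function matchesSecondPattern(string $address): bool
{
    $pattern = '/^[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+.[a-zA-Z0-9-.]+$/';

    return preg_match($pattern, $address) === 1;
}

$samples = [
    'User@Example.COM',                         // uppercase is allowed
    'curator@example.museum',                   // long top-level domain is allowed
    'user@example..com',                        // invalid domain, yet allowed
    'Abc\@def@example.com',                     // valid, yet rejected
    'customer/department=shipping@example.com', // valid, yet rejected
    '!def!xyz%abc@example.com',                 // valid, yet rejected
];

foreach ($samples as $address) {
    echo $address, ' => ', matchesSecondPattern($address) ? 'accepted' : 'rejected', "\n";
}
```

The doubled dot slips through because the last character class includes the dot itself, so ".com" satisfies it even when it follows another dot.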
Listing 1 shows an example from PHP Dev Shed (www.devshed.com/c/a/PHP/Email-Address-Verification-with-PHP/2).
The code contains (at least) three errors. First, it fails to recognize
several legitimate e-mail address characters, such as percent (%). Second, it
splits the e-mail address into user name and domain parts at the at sign
(@). E-mail addresses that contain a quoted at sign, such as
Abc\@def@example.com, will break this code. Third, it fails to check
for host address DNS records. Hosts with a type A DNS entry will accept
e-mail and may not necessarily publish a type MX entry. I'm not
picking on the author at PHP Dev Shed. More than 100 reviewers gave
this a four-out-of-five-star rating.
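The second and third errors suggest two small building blocks, sketched here under stated assumptions rather than as a complete validator. The function names splitAtLastAt and domainAcceptsMail are hypothetical. The sketch assumes that the final at sign in the address separates the local part from the domain, and it uses PHP's built-in checkdnsrr() to look for either an MX or an A record.

```php
<?php
// Hypothetical helper: split on the LAST at sign, so that a quoted at sign
// inside the local part (as in Abc\@def@example.com) stays in the local part.
function splitAtLastAt(string $address): ?array
{
    $atIndex = strrpos($address, '@');
    if ($atIndex === false) {
        return null; // no at sign at all
    }

    return [
        'local'  => substr($address, 0, $atIndex),
        'domain' => substr($address, $atIndex + 1),
    ];
}

// Hypothetical helper: treat a domain as able to receive mail if it
// publishes an MX record or, failing that, an A record.
// Note: this performs live DNS lookups, so the result depends on the network.
function domainAcceptsMail(string $domain): bool
{
    return checkdnsrr($domain, 'MX') || checkdnsrr($domain, 'A');
}

$parts = splitAtLastAt('Abc\@def@example.com');
echo $parts['local'], "\n";  // the local part, including the quoted at sign
echo $parts['domain'], "\n"; // the domain after the final at sign
```

Splitting on the first at sign instead would cut this address into "Abc\" and "def@example.com", which is how a quoted at sign breaks code of that kind.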