During my search for a good regex to match and validate email addresses, I went through a lot of failures. Every possible Murphy's law I could run into, I did! The main reasons why my regexes failed?
- IE7 crashed.
- It wasn’t valid for all email addresses, it wasn’t valid for all email addresses, it wasn’t valid for all email addresses, ...
For those who are not sure what a regex or regular expression is: a regex is a pattern for text. You can use it to search through large texts, or use the pattern to validate a string. If you are not completely familiar with them, there are plenty of great articles out there, so have a read!
To continue the story of my email validation quest: I ended up believing that the validation should be as fast and simple as possible. We do not need to check whether the extension (.com, .be) actually exists. What is important is that there is an @, and then a ., with just a bunch of ‘any’ characters in between.
I ended up using this pattern (JavaScript):
var pattern = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}$/;
In short, this regex says the following:
- ^[a-zA-Z0-9._%+-]+: Start with one or more of these characters.
- @: Have an @.
- [a-zA-Z0-9.-]+: Continue with one or more of these characters.
- \.: Have a literal dot.
- [a-zA-Z]{2,4}$: End with letters only, minimum 2, maximum 4.
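To see the pattern in action, here is a minimal sketch that wraps it in a small helper. The function name isValidEmail and the sample addresses are my own, made up for illustration:

```javascript
// Same pattern as above. isValidEmail is a hypothetical helper name.
var pattern = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}$/;

function isValidEmail(input) {
  // test() only returns true when the whole string matches,
  // because the pattern is anchored with ^ and $.
  return pattern.test(input);
}

console.log(isValidEmail("john.doe@example.com")); // true
console.log(isValidEmail("john.doe@example"));     // false: no dot and extension
console.log(isValidEmail("john doe@example.com")); // false: a space is not in the allowed set
console.log(isValidEmail("john@example.museum"));  // false: extension is longer than 4 letters
```

Note the last case: the {2,4} limit means longer extensions are rejected, so raise the maximum if you need to accept them.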
As you can see, ranges are written as a-z or A-Z, which mean all letters of the alphabet, in lowercase or uppercase respectively.
Also, I would advise against using wildcards (such as .*)! They can cause older browsers to go nuts when the text is too long!
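Apart from speed, a wildcard is also much less picky about what it lets through. The sketch below compares the pattern from this post with a wildcard version; the loose pattern and the sample input are hypothetical, written only for this comparison:

```javascript
// The pattern from this post, built from explicit character sets.
var strict = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,4}$/;

// A hypothetical wildcard version: "anything, an @, anything, a dot, anything".
var loose = /^.+@.+\..+$/;

var input = "not an email@ at all.!!";

console.log(loose.test(input));  // true: the dot wildcard also matches spaces and punctuation
console.log(strict.test(input)); // false: spaces and "!" are not in the allowed sets
```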