Email Hostname Check
I just happened to read a new thread on comp.lang.python that asks about checking TLDs. It is an interesting question, especially in light of this news: "AFP: New internet domain names in 2009: ICANN". That probably means many new TLDs will show up in 2009, so you can no longer just check for .com, .net, ..., or the ccTLDs.
So I think a better way is to query DNS. Here is my code:
# Email address validator
#
# This module is in Public Domain
#
# This module was written for replying to
# http://groups.google.com/group/comp.lang.python/browse_thread/thread/80d7d31bebc09190
# * It requires dnspython (http://www.dnspython.org/).
# * It is a simple prototype.
# * It would be slow when checking a large number of email addresses;
#   a cache mechanism would be very helpful.
# * It only checks the hostname of the email address.
#
# Author : Yu-Jie Lin
# Creation Date: 2008-07-02T20:09:07+0800


import dns.resolver


def CheckEmail(email):
    """Extract the hostname from the email address and query its MX records."""
    email_parts = email.split('@')
    if len(email_parts) != 2:
        return False

    # Start querying
    # (dnspython 2.0+ deprecates query() in favor of dns.resolver.resolve())
    try:
        answers = dns.resolver.query(email_parts[1], 'MX')
    except dns.resolver.NoAnswer:
        # This host doesn't have MX records
        return False
    except dns.resolver.NXDOMAIN:
        # No such hostname
        return False

    # Possibly a valid hostname
    return True
This is just a quick snippet. It doesn't handle many things, but it's in the Public Domain, so you can do whatever you need with it. You also need dnspython. There seems to be another option for DNS queries, PyDNS; I guess you could modify the code to use it instead.
Admittedly, it's not efficient. You have to make a query for each email address. I think it can be improved by adding a cache, or by bypassing .com, .net, etc., if it's OK not to check those hosts.
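Here is a rough sketch of what the cache and bypass ideas might look like. It is not part of the original module: make_cached_checker, lookup and skip_suffixes are names made up for illustration, and the real DNS check is passed in as a function so the caching part does not depend on any particular DNS library.

```python
from functools import lru_cache


def make_cached_checker(lookup, skip_suffixes=(), maxsize=1024):
    """Return a check_email(email) function that caches results per hostname.

    lookup(hostname) -> bool does the real (slow) check, e.g. an MX query.
    Hostnames ending in one of skip_suffixes are accepted without a lookup.
    Both parameter names are hypothetical, chosen for this sketch.
    """
    skip_suffixes = tuple(s.lower() for s in skip_suffixes)

    @lru_cache(maxsize=maxsize)
    def host_ok(hostname):
        # Only reached on a cache miss, so each hostname is looked up once
        return lookup(hostname)

    def check_email(email):
        parts = email.split('@')
        if len(parts) != 2 or not parts[1]:
            return False
        # Hostnames are case-insensitive; normalize so the cache hits more often
        hostname = parts[1].lower()
        if skip_suffixes and hostname.endswith(skip_suffixes):
            return True
        return host_ok(hostname)

    return check_email
```

To use it with the snippet above, you would pass in a small function that takes a hostname, runs the MX query, and returns True or False. Note that negative results are cached too, so a host that was only temporarily unreachable stays rejected until the cache entry is evicted.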

