Posts Tagged ‘regex’
extract e-mails from a file
October 10, 2017
Problem
You have a text file and you want to extract all the e-mail addresses from it. For research purposes, of course.
Solution
#!/usr/bin/env python3

import re
import sys


def extract_emails_from(fname):
    # errors='replace' keeps undecodable bytes from raising an exception
    with open(fname, errors='replace') as f:
        for line in f:
            # permissive pattern: word characters, dots and hyphens around an "@"
            match = re.findall(r'[\w\.-]+@[\w\.-]+', line)
            for e in match:
                if '?' not in e:
                    print(e)


def main():
    fname = sys.argv[1]
    extract_emails_from(fname)

##############################################################################

if __name__ == "__main__":
    if len(sys.argv) == 1:
        print("Error: provide a text file!", file=sys.stderr)
        sys.exit(1)
    # else
    main()
I had character encoding problems with some lines where the original program died with an exception. Using “open(fname, errors='replace')” will replace problematic characters with a “?“, hence the extra check before printing an e-mail to the screen.
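To see what errors='replace' actually does while decoding, here is a minimal sketch. It assumes UTF-8 as the encoding (the script above relies on the platform default), and the sample bytes are made up for illustration. It uses bytes.decode(), which applies the same 'replace' error handler that open() uses when reading text.

```python
# Made-up sample: 0xff is a byte that can never occur in valid UTF-8.
raw = b"alice@example.com \xff bob@example.org\n"

# Strict decoding (the default) raises on the bad byte.
try:
    raw.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With errors="replace" the bad byte becomes U+FFFD, the replacement character.
decoded = raw.decode("utf-8", errors="replace")

print(strict_failed)
print(repr(decoded))
```

Note that the placeholder is U+FFFD and not an ASCII question mark, so a test like '?' in text would not detect it.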
The core of the script is the regex to find e-mails. That tip is from here.
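The pattern is quite permissive, which is easy to check on a small string. The following is only a sketch: the sample text and the cleanup step (stripping trailing dots and hyphens) are my own illustration and not part of the original script.

```python
import re

# Same pattern as in the script above.
EMAIL_RE = re.compile(r'[\w\.-]+@[\w\.-]+')

# Made-up sample text for illustration.
sample = "Contact alice@example.com, or bob.smith@mail.example.org. Not an address: a@b"

found = EMAIL_RE.findall(sample)

# Hypothetical cleanup: drop trailing dots/hyphens picked up from sentence punctuation.
cleaned = [e.rstrip(".-") for e in found]

print(found)
print(cleaned)
```

The second match keeps the full stop that ends the sentence, and "a@b" is accepted even though it has no domain suffix. For a quick harvest that is usually fine, but the results may need some filtering afterwards.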
Python Regular Expression Testing Tool
April 11, 2012
See http://www.pythonregex.com/. Cool stuff, it also generates source code. Happiness!
Categories: python
Tags: regex, regexp, regexp online, regular expressions
