Python Script: Credit Card / SSN Hunter

    A common practice I see from time to time that makes me cringe -- documents titled "passwords" that contain passwords.  Those are fairly simple to hunt down, though.  Files containing sensitive data such as social security numbers and credit card numbers are harder to find because the filenames are not so obvious and the numbers can be formatted in several ways.  I originally intended to write two different scripts but ended up combining them.

    This test script searches recursively for .txt files, hunts for both social security numbers and credit card numbers in dashed and non-dashed variations, and prints each matching line along with the corresponding filename and path.

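    #!/usr/bin/python3
    import re
    import sys
    import glob

    folder_path = './'
    # 13 to 16 digits, optionally separated by spaces or dashes
    card_re = re.compile(r'\b(?:\d[ -]*?){13,16}\b')
    # A line holding only an SSN (dashed or not) or the XXX-XX-XXXX placeholder
    ssn_re = re.compile(r'^\d{3}-?\d{2}-?\d{4}$|^XXX-XX-XXXX$')

    for filename in glob.iglob(folder_path + '**/*.txt', recursive=True):
        # "with" closes each file once it has been scanned
        with open(filename, 'r', errors='ignore') as file:
            for line in file:
                # search() finds a card number anywhere in the line;
                # match() would only find one at the very start
                if card_re.search(line) or ssn_re.match(line):
                    sys.stdout.write(filename + ':' + line)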
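
    The script writes out the whole matching line. If only the number itself is wanted, the same two patterns can be used to pull it out. What follows is a minimal sketch rather than part of the original script: find_numbers is a hypothetical helper name, and the sample values in the usage note are made up for illustration.

```python
import re

# Same two patterns as the script above, compiled once.
CARD_RE = re.compile(r'\b(?:\d[ -]*?){13,16}\b')
SSN_RE = re.compile(r'^\d{3}-?\d{2}-?\d{4}$|^XXX-XX-XXXX$')


def find_numbers(line):
    """Return the candidate numbers found in one line of text.

    Hypothetical helper for illustration only.
    """
    stripped = line.strip()
    # The SSN pattern is anchored, so it only fires when the SSN
    # is the only thing on the line.
    if SSN_RE.match(stripped):
        return [stripped]
    # Card-like runs of 13 to 16 digits can sit anywhere in the line.
    return [m.group(0) for m in CARD_RE.finditer(line)]
```

    Under these assumptions, find_numbers("card: 4111-1111-1111-1111 exp 12/25") returns only the dashed card number, and find_numbers("123456789") returns the undashed SSN. A line such as "ssn 123-45-6789" returns nothing, because the SSN pattern is anchored to the start and end of the line; loosening those anchors would catch more SSNs but also more nine-digit noise.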


    © 2020 sevenlayers.com