Reformatting mailing lists based on an Excel file

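I want to modify a bunch of mailing lists. Each mailing list contains a list of email addresses (one per line), which I call the "old" addresses. For a given email address, the correspondence between the old address and the new one is recorded in an .xlsx file. If an old address is not referenced there, it is obsolete and must be removed. Sometimes an email address in a mailing list is already the good one. In that case, it must be kept unchanged.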
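
The three cases above (keep, replace, remove) can be sketched as a single function. This is only an illustration of the rule, not the code from this post: the name `resolve_address`, the `old_to_new` dictionary, and the example addresses are hypothetical.

```python
from typing import Dict, Optional


def resolve_address(address: str, old_to_new: Dict[str, str],
                    good_domain: str) -> Optional[str]:
    """Return the address to write to the new list, or None to drop it.

    Hypothetical helper: `old_to_new` stands in for the old -> new
    correspondence read from the .xlsx file.
    """
    # Case 1: the address is already on the good domain, keep it unchanged.
    _, sep, domain = address.partition("@")
    if sep and domain == good_domain:
        return address
    # Case 2a: the old address is referenced, so return its replacement.
    # Case 2b: it is not referenced (obsolete), so .get() returns None.
    return old_to_new.get(address)


# Made-up data, for illustration only.
mapping = {"alice@old.example": "alice@gooddomain.fr"}
for candidate in ("bob@gooddomain.fr", "alice@old.example", "carol@old.example"):
    print(candidate, "->", resolve_address(candidate, mapping, "gooddomain.fr"))
```

Returning `None` for an obsolete address leaves it to the caller to skip the line.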

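I did it in Python. I don't really have a problem, but I realized that the task was not that obvious, so I wanted to share my work. First, because it resembles some posts I have already seen, so it may be helpful. Second, because my code is absolutely not optimized (I did not need to optimize it, since it runs in about 0.5 s in my case), and I would be curious to see how you would optimize it for 10^8 mailing lists.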

Here is the Python code I ended up with:

import os

import xlrd  # note: xlrd >= 2.0 reads only .xls files; .xlsx needs xlrd < 2.0

path_old = 'toto'
path_new = 'tata'
mailing_lists = os.listdir(path_old)
good_domain = 'gooddomain.fr'
printing_level = 3

# reading of the excel file
xlsfilename = 'adresses.xlsx'
xlsfile = xlrd.open_workbook(xlsfilename)
sheet = xlsfile.sheet_by_index(0)  # first sheet, looked up once
number_of_persons = 250
number_column_old_mail = 7
number_column_new_mail = 5
newmail = []
oldmail = []
for count in range(number_of_persons):
    oldmail.append(sheet.cell(count, number_column_old_mail).value)
    newmail.append(sheet.cell(count, number_column_new_mail).value)

############

for mailinglist_name in mailing_lists:
    if printing_level > 0:
        print('* dealing with mailing list ', mailinglist_name)
    new_mailinglist = []
    new_name = mailinglist_name + '_new'
    with open(path_old + '/' + mailinglist_name, 'r') as inputfile:
        for line in inputfile:
            if len(line) < 2:
                # to ignore blank lines. This length of 2 is completely arbitrary
                continue
            line = line.rstrip('\n')
            ok = False
            # case 1: the address inside the old mailing list is ok
            # ==> copied in the new mailing list
            if '@' in line:
                if line[line.index('@') + 1:] == good_domain:
                    new_mailinglist.append(line)
                    if printing_level > 1:
                        print(' --> address ', line, ' already ok ==> kept unmodified')
                    ok = True
            # case 2: the address inside the old mailing list is not ok
            # ==> must be treated
            if not ok:
                if printing_level > 1:
                    print(' --> old address ', line, ' must be treated')
                try:
                    # case 2a: the old address is in the excel file ==> replaced
                    ind = oldmail.index(line)
                    if printing_level > 2:
                        print(' --> old address found in the excel file and replaced by ', newmail[ind])
                    new_mailinglist.append(newmail[ind])
                except ValueError:
                    # case 2b: the old address is obsolete ==> removed
                    if printing_level > 2:
                        print(' --> old address removed')
    with open(path_new + '/' + new_name, 'w') as outputfile:
        for address in new_mailinglist:
            outputfile.write(address + '\n')
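
On the optimization question, one possible starting point (a sketch, not a measured result) is to replace the linear scan of `oldmail.index(line)` with a dictionary lookup, and to yield addresses one at a time rather than building each list in memory. The helper names `build_lookup` and `convert_lines` and the sample data are hypothetical. The sketch also uses `strip()` to skip blank lines, which differs slightly from the `len(line) < 2` test above.

```python
def build_lookup(oldmail, newmail):
    """Map each old address to its new one (average O(1) lookups)."""
    lookup = {}
    for old, new in zip(oldmail, newmail):
        # Keep the first occurrence, as oldmail.index() would.
        lookup.setdefault(old, new)
    return lookup


def convert_lines(lines, lookup, good_domain):
    """Yield the addresses of the new mailing list, one at a time."""
    for line in lines:
        address = line.strip()
        if not address:
            continue  # blank line
        _, sep, domain = address.partition("@")
        if sep and domain == good_domain:
            yield address  # already ok, kept unmodified
        elif address in lookup:
            yield lookup[address]  # old address replaced
        # otherwise the address is obsolete and is dropped


# Made-up data, for illustration only.
oldmail = ["alice@old.example", "bob@old.example"]
newmail = ["alice@gooddomain.fr", "bob@gooddomain.fr"]
lookup = build_lookup(oldmail, newmail)
lines = ["alice@old.example\n", "\n", "carol@gooddomain.fr\n", "dave@gone.example\n"]
result = list(convert_lines(lines, lookup, "gooddomain.fr"))
print(result)
```

Because `convert_lines` accepts any iterable of lines, an open file object can be passed in directly and the output written as addresses are produced. With 10^8 files, disk I/O would probably dominate, so spreading the files across several processes would be the next thing to try.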