Making all possible pairs in Python from a CSV/XLSX file and using the Google API

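I have to write a script in Python that does the following. I have an xlsx/csv file with a single column listing 300 cities.

  1. I have to make all pairs among them and, with the help of the Google API, add their distance and travel time in the columns next to each pair.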

My CSV file looks like this:

    =======
    SOURCE
    =======
    Agra
    Delhi
    Jaipur

The expected output in the csv/xlsx file looks like this:

    =============================================
    SOURCE | DESTINATION | DISTANCE | TIME_TRAVEL
    =============================================
    Agra   | Delhi       | 247      | 4
    Agra   | Jaipur      | 238      | 4
    Delhi  | Agra        | 247      | 4
    Delhi  | Jaipur      | 281      | 5
    Jaipur | Agra        | 238      | 4
    Jaipur | Delhi       | 281      | 5

And so on. How can I do this?
Note: the distance and travel time come from Google.

To build the pairs, you can use itertools.permutations, which yields every ordered pair of distinct cities. The code for that looks like this:

    import csv
    import itertools
    import sys

    # sys.argv[1] is the input CSV of cities, sys.argv[2] is the output CSV of pairs
    with open(sys.argv[1], newline='') as f:
        # Take the first column of every non-empty row as a city name
        source_list = [row[0] for row in csv.reader(f) if row]

    # permutations(..., 2) yields every ordered pair of distinct cities
    pairs = itertools.permutations(source_list, 2)

    with open(sys.argv[2], 'w', newline='') as g:
        mywriter = csv.writer(g)
        mywriter.writerows(pairs)
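Before calling any API, it can help to know how many rows to expect. The sketch below only counts ordered pairs; `count_ordered_pairs` is a helper name introduced here for illustration and is not part of the answer above.

```python
from itertools import permutations


def count_ordered_pairs(n):
    """Number of ordered pairs of distinct items drawn from n items."""
    return n * (n - 1)


cities = ["Agra", "Delhi", "Jaipur"]
pairs = list(permutations(cities, 2))

# The formula agrees with what permutations() actually produces
print(len(pairs), count_ordered_pairs(len(cities)))

# For the 300 cities in the question
print(count_ordered_pairs(300))
```

Each ordered pair needs its own distance lookup, so the count is also a rough guide to how many lookups the full run involves.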

Beyond that, to get the distance and time from Google, the following sample code should work, though it may need some debugging.

    import csv
    import json
    import sys
    import urllib.parse
    import urllib.request

    api_google_key = ''  # put your Google API key here
    api_google_url = 'https://maps.googleapis.com/maps/api/distancematrix/json'

    # Input: CSV of (source, destination) pairs; output: the same pairs plus distance and duration
    with open(sys.argv[1], newline='') as f, open(sys.argv[2], 'w', newline='') as g:
        mywriter = csv.writer(g)
        for row in csv.reader(f):
            if len(row) < 2:
                continue  # skip empty or incomplete rows
            source, destination = row[0].strip(), row[1].strip()
            # urlencode escapes spaces and other special characters in city names
            query = urllib.parse.urlencode({'origins': source,
                                            'destinations': destination,
                                            'key': api_google_key})
            request = api_google_url + '?' + query
            with urllib.request.urlopen(request) as response:
                dist = json.load(response)
            distance, duration = 0, 0  # fallback when no route is returned
            if dist.get('rows'):
                element = dist['rows'][0]['elements'][0]
                if 'duration' in element:
                    duration = element['duration']['text']
                    distance = element['distance']['text']
            mywriter.writerow([source, destination, distance, duration])
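The response handling above can be pulled into a small function and checked without network access. This is a minimal sketch: `extract_distance_duration` is a name introduced here, and `sample` is a hand-written dictionary shaped like the fields the answer reads (`rows`, `elements`, `distance`, `duration`), not real API output.

```python
def extract_distance_duration(response):
    """Return (distance_text, duration_text) from a parsed response,
    or (None, None) when no route information is present."""
    rows = response.get("rows") or []
    if not rows:
        return None, None
    elements = rows[0].get("elements") or []
    if not elements or "duration" not in elements[0]:
        return None, None
    element = elements[0]
    return element["distance"]["text"], element["duration"]["text"]


# Hand-written sample mirroring the fields accessed in the answer (assumed shape)
sample = {
    "rows": [
        {"elements": [{"distance": {"text": "247 km"},
                       "duration": {"text": "4 hours"}}]}
    ]
}

print(extract_distance_duration(sample))
print(extract_distance_duration({"rows": []}))
```

Returning `None` rather than `0` for a missing route keeps a failed lookup distinguishable from a genuine zero distance; whether that suits the output file is a design choice.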

You can do this with itertools.product, but that means you will also get self-pairs such as (Agra, Agra), whose actual distance is 0.

    import itertools

    cities = ["Agra", "Delhi", "Jaipur"]
    cities2 = cities
    p = itertools.product(cities, cities2)
    print(list(p))

In this case, you get

 [('Agra', 'Agra'), ('Agra', 'Delhi'), ('Agra', 'Jaipur'), ('Delhi', 'Agra'), ('Delhi', 'Delhi'), ('Delhi', 'Jaipur'), ('Jaipur', 'Agra'), ('Jaipur', 'Delhi'), ('Jaipur', 'Jaipur')] 

You can loop over these pairs with a for loop and ask Google for the travel time and distance of each one.

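    >>> # build the product again: an iterator that has already been consumed yields nothing
    >>> for pair in itertools.product(cities, cities2):
    ...     print(pair)
    ...
    ('Agra', 'Agra')
    ('Agra', 'Delhi')
    ('Agra', 'Jaipur')
    ('Delhi', 'Agra')
    ('Delhi', 'Delhi')
    ('Delhi', 'Jaipur')
    ('Jaipur', 'Agra')
    ('Jaipur', 'Delhi')
    ('Jaipur', 'Jaipur')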
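If the self-pairs are unwanted, they can be filtered out before any request is made. A minimal sketch, assuming city names are unique in the list:

```python
import itertools

cities = ["Agra", "Delhi", "Jaipur"]

# Keep only pairs whose two cities differ, dropping (Agra, Agra) and the like
pairs = [(a, b) for a, b in itertools.product(cities, repeat=2) if a != b]

for pair in pairs:
    print(pair)
```

For a list without duplicate names, this gives the same pairs in the same order as `itertools.permutations(cities, 2)`.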

You can get all the pairs with itertools.permutations(), like this:

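    from itertools import permutations

    # cities_file and newfile are the input and output paths;
    # googleapi.get is a placeholder for your own Google API call
    with open(cities_file, 'r') as f, open(newfile, 'w') as f2:
        for pair in permutations([a.strip() for a in f.read().splitlines()], 2):
            print(pair)
            response = googleapi.get(pair)
            f2.write(response + '\n')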

Printing each pair outputs:

    ('Agra', 'Delhi')
    ('Agra', 'Jaipur')
    ('Delhi', 'Agra')
    ('Delhi', 'Jaipur')
    ('Jaipur', 'Agra')
    ('Jaipur', 'Delhi')

Then you can call the API for each element of the list, one at a time, and save the results to a file.
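Putting the pieces together, the whole flow might be sketched as follows. The names `write_pairs`, `lookup`, and `fake_lookup` are introduced here for illustration. `fake_lookup` returns the figures from the question's sample table and stands in for a real Google API call, so the sketch runs offline.

```python
import csv
import io
from itertools import permutations


def write_pairs(cities, lookup, out):
    """Write one CSV row per ordered city pair to the file-like object `out`.

    `lookup(source, destination)` is a hypothetical callable that returns
    a (distance, travel_time) tuple for the pair.
    """
    writer = csv.writer(out)
    writer.writerow(["SOURCE", "DESTINATION", "DISTANCE", "TIME_TRAVEL"])
    for source, destination in permutations(cities, 2):
        distance, travel_time = lookup(source, destination)
        writer.writerow([source, destination, distance, travel_time])


# Stand-in data taken from the question's expected output (not fetched from Google)
SAMPLE = {
    frozenset(["Agra", "Delhi"]): (247, 4),
    frozenset(["Agra", "Jaipur"]): (238, 4),
    frozenset(["Delhi", "Jaipur"]): (281, 5),
}


def fake_lookup(source, destination):
    """Placeholder for a real API call; looks the pair up in SAMPLE."""
    return SAMPLE[frozenset([source, destination])]


buffer = io.StringIO()
write_pairs(["Agra", "Delhi", "Jaipur"], fake_lookup, buffer)
print(buffer.getvalue())
```

Passing the lookup in as a parameter keeps the pairing and file-writing logic testable without a network; for a real run, `out` would be a file opened with `newline=''` and `lookup` a function that queries Google.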