Making all possible combinations in Python and using the Google API with a CSV/XLSX file
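I have to write a Python script that does the following. I have an xlsx/csv file with 300 cities listed in a single column.
- I have to make all the pairs between them and, with the help of the Google API, add the distance and travel time for each pair in the columns next to it.
My CSV file looks like this: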
    =======
    SOURCE
    =======
    Agra
    Delhi
    Jaipur
and the expected output in the csv/xlsx file looks like this:
    =============================================
    SOURCE | DESTINATION | DISTANCE | TIME_TRAVEL
    =============================================
    Agra   | Delhi       | 247      | 4
    Agra   | Jaipur      | 238      | 4
    Delhi  | Agra        | 247      | 4
    Delhi  | Jaipur      | 281      | 5
    Jaipur | Agra        | 238      | 4
    Jaipur | Delhi       | 281      | 5
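and so on. How can I do this?
Note: the distance and travel time come from Google.
To make the pairs, you can use itertools.permutations, which gives every ordered pair of two different cities. The code for that would look like this: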
    import csv
    import sys
    import itertools

    source_list = []
    destination_list = []

    # sys.argv[1] is the input CSV, sys.argv[2] is the output CSV
    with open(sys.argv[1], newline='') as f, open(sys.argv[2], 'w', newline='') as g:
        # Read the city names from the first column, skipping blank rows
        cities = [row[0] for row in csv.reader(f) if row]

        # Every ordered pair of two different cities
        for source, destination in itertools.permutations(cities, 2):
            source_list.append(source)
            destination_list.append(destination)

        mywriter = csv.writer(g)
        mywriter.writerows(zip(source_list, destination_list))
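As a rough sanity check on the size of the job: n cities give n * (n - 1) ordered pairs, so the 300 cities in the question produce 89,700 rows, and as many lookups if each pair is requested separately. A minimal sketch, where `count_ordered_pairs` is just an illustrative helper name:

```python
import itertools


def count_ordered_pairs(cities):
    """Return how many (source, destination) rows permutations(cities, 2) yields."""
    n = len(cities)
    return n * (n - 1)


sample = ["Agra", "Delhi", "Jaipur"]

# The formula agrees with what itertools.permutations actually produces
assert count_ordered_pairs(sample) == len(list(itertools.permutations(sample, 2)))

print(count_ordered_pairs(sample))            # 6
print(count_ordered_pairs(list(range(300))))  # 89700
```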
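Beyond that, to get the distance and time from Google, the following sample code may work after some debugging. It reads the file of pairs produced above.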
    import csv
    import json
    import sys
    import urllib.parse
    import urllib.request

    api_google_key = ''  # put your Google API key here
    api_google_url = 'https://maps.googleapis.com/maps/api/distancematrix/json'

    # sys.argv[1] is the CSV of (source, destination) pairs, sys.argv[2] is the output CSV
    with open(sys.argv[1], newline='') as f, open(sys.argv[2], 'w', newline='') as g:
        mywriter = csv.writer(g)
        for row in csv.reader(f):
            if len(row) < 2:
                continue  # skip blank or incomplete rows
            source, destination = row[0].strip(), row[1].strip()

            # URL-encode the parameters so that names containing spaces are safe
            params = urllib.parse.urlencode({
                'origins': source,
                'destinations': destination,
                'key': api_google_key,
            })
            with urllib.request.urlopen(api_google_url + '?' + params) as response:
                dist = json.load(response)

            # Fall back to 0 when Google returns no route for the pair
            distance = duration = 0
            if dist['rows']:
                element = dist['rows'][0]['elements'][0]
                if 'duration' in element:
                    duration = element['duration']['text']
                    distance = element['distance']['text']

            mywriter.writerow([source, destination, distance, duration])
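The expected output in the question shows plain numbers (247, 4) rather than text such as "247 km". Here is a minimal sketch of one way to get those, assuming the response has the usual Distance Matrix shape in which each element carries a `status` plus `distance.value` in metres and `duration.value` in seconds. `extract_km_and_hours` is a hypothetical helper name, and the sample dictionary is hand-written for illustration, not real API output:

```python
def extract_km_and_hours(dist):
    """Return (km, hours) rounded to whole numbers, or (0, 0) if there is no route."""
    rows = dist.get("rows") or []
    if not rows or not rows[0].get("elements"):
        return 0, 0
    element = rows[0]["elements"][0]
    if element.get("status") != "OK":
        return 0, 0
    km = round(element["distance"]["value"] / 1000)     # metres -> kilometres
    hours = round(element["duration"]["value"] / 3600)  # seconds -> hours
    return km, hours


# Hand-written sample in the assumed response shape (not real API data)
sample = {
    "rows": [{"elements": [{
        "status": "OK",
        "distance": {"text": "247 km", "value": 247000},
        "duration": {"text": "4 hours 0 mins", "value": 14400},
    }]}]
}

print(extract_km_and_hours(sample))
print(extract_km_and_hours({"rows": []}))
```

Rounding to whole hours is only one choice; keeping minutes or a decimal value may suit your spreadsheet better.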
You can do this with itertools.product, but that means you will also get self-pairs such as (Agra, Agra), whose actual distance is 0.
    import itertools

    cities = ["Agra", "Delhi", "Jaipur"]
    # product() returns a one-shot iterator, so store the pairs in a list for reuse
    p = list(itertools.product(cities, cities))
    print(p)
In this case, you will get:
    [('Agra', 'Agra'), ('Agra', 'Delhi'), ('Agra', 'Jaipur'), ('Delhi', 'Agra'), ('Delhi', 'Delhi'), ('Delhi', 'Jaipur'), ('Jaipur', 'Agra'), ('Jaipur', 'Delhi'), ('Jaipur', 'Jaipur')]
You can iterate over the pairs with a for loop and ask Google for the travel time and distance of each one.
    >>> for pair in itertools.product(cities, cities):
    ...     print(pair)
    ...
    ('Agra', 'Agra')
    ('Agra', 'Delhi')
    ('Agra', 'Jaipur')
    ('Delhi', 'Agra')
    ('Delhi', 'Delhi')
    ('Delhi', 'Jaipur')
    ('Jaipur', 'Agra')
    ('Jaipur', 'Delhi')
    ('Jaipur', 'Jaipur')
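If you do go with itertools.product, the self-pairs can be filtered out before calling Google, which leaves the same pairs that itertools.permutations(cities, 2) would give. A small sketch:

```python
import itertools

cities = ["Agra", "Delhi", "Jaipur"]

# Keep only the pairs whose source and destination differ
pairs = [(src, dst)
         for src, dst in itertools.product(cities, repeat=2)
         if src != dst]

print(pairs)
# For a list without duplicate names this matches permutations(cities, 2)
print(pairs == list(itertools.permutations(cities, 2)))
```

This avoids spending three requests (300 for the full list) on pairs whose distance is known to be 0.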
You can get all the combinations with itertools.permutations(), like this:
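    from itertools import permutations

    # cities_file and newfile are the input and output paths;
    # googleapi.get stands for your own function that queries Google for one pair
    with open(cities_file, 'r') as f, open(newfile, 'w') as f2:
        for pair in permutations([a.strip() for a in f.read().splitlines()], 2):
            print(pair)
            response = googleapi.get(pair)
            f2.write(response + '\n')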
Here print(pair) outputs:
    ('Agra', 'Delhi')
    ('Agra', 'Jaipur')
    ('Delhi', 'Agra')
    ('Delhi', 'Jaipur')
    ('Jaipur', 'Agra')
    ('Jaipur', 'Delhi')
Then you can call the API for the list elements one by one and save the results to the file.
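Putting the pieces together, here is a minimal sketch of the whole loop. The names `write_pair_table` and `fake_lookup` are made up for illustration. `fake_lookup` stands in for whatever function calls Google and returns a (distance, time) tuple, so the example runs without an API key. It also assumes the input has a SOURCE header row, as in the question.

```python
import csv
import io
from itertools import permutations


def write_pair_table(infile, outfile, lookup):
    """Read one city per row from infile and write one row per ordered pair."""
    reader = csv.reader(infile)
    next(reader, None)  # skip the SOURCE header row (assumed to be present)
    cities = [row[0].strip() for row in reader if row and row[0].strip()]

    writer = csv.writer(outfile)
    writer.writerow(["SOURCE", "DESTINATION", "DISTANCE", "TIME_TRAVEL"])
    for source, destination in permutations(cities, 2):
        distance, travel_time = lookup(source, destination)
        writer.writerow([source, destination, distance, travel_time])


def fake_lookup(source, destination):
    """Placeholder for the real Google call; always returns dummy values."""
    return 0, 0


# In-memory files keep the example self-contained
infile = io.StringIO("SOURCE\nAgra\nDelhi\nJaipur\n")
outfile = io.StringIO()
write_pair_table(infile, outfile, fake_lookup)
result = outfile.getvalue()
print(result)
```

With real files, open both with newline='' as the csv module recommends, and pass your own Google-calling function as lookup.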