Making all possible pairs in Python and using the Google API with a csv/xlsx file

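I have to write a Python script that does the following. I have an xlsx/csv file with a single column

listing 300 cities.
  1. I have to make every possible pairing between them and, with the help of the Google API, add the distance and travel time for each pair in the columns that follow.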

My CSV file looks like this:

=======
SOURCE 
=======
Agra 
Delhi 
Jaipur 

The expected output in the csv/xlsx file looks like this:

=============================================
SOURCE | DESTINATION | DISTANCE | TIME_TRAVEL
=============================================
Agra   |    Delhi    |    247   |      4    
Agra   |    Jaipur   |    238   |      4    
Delhi  |    Agra     |    247   |      4    
Delhi  |    Jaipur   |    281   |      5
Jaipur |    Agra     |    238   |      4    
Jaipur |    Delhi    |    281   |      5        

And so on. How can I do this?
Note: the distance and travel time come from Google.

You can get all the ordered pairs with itertools.permutations(), like this:

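from itertools import permutations

# cities_file and newfile are paths you define yourself; googleapi.get() is a
# placeholder for the real API call and is assumed to return a string.
with open(cities_file, 'r') as f, open(newfile, 'w') as f2:
    cities = [a.strip() for a in f.read().splitlines() if a.strip()]
    for pair in permutations(cities, 2):
        print(pair)
        response = googleapi.get(pair)
        f2.write(response + '\n')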

Output of printing each pair:
('Agra', 'Delhi')
('Agra', 'Jaipur')
('Delhi', 'Agra')
('Delhi', 'Jaipur')
('Jaipur', 'Agra')
('Jaipur', 'Delhi')

You can then call the API for each element of the list in turn and store the results in a file.
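The loop described above can be sketched end to end as follows. This is only a minimal sketch: `write_pairs`, `lookup` and `fake_lookup` are hypothetical names, and `lookup` stands in for whatever Google API call you end up using. Passing it in as a function lets you check the pairing and the CSV layout without any network access.

```python
import csv
import io
from itertools import permutations


def write_pairs(cities, lookup, out):
    """Write one CSV row per ordered city pair and return the row count.

    `lookup` is a hypothetical callable taking (source, destination) and
    returning (distance, travel_time); plug the real API call in here.
    """
    writer = csv.writer(out)
    writer.writerow(["SOURCE", "DESTINATION", "DISTANCE", "TIME_TRAVEL"])
    count = 0
    for source, destination in permutations(cities, 2):
        distance, travel_time = lookup(source, destination)
        writer.writerow([source, destination, distance, travel_time])
        count += 1
    return count


def fake_lookup(source, destination):
    # Placeholder values only, not real distances or times.
    return (0, 0)


# An in-memory buffer stands in for the output file.
buffer = io.StringIO()
rows_written = write_pairs(["Agra", "Delhi", "Jaipur"], fake_lookup, buffer)
```

For a real run you would pass a file opened with `open(path, 'w', newline='')` instead of the `StringIO` buffer.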

You can also do this with itertools.product, but that means you also get self-pairs such as (Agra, Agra), whose distance is effectively 0.

import itertools
cities = ["Agra","Delhi","Jaipur"]
cities2 = cities
p = itertools.product(cities, cities2)
print(list(p))

In this case you get:

[('Agra', 'Agra'), ('Agra', 'Delhi'), ('Agra', 'Jaipur'), ('Delhi', 'Agra'), ('Delhi', 'Delhi'), ('Delhi', 'Jaipur'), ('Jaipur', 'Agra'), ('Jaipur', 'Delhi'), ('Jaipur', 'Jaipur')]

You can loop over this list and send a request to Google for each pair to get the travel time and distance.

>>> for pair in itertools.product(cities, cities2):
...     print (pair)
...
('Agra', 'Agra')
('Agra', 'Delhi')
('Agra', 'Jaipur')
('Delhi', 'Agra')
('Delhi', 'Delhi')
('Delhi', 'Jaipur')
('Jaipur', 'Agra')
('Jaipur', 'Delhi')
('Jaipur', 'Jaipur')
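If you prefer itertools.product but do not want the self-pairs, one possible sketch is to filter them out. It assumes every city name in the list is unique; with duplicate names the filter would also drop pairs that permutations would keep.

```python
import itertools

cities = ["Agra", "Delhi", "Jaipur"]

# Keep only pairs whose two ends differ, dropping (Agra, Agra) and the like.
pairs = [(a, b) for a, b in itertools.product(cities, repeat=2) if a != b]

# For unique names this is the same list that permutations(cities, 2) yields.
same_as_permutations = pairs == list(itertools.permutations(cities, 2))
```

With unique names, `same_as_permutations` is True, so the two approaches are interchangeable and permutations simply saves the filtering step.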

To make the pairs, you can use itertools.permutations to get all possible pairings. The code for this is as follows:

import csv        # reads and writes CSV files
import sys        # gives access to the command-line arguments
import itertools

# Usage: python script.py input.csv output.csv
# newline='' lets the csv module handle line endings itself.
with open(sys.argv[1], newline='') as f, open(sys.argv[2], 'w', newline='') as g:
    # Take the first column of every non-empty row as a city name.
    # A header row, if present, is treated as a city; skip it if needed.
    cities = [row[0].strip() for row in csv.reader(f) if row]
    # Every ordered (source, destination) pair of two different cities.
    pairs = itertools.permutations(cities, 2)
    csv.writer(g).writerows(pairs)
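With 300 cities the number of ordered pairs grows quickly, which matters if every pair costs one API request. The sketch below counts the pairs and shows one possible way to halve the lookups by querying each unordered pair once and mirroring the result. It assumes the distance and time from A to B equal those from B to A, which matches the expected output in the question but is not guaranteed for real road routes. `mirrored_rows` and `lookup` are hypothetical names.

```python
from itertools import combinations
from math import comb, perm

n_cities = 300
ordered_pairs = perm(n_cities, 2)    # 300 * 299 = 89700 rows in the output
unordered_pairs = comb(n_cities, 2)  # 44850 lookups if results are mirrored


def mirrored_rows(cities, lookup):
    """Query each unordered pair once and emit rows for both directions.

    `lookup` is a hypothetical callable returning (distance, travel_time).
    Assumes A -> B and B -> A give the same result.
    """
    rows = []
    for a, b in combinations(cities, 2):
        distance, travel_time = lookup(a, b)
        rows.append((a, b, distance, travel_time))
        rows.append((b, a, distance, travel_time))
    return rows
```

The rows come out grouped by pair rather than by source city, so sort them afterwards if the output order matters.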

For getting the distance and travel time from Google, the sample code below may work, though it will probably need some debugging.

import csv            # reads and writes CSV files
import sys            # gives access to the command-line arguments
import json
import urllib.parse
import urllib.request

api_google_key = ''   # put your Google API key here
api_google_url = 'https://maps.googleapis.com/maps/api/distancematrix/json'

# Usage: python script.py pairs.csv output.csv
# The input is expected to hold a source in column 1 and a destination in column 2.
with open(sys.argv[1], newline='') as f, open(sys.argv[2], 'w', newline='') as g:
    mywriter = csv.writer(g)
    for row in csv.reader(f):
        if len(row) < 2:
            continue  # skip blank or incomplete rows
        # Keep only letters and digits in the city names.
        source = ''.join(e for e in row[0] if e.isalnum())
        destination = ''.join(e for e in row[1] if e.isalnum())
        query = urllib.parse.urlencode(
            {'origins': source, 'destinations': destination, 'key': api_google_key})
        request = api_google_url + '?' + query
        print(request)
        with urllib.request.urlopen(request) as resp:
            dist = json.load(resp)
        # Fall back to 0 when Google returns no route for the pair.
        duration = distance = 0
        if dist['rows']:
            element = dist['rows'][0]['elements'][0]
            if 'duration' in element:
                duration = element['duration']['text']
                distance = element['distance']['text']
        mywriter.writerow([source, destination, distance, duration])
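The response handling in the script above can be pulled into a small function and checked offline. This is a sketch under stated assumptions: `parse_element` is a hypothetical helper, and `sample` is a hand-written dictionary that mimics the shape the script reads (`rows`, `elements`, `distance`, `duration`, plus a per-element `status`). Its values are taken from the expected table in the question, not from a real API call.

```python
def parse_element(response):
    """Return (distance_text, duration_text), or (0, 0) when unavailable."""
    rows = response.get("rows") or []
    if not rows:
        return (0, 0)
    elements = rows[0].get("elements") or []
    # A per-element status other than "OK" means no route was found.
    if not elements or elements[0].get("status") != "OK":
        return (0, 0)
    element = elements[0]
    return (element["distance"]["text"], element["duration"]["text"])


# Hand-written sample in the expected shape, not real API output.
sample = {
    "status": "OK",
    "rows": [
        {
            "elements": [
                {
                    "status": "OK",
                    "distance": {"text": "247 km"},
                    "duration": {"text": "4 hours"},
                }
            ]
        }
    ],
}

# A response for a pair with no route, again hand-written.
no_route = {"status": "OK", "rows": [{"elements": [{"status": "NOT_FOUND"}]}]}

found = parse_element(sample)
missing = parse_element(no_route)
```

Keeping the parsing separate from the HTTP request makes it easier to debug the CSV output before spending any API quota.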