Fixing Geographic Travelling Salesman Issues utilizing Python | by Mike Jones | Jul, 2023



Utilizing pyconcorde to search out optimum options to real-world routing issues

An optimum automobile driving route between 79 UK cities. Picture by writer. Map information from OpenStreetMap.

The well-known Travelling Salesman Drawback (TSP) is about discovering an optimum route between a set of nodes (cities) and returning to the place you began. It sounds easy, however is not possible to resolve by brute pressure for giant numbers of nodes, for the reason that variety of potential orderings of n cities is n!. Which means that for even simply 30 cities, the variety of journeys you would need to verify is 265,252,859,812,191,058,636,308,480,000,000. Giant TSP issues are impractical to resolve by brute pressure, even by highly effective computer systems.

Randomly generated TSP with 10 nodes: 3,628,800 potential routes to verify. Picture by writer.

Luckily, some algorithms have been developed that dramatically cut back the quantity of compute wanted to resolve massive TSPs. One such piece of software program, Concorde, was developed a few a long time in the past to be used within the educational neighborhood. Though it’s fairly technical to make use of as stand-alone software program, and is meant for specialists solely, pyconcorde has been developed as a Python wrapper for Concorde. A proof of the algorithms utilized in Concorde is exterior the scope of this text. Nevertheless, we are going to delve into the code wanted to breed these issues and their options in Python.

How would somebody go about fixing a real-world, geographical travelling salesman downside? Actual-world factors usually are not related by easy 2D strains like within the picture above. As a substitute, geographical options are related by numerous potential routes, and people routes will change relying on whether or not somebody is strolling, biking or driving.

Why would possibly a knowledge scientist or software program engineer need to resolve a real-world TSP? Listed here are just a few examples of use instances:

  1. An organization using couriers wants a approach of calculating optimum routes by means of a metropolis, minimizing the time spent on the highway for every of its drivers.
  2. A tour operator wants to search out the shortest route connecting a set of locations, inside a constrained period of time.
  3. A waste disposal firm or native authority must allocate its assets to make sure pickups are ordered in as environment friendly a fashion as potential.

With a purpose to resolve a real-world TSP, the routingpy library can be utilized to search out routes, distances (in metres) and durations (in seconds) between geographical factors in [longitude, latitude] pairs. On this article we are going to describe the tactic that can be utilized for such an issue.

A information to fixing a geographic TSP utilizing Python is printed right here. The broad construction of the problem-solving course of is as follows:

  1. Get hold of a listing of n coordinates as [longitude, latitude] pairs.
  2. Use a routing service to acquire a matrix (n x n) of real-world durations between every of those coordinates, for the suitable profile (strolling, biking, driving a automobile, driving an HGV, and so on). This matrix will probably be uneven (driving from A to B will not be the precise reverse of B to A).
  3. Rework (n x n) matrix right into a symmetric matrix (2n x 2n).
  4. Feed this matrix into the Concorde solver to search out an optimum ordering of coordinates.
  5. Create the real-world route utilizing the routing service.
  6. Visualize the outcomes on a map.
  7. Optionally, create a GPX file of the ultimate route.

Every of those steps will probably be coated intimately.

Step 1: Acquiring coordinates

For our instance, we are going to think about the issue of driving a automobile between 79 cities within the UK. Proven under is a map of the UK cities in blue. An information scientist can discover coordinates in a lot of methods. If required, they are often discovered manually utilizing Google Maps or Google Earth.

79 cities within the UK. Picture by writer. Map information from OpenStreetMap.

The code construction and information used on this instance can also be accessible in this GitHub repository.

Here’s a CSV file containing the coordinates of the cities (gb_cities.csv within the repo), and under it the code required to import it utilizing pandas.

Place Title,Latitude,Longitude
import pandas as pd
df = pd.read_csv('gb_cities.csv')
coordinates = df[['Longitude', 'Latitude']].values
names = df['Place Name'].values

Step 2: Utilizing a routing service to acquire length matrix

There are a number of routing providers accessible by means of the routingpy library. The API from Graphhopper features a free tier which permits rate-limited use. Different routers which are accessible by means of routingpy are listed within the documentation.

import routingpy as rp
import numpy as np

api_key = # get a free key at
api = rp.Graphhopper(api_key=api_key)
matrix = api.matrix(areas=coordinates, profile='automobile')
durations = np.matrix(matrix.durations)

Right here is durations, a 79 x 79 matrix of the driving time in seconds between coordinates:

matrix([[    0, 10902, 30375, ..., 23380, 25233, 19845],
[10901, 0, 23625, ..., 16458, 18312, 13095],
[30329, 23543, 0, ..., 8835, 9441, 12260],
[23397, 16446, 9007, ..., 0, 2789, 7924],
[25275, 18324, 9654, ..., 2857, 0, 9625],
[19857, 13071, 12340, ..., 8002, 9632, 0]])

The driving time between cities could be decided as follows:

  1. Every row and column corresponds to a metropolis: Aberdeen is the primary row and column, Ayr the second, Basildon the third, and so forth.
  2. To seek out the time between Aberdeen and Ayr, take a look at the first row, 2nd column: 10,902 seconds. The reverse time (Ayr to Aberdeen) is 10,901 seconds.
  3. Usually, the time from the i-th to the j-th metropolis is on the intersection between the i-th row and j-th column.

Discover that the matrix, as anticipated, has zeros alongside the diagonal, since every level is related to itself with zero distance or length. Additionally, the matrix will not be fairly symmetric: driving durations between cities are unlikely to be similar in reverse instructions, attributable to completely different highway layouts and site visitors hotspots. They’re broadly related, although, as could be anticipated.

Step 3: Reworking uneven matrix to symmetric

Earlier than utilizing this matrix to generate an optimum ordering in pyconcorde, we have to make the matrix symmetric. A technique for remodeling an uneven TSP into symmetric TSP is described by Jonker and Volgenant (1983): Reworking uneven into symmetric touring salesman issues, Operations Analysis Letters, 2(4), 161–163. What follows is the idea behind this transformation. If desired, this part could be skipped (scroll right down to the part titled Reworking the geographic uneven TSP).

Jonker/Volgenant uneven to symmetric transformation

Beneath is a visualization of an uneven TSP with 3 nodes, and its distance matrix.

Uneven TSP with 3 nodes. Picture by writer.
matrix([[0, 5, 2],
[7, 0, 4],
[3, 4, 0]])

Here’s a sketch of the tactic used to rework this right into a symmetric TSP:

  1. Create new ghost nodes, A’, B’ and C’. Be part of A to A’, B to B’ and C to C’ with distance zero.
  2. Join the nodes with weights as follows:
    A to B is now represented by A’ to B; B to A is now B’ to A.
    B to C is now B’ to C; C to B is now C’ to B.
    C to A is now C’ to A; A to C is A’ to C.
  3. Set all different edge weights to be infinite, so any algorithm doesn’t try and journey between them. Since this will probably be impractical later when utilizing pyconcorde, as a substitute set all different weights to be a lot increased than the very best weight we now have. On this case, we are going to set them to equal 99.
Equal symmetric TSP with (3 x 2) nodes. Picture by writer.

Right here is the ensuing distance matrix. The ordering of the nodes within the matrix is: A, B, C, A’, B’, C’.

matrix([[ 0, 99, 99,  0,  7,  3],
[99, 0, 99, 5, 0, 4],
[99, 99, 0, 2, 4, 0],
[ 0, 5, 2, 0, 99, 99],
[ 7, 0, 4, 99, 0, 99],
[ 3, 4, 0, 99, 99, 0]])

Be aware once more that the diagonal is zero, as could be anticipated, and that the matrix is now symmetric. The unique matrix is within the bottom-left nook of the brand new matrix, and its transpose is within the top-right. In the meantime, the top-left and bottom-right elements comprise very excessive weights between nodes.

A, B and C (top-left) are not related to one another (strictly talking, they’re related however with very excessive as a substitute of infinite weight, for sensible functions). Which means that any algorithm is not going to search to discover a path between these nodes. Likewise, A’, B’ and C’ (bottom-right) usually are not related to one another. As a substitute, the directional nature of the unique uneven community is represented right here by the weights on the unique nodes A, B and C, along with their ghosts A’, B’ and C’.

There’s a one-to-one mapping between options of the unique uneven downside and the brand new, symmetric TSP:

  • A — B — C — A corresponds to A — A’ — B — B’ — C — C’ — A
  • A — C — B — A corresponds to A — A’ — C — C’ — B — B’ — A

In every case the ghost nodes A’, B’ and C’ alternate with the unique nodes A, B and C, and every unique node is adjoining to its ‘associate’ ghost node (A is adjoining to A’, and so forth).

Reworking the geographic uneven TSP

Again to our sensible instance. We are able to create a operate to rework an uneven TSP matrix right into a symmetric one:

def symmetricize(m, high_int=None):

# if high_int not supplied, make it equal to 10 instances the max worth:
if high_int is None:
high_int = spherical(10*m.max())

m_bar = m.copy()
np.fill_diagonal(m_bar, 0)
u = np.matrix(np.ones(m.form) * high_int)
np.fill_diagonal(u, 0)
m_symm_top = np.concatenate((u, np.transpose(m_bar)), axis=1)
m_symm_bottom = np.concatenate((m_bar, u), axis=1)
m_symm = np.concatenate((m_symm_top, m_symm_bottom), axis=0)

return m_symm.astype(int) # Concorde requires integer weights

symmetricize(durations) returns:

matrix([[     0, 461120, 461120, ...,  23397,  25275,  19857],
[461120, 0, 461120, ..., 16446, 18324, 13071],
[461120, 461120, 0, ..., 9007, 9654, 12340],
[ 23397, 16446, 9007, ..., 0, 461120, 461120],
[ 25275, 18324, 9654, ..., 461120, 0, 461120],
[ 19857, 13071, 12340, ..., 461120, 461120, 0]])

This 158 x 158 matrix comprises a replica of durations within the backside left and a transposed copy within the prime proper. The excessive worth of 461,120 (10 instances the utmost worth in durations) implies that, for sensible functions, nodes with this length usually are not related.

This matrix can lastly be fed into pyconcorde to calculate an optimum path.

Step 4: Utilizing the Concorde solver

Putting in pyconcorde

Run the next instructions to put in pyconcorde (set up is accessible in Linux or Mac OS, however not in Home windows at current):

virtualenv venv                                  # create digital atmosphere
supply venv/bin/activate # activate it
git clone # clone git repo
cd pyconcorde # change listing
pip set up -e . # set up pyconcorde

Fixing the TSP in Python

Now we will import from concorde in a Python script.

from concorde.downside import Drawback
from concorde.concorde import Concorde

def solve_concorde(matrix):
downside = Drawback.from_matrix(matrix)
solver = Concorde()
answer = solver.resolve(downside)
print(f'Optimum tour: {answer.tour}')
return answer

Our symmetric durations matrix could be fed into solve_concorde().

durations_symm = symmetricize(durations)
answer = solve_concorde(durations_symm)

Right here is the print output:

Optimum tour: [0, 79, 22, 101, 25, 104, 48, 127, 68, 147, 23, 102, 58, 137, 7, 86, 39, 118, 73, 152, 78, 157, 36, 115, 42, 121, 62, 141, 16, 95, 20, 99, 51, 130, 40, 119, 19, 98, 59, 138, 50, 129, 54, 133, 27, 106, 10, 89, 4, 83, 66, 145, 33, 112, 14, 93, 2, 81, 45, 124, 32, 111, 11, 90, 29, 108, 34, 113, 24, 103, 8, 87, 17, 96, 56, 135, 64, 143, 61, 140, 75, 154, 52, 131, 71, 150, 18, 97, 3, 82, 9, 88, 74, 153, 55, 134, 72, 151, 28, 107, 12, 91, 70, 149, 65, 144, 35, 114, 31, 110, 77, 156, 63, 142, 41, 120, 69, 148, 6, 85, 76, 155, 67, 146, 15, 94, 44, 123, 47, 126, 60, 139, 57, 136, 38, 117, 13, 92, 5, 84, 43, 122, 49, 128, 46, 125, 21, 100, 1, 80, 30, 109, 53, 132, 37, 116, 26, 105]

This answer reveals the ordering of nodes within the optimum tour. Be aware that this answer, as anticipated above, comprises unique nodes (numbered 0 to 78) alternating with their associate ghost nodes (79 to 157):

  • 0 is partnered with 79,
  • 22 with 101,
  • 25 with 104, and so forth…

This implies that the answer has labored appropriately.

Step 5: Creating the real-world route

The following step is to select alternate parts of the answer (the nodes akin to the unique 79 cities), then order the coordinates accordingly.

# choose alternate parts: these correspond to the originals
tour = answer.tour[::2]

# order the unique coordinates and names
coords_ordered = [coordinates[i].tolist() for i in tour]
names_ordered = [names[i] for i in tour]

Listed here are the primary few metropolis names in names_ordered, (the true ordering of the cities within the optimum tour):

'Newcastle Upon Tyne',

Now we add again within the first metropolis to make a whole looped tour, and eventually get hold of the ultimate route utilizing the Graphhopper instructions API.

# add again within the first for an entire loop
coords_ordered_return = coords_ordered + [coords_ordered[0]]

# get hold of full driving instructions for the ordered loop
instructions = api.instructions(areas=coords_ordered_return, profile='automobile')

Step 6: Visualization on a map

Seeing the ultimate route on a map will allow us to be assured within the consequence, in addition to permitting us to make use of the answer in a sensible setting. The next code will show a folium map which could be saved to HTML.

import folium
def generate_map(coordinates, names, instructions):

# folium wants lat, lengthy
coordinates = [(y, x) for (x, y) in coordinates]
route_points = [(y, x) for (x, y) in directions.geometry]
lat_centre = np.imply([x for (x, y) in coordinates])
lon_centre = np.imply([y for (x, y) in coordinates])
centre = lat_centre, lon_centre

m = folium.Map(location=centre, zoom_start=1, zoom_control=False)

# plot the route line
folium.PolyLine(route_points, shade='crimson', weight=2).add_to(m)

# plot every level with a hover tooltip
for i, (level, title) in enumerate(zip(coordinates, names)):
tooltip=f'{i}: {title}',

custom_tile_layer = folium.TileLayer(
attr='CartoDB Positron',
opacity=0.7 # Alter opacity to regulate the extent of greying out


sw = (np.min([x for (x, y) in coordinates]), np.min([y for (x, y) in coordinates]))
ne = (np.max([x for (x, y) in coordinates]), np.max([y for (x, y) in coordinates]))
m.fit_bounds([sw, ne])

return m

generate_map(coords_ordered, names_ordered, instructions).save('gb_cities.html')

The result’s proven on the prime of this text. Click on right here to view as an interactive map. It’s potential to zoom in to the map to see extra element and to hover over particular person cities which is able to reveal their quantity within the tour sequence. Beneath is a zoomed-in a part of the map displaying the route passing by means of Sheffield (between Lincoln and Chesterfield on the optimum tour).

Picture by writer. Map information from OpenStreetMap.

Step 7: Non-obligatory: Making a GPX file

If the calculated route must be adopted in real-life, as an illustration on a tool with a GPS (comparable to a telephone or automobile navigation system), a GPX could be created. This isn’t a part of the optimization downside, however is an elective further step accessible if you wish to save the path to a file. The GPX file is created from the instructions variable:

def generate_gpx_file(instructions, filename):
gpx_template = """<?xml model="1.0" encoding="UTF-8"?>
<gpx model="1.1" xmlns=""

trkseg_template = """
<trkpt lat="{}" lon="{}"/>

trkseg_elements = ""
for level in instructions.geometry:
trkseg_elements += trkseg_template.format(level[1], level[0])

gpx_data = gpx_template.format(trkseg_elements)

with open(filename, 'w') as file:

generate_gpx_file(instructions, 'gb_cities.gpx')

The GPX file for this downside could be downloaded right here.

Now we have seen how we will mix the next parts to resolve real-world geographic travelling salesman issues:

  1. Instructions and length matrices from the routingpy library, specifying an acceptable profile (mode of transport).
  2. Environment friendly and highly effective Concorde solver by way of the pyconcorde wrapper, to supply an optimum route.
  3. Visualization utilizing folium to create a map.

The driving tour proven above is a convincing answer to the 79-city travelling salesman downside, and based on the Concorde solver is provably ‘optimum’. Since we’re working with real-world information, nonetheless, the top result’s solely nearly as good because the enter. We’re relying that the point-to-point durations matrix obtained from routingpy is consultant of the real-world. In actuality, the time taken to stroll, cycle or drive between factors will rely upon the time of day, or day of the week. It is a limitation of the tactic that we’ve used. A technique of being extra assured in the long run consequence could be to attempt the identical technique with an various routing service. Every routing service (Graphhopper, ORS, Valhalla, and so forth) has its personal API which might be used for a TSP downside such because the one described right here, and the outcomes might be in contrast from completely different providers.

Regardless of the real-world limitations of fixing an issue comparable to this, the methodology above gives a superb place to begin for a salesman or courier needing to make their approach spherical a metropolis in as environment friendly a fashion as potential or a vacationer hoping to catch as many sights as potential on their tour. By visualizing the outcomes on an interactive map and storing the route as a GPX file, the answer is helpful by the top person, not simply the information scientist who carried out the code.