This coursework considers a simplified version of the travelling salesman problem.
- To get skills for solving more complex problems with programs.
- To make you consider different search algorithms and data structures.
- To give you more practice with writing tests.
- To gain experience with Test-Driven-Development (TDD).
Suppose there are a number of "cities", as in shown in Figure 1:
The distance between any two cities with the coordinates (x1, y1) and (x2,y2) is the standard Euclidean distance, that is,
sqrt((x1-x2)^2 + (y1-y2)^2)
Thus, we assume that the Earth is "flat" for the purposes of this assignment. A traveling salesman wishes to visit every city exactly once, then return to their starting point. (It doesn't matter what city is the starting point.) Such a path is called a circuit, as in Figure 2:
However, the salesman also wishes to minimise the total distance that must be travelled.
This is a classic computer science problem, known as the Traveling Salesman problem. You can find algorithms for finding reasonably good solutions on the web, and you are welcome to look at those algorithms. However, we want you use a *hill climbing& approach, where you start with "any" solution, and try to progressively improve it until you can't improve it any more.
We are providing a file, city-data.txt
, containing the
latitudes and longitudes of the fifty state capitals of the U.S.A.
Each line contains:
- the name of the state,
- the name of the city,
- the latitude, and
- the longitude.
These four items are separated by tabs (\t symbol). Read this file in as a list of four-tuples. The list tuples will be referred to as a "road map". It represents the path the salesman follows, starting with the first city in the list and ending back at the first city in the list. You can assume that the format of the input file as above.
Note however that the file name as well as its contents can be arbitrary (e.g.,
it can be a file Brazil.txt
with the info on Brazilian states/cities).
While we will require these particular data representations as function parameters and function results, this does not imply that you have to work with these representations as you solve the problems. What you do inside the functions is up to you, as long as they work as expected. (Python makes it easy to convert from one kind of sequence to another.)
def read_cities(file_name)
Read in the cities from the given
file_name
, and return them as a list of four-tuples:
[(state, city, latitude, longitude), ...]
Note that the list above will be used as your initial
road_map
, that is,
Alabama -> Alaska -> Arizona -> ... -> Wyoming.
as in the example of the file city-data.txt
def print_cities(road_map)
Prints a list of cities, along with their locations. Print up to two digits after the decimal point.
def compute_total_distance(road_map)
Returns, as a floating point number, the sum of the distances of all the connections in the
road_map
. Remember that it's a cycle, so that for example in the initialroad_map
above, Wyoming connects to Alabama...
def swap_cities(road_map, index1, index2)
Take the city at location
index
in theroad_map
, and the city at locationindex2
, swap their positions in theroad_map
, compute the new total distance, and return the tuple
(new_road_map, new_total_distance)
Allow for the possibility that index1=index2
,
and handle this case correctly.
def shift_cities(road_map):
For every index i in the
road_map
, the city at the position i moves to the position i+1. The city at the last position moves to the position 0. Return the the new road map.
def find_best_cycle(road_map)
Using a combination of
swap_cities
andshift_cities
, try10000
swaps/shifts, and each time keep the best cycle found so far. After10000
swaps/shifts, return the best cycle found so far. Use randomly generated indices for swapping.
def print_map(road_map)
Prints, in an easily understandable textual format, the cities and their connections, along with the cost for each connection and the total cost.
def main()
Requests to specify the file to load (make sure the file exists and valid), reads in and prints out the city data, creates the "best" cycle, prints it out.
Programs should be tested as thoroughly as possible using
the pytest
unit testing framework. Functions that do input or output
are difficult to test. Therefore, those functions, should do as little
computation as possible, and functions that do computation, should do no
input or output.
In this assignment main
, read_cities
, print_cities
, and
print_map
result in input or output, so you do not need to
write unit tests for these. Also, you do not need to test find_best_cycle
because of random results.
Provide unit tests for all the other functions, as well as any additional
computational functions you might write.
- You don't need any global variables.
- Import
random
at the top of your program; thenwill give you a floating-point number in the rangenumber = N * random.random()
0.0 <= number < N
. - If you want to treat a list
lst
as circular (the first item follows the last item), the item afterlst[i]
is not justlst(i + 1)
, but islst[(i + 1) % len(lst)]
.
The assignment as stated above will be graded on the basis of 70% total:
- Correctness and adherence to specification (weight 30%). We will have our own unit tests, which your functions must pass. This means that the above functions must take parameters and produce results exactly as specified, or our test will fail and cost you points.
- Completeness of code, tests, and commit history (weight 30%). You must implement all the required functions and the implementation must handle all the permitted inputs correctly. You must provide good code coverage to catch any errors before we do. We expect to see at least 5 tests for each function that needs to be tested (see above). We will assess the regularity of your commit history on GitHub and would like to see multiple commits (at least 20) spread over a reasonable period of time (at least 1 month). The solutions that appear fully in one commit (out-of-nowhere) a day before the deadline will lose many points. You will also lose some points if you do not do Initial Work (see below) on time by the deadline of Session 7.
- Style (weight 10%).
- Use proper indentation and spacing.
- Don't repeat code if you can put it in a function and just call the function.
- Try to avoid redundancy (such as the beginner's
if better == True
).
To obtain additional 30% points, you will need to extend your implementation with functionality to visualize road maps. There are two ways to do that:
A) Use textual printing to visualise the map. For example, you can print
the symbols | and - to "draw" the grid and spread the numbers 1,2,3... on
that grid in such a way that they imitate with a correct scale geographical
position of the cities. In that case, you would place 1 in the position that
corresponds to the first city visited, 2 of the second, etc. (You are not
expected to print the names of the cities, as it would need too much space.)
For example, the road map
[("Kentucky", "Frankfort", 38.197274, -84.86311),
("Delaware", "Dover", 39.161921 -75.526755),
("Minnesota", "Saint Paul", 44.95 -93.094)]
can be "drawn" as follows:
Note that your implementation should correctly handle all the possible road maps with coordinates ranging between 90 and -90 for latitude and -180 to 180 for longitude.
B) Implement a proper GUI window using Tkinter module. In that case, you can do the visualisation using a similar approach to that described above. But you should properly draw the coordinate lines, mark cities by circles, and possibly label them not only with numbers but also with city/state names. A quick tutorial on Tkinter can be found, e.g., in https://www.datacamp.com/community/tutorials/gui-tkinter-python. A proper documentation is at https://docs.python.org/3/library/tkinter.html#module-tkinter
Should you pursue option B), be aware that you will be studying Tkinter on your own. None of the lectures or TAs have expertise in this module to answer technical questions.
Presumably, in B) you will spend much time learning a new tool but less time working on the implementation because of the available drawing functions of Tkinter.
Independently of whether you choose A) or B), you will need to provide an additional function
def visualise(road_map)
that would either graphically print the given road_map or will open a GUI window with
the drawing of the road_map. Also, extend the functionality of your main
function so
that it provides visualisation of the best route when found. You do not need to test
the visualise
function.
With either A) or B) are will able to obtain up to 30%. Marking will be based on
style of your code (see above) as well as on the usability of your solution from an
end-user perspective. We will try your visualise
implementation on various road maps
and the correctness and meaningfulness of your implementation will determine your mark.
In an attempt to convince you to work in TDD (Test-Driven Development)-style, we require that you do the following by deadline of Session 7:
- Write at least one meaningful test for each testable function in
test_cities.py
- Write a stub (a stub is a function that does nothing or, if it must return a value,
returns a value that is unlikely to be correct) for each testable function in
cities.py
- Properly set up a GitHub repo (see separate guides) and commit the files above
If you run pytest, your tests will (most certainly) fail but this will not be considered in marking.