Closed Knight's tours on a 8x8 chessboard
The number of closed knight's tours on a 8x8 chessboard has already been computed. The raw data indicates a total of 1 692 674 826 245 closed tours, and a bit less after removing all the solutions that are linked by reflections and rotations.
This memo briefly describes a new algorithm which counts the 1 692 674 826 245 close tours in 5.1 days on a AMD Ryzen 5 1600X Six-Core Processor. The loop of the algorithm is basically the same as the one I used to compute Magic knight's Tours.
The algorithm is written in C++. pcaseNB[64] is an array of C++ objects, while each instance is a square on the chessboard.
The 4 methods - IsFree(), IsFree2(), FirstStep() and NextStep() - have been profoundly modified in order to search for closed knight's tours instead of magic knight's tours. IsFree2() is similar to IsFree() but when this method is called from a square, this one cannot be a neighbour of the end. FirstStep() is always called as soon as IsFree() or IsFree2() returns true. So, in order to achieve better performance, FirstStep() was integrated into IsFree() and IsFree2().
Another important modification was integrated in IsFree() and IsFree2(). This modification is built around the idea of using subchains, as they were described by Günter Stertenbrink (http://magictour.free.fr/). Thanks to this design pattern, the search for close knight's tours is now limited to 41790 independent searches. Each of these 41790 searches provides set of solutions sharing a distinctive feature, namely, the paths composed of 5 steps around the 4 corners. These 4x5 steps are the 4 subchains for the run.
Below is an example of 2 different closed tours during a run of this program :
As the 4 subchains are always the same for any given index, all the knight's move between the extremities of subchains can be avoided. It is almost as if the knight now have new possibilities to move between the 2 extremities. The solution 1682540098433 is in fact
This new chessboard has only 52 squares and, in the above run above, the moves between C4 and E6 or E4 and H3 ... are valid. Therefore, the algorithm performance is better. For each run between 0 and 41789, a pre-treatment is executed in order to build the dedicated chessboard (52 squares). For each chessboard, the graph between squares is modified : the 2 extremities of the 4 subchains are directly connected. The 12 squares (4 corners and 2 neighbours for each corner) are not used anymore.
With this new "52 squares chessbaord", another point is important in achieving a good performance : what is the best choice for starting the search ? It seems logical to choose one extremity among the 4 subchains, considering that a given starting square leads to an unique end. But the 8 extremities of the 4 subchains are all possible candidates. I looked for a rule based on the numbers of "the 2 steps neighours" of the starting square and ending square for the 8 extremities.
Idealy, it should be possible to find the best candidate for each of the 41790 indexes, but it would be a long process. I tested only several hundred indexes among the 41790 and I printed, for each tested index, these "2 steps neighours" figures. Below is an example with an i41K index that has 8 possibilities.
The first column (i41K) is the tested index among [0,41789], the first 2 numbers of the SUBCHAIN are respectively "start" (starting square) and "end" (ending square). "Start" and "end" are always the 2 extremities of the A1 subchain. The other lines are built by reflecting and rotating the chessboard in order to always have "start" and "end" around corner A1. (squares are numbered from 0 (A1) to 63 (H8), line by line, and so 7 is H1 and 8 is A2).
"s1" and "e1" are numbers of neighbours of "start" and "end". "s2" and "e2" are the numbers of "the 2 steps neighbours". The column FirstSteps indicates the number of times when FirstStep is called. The number of Closed Tours is of course the same for the 8 runs.
i41K | SUBCHAIN | s1 | e1 | s2 | e2 | FirstSteps | Closed Tours | |
2114 | 0 | [02,04][33,35][31,47][03,28] | 3 | 3 | 15 | 19 | 6821019247 | 118576414 |
2114 | 1 | [32,27][40,24][52,36][03,05] | 3 | 6 | 19 | 40 | 5939213819 | 118576414 |
2114 | 2 | [32,16][60,35][61,59][30,28] | 3 | 3 | 19 | 15 | 7615525330 | 118576414 |
2114 | 3 | [11,27][60,58][31,36][23,39] | 5 | 6 | 27 | 40 | 5630237847 | 118576414 |
2114 | 4 | [32,16][35,24][61,59][28,12] | 3 | 3 | 19 | 15 | 6995932800 | 118576414 |
2114 | 5 | [27,25][60,58][36,59][23,39] | 6 | 5 | 40 | 27 | 7168612087 | 118576414 |
2114 | 6 | [02,04][35,51][31,47][28,39] | 3 | 3 | 15 | 19 | 6514160181 | 118576414 |
2114 | 7 | [27,04][40,24][36,38][03,05] | 6 | 3 | 40 | 19 | 8337178181 | 118576414 |
FirstSteps are always different among the 8 runs for the same i41k index. The differences may vary by twice as much.
The rule I used is that the best time (i.e. FirstSteps mimimum) is obtained when e2 is the greatest. It's not always true but it's not so bad overall. In this exemple, 2 lines provide the greatest value (40) : line 1 and 3. These 2 lines provide the minimum for FirstSteps (5 939 213 819 and 5 630 237 847). I therefore selected the best candidate by counting "e2" for all the runs (24*24*24*24 = 331 776). This provided all the data for of initial conditions for the 41790 runs.
I noticed this rule was false when a specific subchain around A1 corner, was used as showed in this example. The subchain 11,16 (or 2,25 which is the reflection of 11,16) as subchain's extremities for "start" and "end", provides quite always better performance than the basic rule of "greatest e2".
i41K | SUBCHAIN | s1 | e1 | s2 | e2 | FirstSteps | Close Tours | |
2013 | 0 | [27,16][33,58][61,47][03,39] | 6 | 3 | 41 | 16 | 11587442890 | 127569508 |
2013 | 1 | [32,04][35,58][52,47][23,05] | 3 | 3 | 19 | 19 | 13866822103 | 127569508 |
2013 | 2 | [02,16][60,24][36,47][30,05] | 3 | 3 | 15 | 15 | 10454024377 | 127569508 |
2013 | 3 | [11,16][40,58][31,59][28,05] | 5 | 3 | 29 | 15 | 7448599141 | 127569508 |
2013 | 4 | [02,27][60,24][61,47][23,12] | 3 | 6 | 16 | 41 | 8895925680 | 127569508 |
2013 | 5 | [02,25][40,35][31,59][23,05] | 3 | 5 | 15 | 29 | 6655543734 | 127569508 |
2013 | 6 | [02,16][40,51][61,36][03,39] | 3 | 3 | 15 | 15 | 11360789427 | 127569508 |
2013 | 7 | [32,04][40,58][61,38][23,28] | 3 | 3 | 19 | 19 | 13891936571 | 127569508 |
Therefore, each time that this subchain is present, I choose it around corner 0. Sometimes, the subchain 11,16 (2,25) is used several times. In this case, I compare the different results and I keep the best.
With multi-core computers nowadays, it's essential, so as to achieve good performance, to use all CPU cores. Consequently, the algorithm is parallelized : runs from different index are executed as units computing in parallel. Synchronization between threads has been optimized in order to reduce time for critical sections (different threads accessing the same resource). That is why the results of the program are stored in 24 different private files, instead of an unique global file shared by all the threads.
Ultimately, I compile the program for a x64 windows 10 machine. Results are better with this compiler's option. With these improvements, it now takes 5.1 days to count all closed tours on a AMD Ryzen 5 1600X Six-Core Processor.
The program Count_CT_V6_i64.exe counts the closed tours.The parameters are as follows :
Count_CT_V6_i64.exe [-s start] [-n number] [-t thread]Results are stored in 24 different files under c:/tmp/MT directory. The files are named "count_ct_from_start(number)_thread_threadsT#".
The number of files filled by program corresponds to the number of threads chosen on the command line. Results are merged in a file (synthese.txt) by a simple script command used in a Cygwin terminal.
each line of synthese.txt shows the following data
thread id | i41K index | time run i41K | total time | steps forward | closed tours |
T1 | 107 | 6 930 | 29 688 | 31 882 076 878 | 740 166 958 |
The 4 subchains, typical of each i41k index between 0 and 41789 are stored in map_string_i41k.pdf
Knowing the count of closed tours for each i41k index, it's possible to build a program that provides on the fly, for a given value between 1 and 1 692 674 826 245, the image of the closed tour. This program called Get_CTid.exe takes the following parameters
Usage : Get_CTid.exe [-N index CT or -n index i41K] [-D BASE_DIR] [-c] [-a] [-o]
To provide in just a few seconds the image of a particular closed tour, for any value between 1 and 1 692 674 826 245, it is necessary to begin the search to a point "relatively closed" from the value to search. The first element to choose is the i41k index between [0,41789]. All closed tours are ordered in the file map_i41k_string.pdf
Below are the first lines of this file
The column Close Tours cumul (rank_1) shows the number of closed tours captured by the previous i41k indexes. Given a particular number between 0 and 1 692 674 826 245, it's easy to know what is the i41k index providing this closed tour. This first choice is good, but it may not in all cases enable a fast access to the closed tour. As a matter of fact, if the i41k provides hundreds of millions of closed tours, reach the latter closed tours may take a long time. Another mechanism must then be added.
I launch my program that counts closed tours again (Count_CT_file.exe) with a little piece of code that stores in files, every 400 000 closed tours, information enabling the program to start from this point. The column "number of files" shows, for every i41k index, the number of generated files. It simply consists of the quotient produced by the division "Close Tours"/400 000.
In order to optimize the access to these files (4 211 009 files for all closed tours), I created 4 levels of directories (a/b/c/d) where directory "a" is called by a number between 01 and 11, whereas the other directory levels are called by a number between 01 and 25.
Each file has exactly the same size of 888 bytes. The name of files are in the form i41k_filenumber.wct. The tree of all these 4211009 files are saved in a compressed form in the file CT_FILES.7z.
To ensure the program Get_CTid.exe works properly, it's necessary to decompress this file, and to set the CT_BASEDIR environment variable to the directory where the tree was decompressed. It's also possible to pass on this information to Get_CTid.exe via the "D" option
Here is an exemple of a result of Get_CTid.exe in a cmd windows.