SuDoku Perl Package
Download the Perl package here
 SYNOPSIS
 TERMINOLOGY AND CONVENTIONS
 DESCRIPTION
 Background
 Nice things to do with SuDoku puzzles
 What makes a good SuDoku puzzle?
 What makes a difficult SuDoku puzzle?
 PACKAGE OVERVIEW
 METHODS
 Mapping functions
 Construction
 Puzzle Creation
 Validity checking before a cell value is entered on the grid:
 Validity checking after a cell value is entered on the grid:
 Validity checking on completion
 Determine Possible Values
 Hint generation
 Presentation
 Serialization
 TODO
 AUTHOR
 LICENSE
 ENVIRONMENTAL NOTICE
NAME
SuDoku  Creates and Solves SuDoku puzzles
SYNOPSIS
Solve a 9x9 Puzzle
!/usr/bin/perl
use SuDoku;
@puzzle=(<<"!" =~ m/./gm);
070000250
080792060
006051007
003100000
007020600
000004800
700600100
030540090
048210000
!
SuDoku::Solve(\@puzzle);
SuDoku::SimpleTextOutput(\@puzzle);
print SuDoku::VerifyCompletedGrid(\@puzzle)?"Valid\n":"Invalid\n";
Create and solve an easy 9x9 Puzzle
use SuDoku;
SuDoku::CreateSparse(\@solution);
SuDoku::Solve(\@solution);
SuDoku::Solution2Puzzle(1,\@solution,@puzzle);
SuDoku::SimpleTextOutput(\@solution);
SuDoku::PrettyTextOutput(\@puzzle);
TERMINOLOGY AND CONVENTIONS
CellThe smallest item on the playing area. The order of cell positions is left to right, top to bottom. For a 9x9 grid, the cell indexes are 0..80.
Row
The group of cells on a horizontal line. The order of lines is top to bottom. For a 9x9 grid, rows are indexed 0..8.
Column
The group of cells on a vertical line. The order of columns is left to right. For a 9x9 grid, columns are indexed 0..8.
Box
A rectangular group of cells. On a 9 x 9 playing area, a box is made up of 3 rows and 3 columns. There are 9 boxes on a 9x9 grid, indexed 0..8. The order of boxes is left to right, top to bottom.
Xconstraint
The optional two diagonal lines of cells that run across the grid. On a 9 x 9 grid, each line consists of 9 cells, each cell indexed 0..8. The lines themselves are indexed 0..1 and the cell/line order is from topleft to bottom right, topright to bottom left.
Vicinity
Peripheral cells that are in the same row, column and box that the cell that interests us. On a 9x9 grid, there are 20 cells in the current cell's vicinity.
If the puzzle also has an Xconstraint an the cell lies on a diagonal line, then one of the diagonal liness is also included, making a vicimity set of 26. For a cell that lies in the middle of a 9x9 grid, the vicinity will be both diagonals, the row, column and box that the cell intersects, making a vicinity set of 32.
Grid
This is the entire playing area. To summarise: On a 9 x 9 playing area, there are 81 cells in the grid, 9 rows, 9 columns and 9 boxes and 2 optional diagonal braces.
DESCRIPTION
Background
SuDoku puzzles have become the world's answer to filling the time void that people used to do crossword puzzles in. The Swiss mathematician Euler was the first to seriously consider this idea of these tightlycoupled grids of numbers. This has since become a formal mathematic field known as quasigroup completion problems, a.k.a. Latin Squares.
Someone, somewhere, turned the idea for a puzzle. It become very popular in Japan where the Kanji character set is not suitable for crossword puzzles and alternative forms of entertainment to manga comics for sardinepacked bullet train commuters had to be found.
The puzzle is a 9 x 9 grid of cells filled with the digits 1 to 9, such that each digit occurs only once in a row, once in a column and once in a 3 x 3 box. Optionally, a diagonal constraint can be added such that the digits in the two diagonals across the grid are also unique. These simple logical relationships between the cells allow one to solve a grid where more than 2/3 of the cells have been blanked out. When the Xconstraint is used, one should reasonably be able to solve a puzzle where only 1/5 of the cell values are provided.
In 2006 the first of the postsecond world war baby boomers in Japan are retiring at the age of 65 and they will most certainly need to be kept busy with SuDoku puzzles. The use of SuDoku puzzles is imperative if we want to maintain political stability in the Far East ;)
Nice things to do with SuDoku puzzles
A number of puzzles can be partially overlapped so that they form and even larger puzzle. Typically, the 3 x 3 box is shared between puzzles, for example:
++++ ++++
### ###
++++ ++++
### ###
++++++++
#######
++++++++
###
++++++++
#######
++++++++
### ###
++++ ++++
### ###
++++ ++++
Anyway, who's to say that you should only create 9 x 9 grids? We could create junior 4 x 4, 6 x 6 and 8 x 8 grids. Or a joke 2 x 2 grid. Using hexadecimal notation, we can create 16 x 16 grids, and so on. Cerebral capability is the only limit here!
The twodimensional idea can be extended into a threedimensional puzzles: Using the 9 x 9 idiom, we could create a 9 x 9 x 9 cube which has embedded along each of its 3 axii 9 planar puzzles and a further 2 diagonal puzzles, making a total of (3x7)+(3x2)=33 puzzles in a cube. 3DComputer graphics are ideal for presenting and solving such a puzzle. Failing that, 33 slices of the cube in print and the gift for spacial visualtion are the only options. With such a tight coupling of constraints in a cube, each planar puzzle can be extremely sparse. Add all the planar puzzles together and you have plenty of information in 3space to solve all 33 puzzles simultaneously.
What makes a good SuDoku puzzle?
It must be solvableBruteforce computer simulations have shown that even though there are no conflicts in the existing cell values, some puzzles seem to have no apparent solution. Here is a trivial example with no solution:
+++
123
45 6
...and all the other cells blank
789
++..
:
:
It must only have one unique solution
It can happen that more than one set of values are possible to satisfy the onedigitperrow/column/box criteria and we end up with a puzzle that has multiple solutions. Such puzzles are unpopular and publically derided when they appear in the media.
In its simplest case, this situation is caused by tuples of diagonallymirrored cells in adjacent rows or columns, the rows or column which share a common box. A mirrored pair example is shown below using the values 4 and 6. If, for instance, all the cells containing the values 4 and 6 were not given in the puzzle, correct solutions to the puzzle would be to enter either a 4 or a 6 in the top left corner.
A simple setmodel indicates the values in each box which can cause problems:
+++..
+++..
+++..
457861
657841
A
 B 
128793
or
128793
Model: 


693245
493265
B
 A 
+++..
+++..
+++..
..
..
..
The same scenario can be applied in the vertical sense. It is also extendable to triplets, quads, quints, etc., modelled below:
A
C
B
A
C
D
B
A
C
D
E
B
or
or
etc..
B
A
C
D
B
C
A
E
B
C
D
A
Similar problems can also occur around corners:
B
A
A
B
A
B
Only one solution path
Puzzle creators attempt to create puzzles in such a way that there is only one logical path to get the final solution. This seems to be a bit of a holy grail but I wonder if puzzle players notice this though?
What makes a difficult SuDoku puzzle?
Difficulty is not only determined by the number of missing cells in a puzzle; factors such as the distribution of values in the remaining cells and the size of contiguous whitespace play roles too.
Relative puzzle difficulty could possibly be determined by the number of iterations that the solveprocedure has to perfrom to solve the puzzle. Such a method is only slightly similar to the way that a human brain might work when solving a puzzle:
We humble humans use techniques such as slicing and dicing, stacking, twinnig, trippling, identifying lonely numbers and other nebulous, organic processes. The automated solver on the other hand starts at a random point and recursively iterates over the entire grid until all cells have values in them, or until your computer pops its top. To prevent your computer having a mental breakdown, you can set the maximum number of recursions. This is by default 20000.
This bruteforce solving technique used here requires the same number of recursions regardless of the order in which the cells are solves and therefore the number of recursions could be used to grade the relative difficulty of a puzzle.
Empirical experience in solving and grading the difficulty of puzzles helped develop the following crude rules of thumb based on the number of given cells for the desired difficulty level:
Very EasyStandard: 45, with Xconstraint: 30
Easy
Standard: 40, with Xconstraint: 27
Medium
Standard: 35, with Xconstraint: 24
Difficult
Standard: 30, with Xconstraint: 21
Fiendish
Standard: 25, with Xconstraint: 18
Factored into this one can add a few pernicious twists, such as:
TwistedRemoving contiguous areas of space on the grid, e.g. blanking out and entire row, column or box
Wicked
Removing more cells that contain one value than of any other value.
Evil
Creating symmetrical patterns across the grid. This is only an observation and does not seem logical at all, but I suspect that Euler's mathmatical grid still has some secrets to divulge.
PACKAGE OVERVIEW
The approach is to start with the creation of a solution and then to remove cells to eventually end up with a puzzle. As a final test, we attempt to solve the puzzle to ensure that the puzzle can indeed be solved within a reasonable number of computer iterations. The number of iterations is a very crude indicator of the amount of human effort required to solve a puzzle, but correlates well.
Solving puzzles
A bruteforce method is used here to initially solve puzzles. There is nothing wrong with a bruteforce approach since computers are our slaves. Or is it the other way around? However, the hint generation code had to resort to more subtle, humanlike methods.
Creating puzzles
A number of design approaches come to mind to create a SuDoku puzzle. From a number of experiments, the one used here delivered the best and most consistent results:
Create a sparse grid with nonconflicting cell values and find a solution for the grid. These few values seed the grid. Generally, the sparser the grid, the greater the chance of a solution being found but the longer it will take to find the solution. The compromise is to populate 3 diagonal boxes with a random order of the digits 1..9, thus preventing any onedigitperrow/column/box conflicts. Assuming there is only one unique solution for such a sparsely populated grid, it is possible to create 47x10^15 complete solutions to build puzzles from. If the Xconstraint is required, the two diagonal lines are randomly populated with the digits 1 to 9 such that none of them conflict. This gives us approximately 9!^2 = 131x10^9 complete solutions to build puzzles from.
Once the solution has been found, remove a contigous box or a row or a column of cells or a combination of these groups depending on desired level of difficulty. Remove further remaining populated cells at random until the number of cells that remain for the desired level of difficulty has been achieved. Check each time before removing a cell if solutionambiguity would be introduced. If so, choose another random cell.
METHODS
Mapping functions
GetRowIndex4GridIndex()Returns the Row group index (e.g. 0..8 in a 9x9 grid) for a given grid index.
GetColIndex4GridIndex()
Returns the Column group index (e.g. 0..8 in a 9x9 grid) for a given grid index.
GetBoxIndex4GridIndex()
Returns the Box group index (e.g. 0..8 in a 9x9 grid) for a given grid index.
GetDiaIndex4GridIndex()
Returns the Diagonal group index (e.g. 0..1) for a given grid index. Returns 1 if the grid index does not lie on a diagonal.
GetGridIndex4RowIndexPos(row index, position in row)
Grid Index for row index and position in row
GetGridIndex4ColIndexPos(col index, position in col)
Grid Index for column index and position in column.
GetGridIndex4BoxIndexPos(box index, position in box)
Grid Index for box index and position in box.
GetGridIndex4DiaIndexPos(dia index, position in dia)
Grid Index for diagonal index and position in diagonal.
GetPosInRow4GridIndex(grid index)
Position 0..8 in row for the grid position
GetPosInCol4GridIndex(grid index)
Position 0..8 in col for the grid index.
GetPosInBox4GridIndex(grid index)
Position 0..8 in box for the grid index.
GetPosInDia(grid index)
Position 0..8 in box for the grid index. Returns 1 of the grid index is not on a diagonal.
GetRow4RowIndex(row index, ref to target list, ref to source grid)
Get the entire row into the target list
GetCol4ColIndex(row index, ref to target list, ref to source grid)
Get the entire column into the target list
GetBox4BoxIndex(row index, ref to target list, ref to source grid)
Get the entire box into the target list
GetDia4DiaIndex(diagonal index, ref to target list, ref to source grid)
Get the entire diagonal into the target list
SetRow4RowIndex(row index, ref to source list, ref to target grid)
Set the entire row on the target grid to the values contained in the source list.
SetCol4ColIndex(col index, ref to source list, ref to target grid)
Set the entire column on the target grid to the values contained in the source list.
SetBox4BoxIndex(box index, ref to source list, ref to target grid)
Set the entire box on the target grid to the values contained in the source list.
SetDia4DiaIndex(box index, ref to source list, ref to target grid)
Set the diagonal on the target grid to the values contained in the source list.
Construction
Solve(ref to puzzle grid)Solves the puzzle and returns the number of recursions that were required to solve the puzzle. If the maximum number of iterations has exceeded, then return 0. The puzzle in the list will be returned with the solution obtained so far.
SetMaxIterations(number of iterations)
By default, this is set to 5000. A solve process that causes more than this many iterations causes the solve process to terminate. Often, solving level 5 puzzles causes this to happen.
Puzzle Creation
CreateSparse(ref to grid)Create a sparse grid by populating row 0 and column 0 with random but nonconflicting digits 1..9. Solve for this grid and HeyPreto!  there is your solution. All you need to do is remove some cells and Kazam!  there is your puzzle.
CreateSparseX(ref to grid)
Create a sparse grid by populating each of the two diagonals with the a random but nonconflicting digits 1..9. Solve for this grid and XHeyPreto!  there is your Xsolution. All you need to do is remove some cells and XKazam!  there is your Xpuzzle.
Solution2Puzzle(level of difficulty, reference to solution list, reference to puzzle list, diagonal constraints are required)
Takes a known solution and removes cells to create a puzzle. The number of cells removed is related to the specified level of difficulty, 1=Easy...5=Very difficult.
Create(level of difficulty, ref to solution grid, ref to puzzle grid, Xconstraint)
Creates a solution and a puzzle and puts it in the lists referred to. The level of difficulty is 1=Easy...5=Diabolically difficult. Return FALSE (0) if for any reason the function failed. Else return number of iterations that were required to solve the puzzle.
Validity checking before a cell value is entered on the grid:
IsValueAlreadyInRow(value, row index, reference to grid)Checks for a duplicate value of the indexed cell in the row. This is used for after a value is entered into a cell.
IsValueAlreadyInCol(value, column index, reference to grid)
Checks for a duplicate value of the indexed cell in the column. This is used for after a value is entered into a cell.
IsValueAlreadyInBox(value, box index, reference to grid)
Checks for a duplicate value of the indexed cell in the box. This is used for after a value is entered into a cell.
IsValueAlreadyInVicinity(value, reference to grid,)
Checks for a duplicate value of the indexed cell in the box. This is used for after a value is entered into a cell.
Validity checking after a cell value is entered on the grid:
IsCellValueAlreadyInRow(cell index, reference to grid)Checks for a duplicate value of the indexed cell in the row not including itself. Returns false if not found.
IsCellValueAlreadyInCol(cell index, reference to grid)
Checks for a duplicate value of the indexed cell in the column not including itself. Returns false if not found.
IsCellValueAlreadyInBox(cell index, reference to grid)
Checks for a duplicate value of the indexed cell in the box not including itself. Returns false if not found.
IsCellValueAlreadyInDia(cell index, reference to grid)
Checks for a duplicate value of the indexed cell in the diagonal not including itself. Returns false if not found or of the cell is not on the diagonal.
IsCellValueAlreadyInVicinity(cell index, reference to grid,)
Checks for a duplicate value of the indexed cell in the row, column or box. It is sufficient not to check the diagonal. Returns false if not found.
VerifyRowSoFar(reference to grid, row index)
Verifies the row so far for duplicate values. Unfilled cells are ignored. Return false if the verification failed.
VerifyColSoFar(reference to grid, column index)
Verifies the Column so far for duplicate values. Unfilled cells are ignored. Return false if the verification failed.
VerifyBoxSoFar(reference to grid, box index)
Verifies the Box so far for duplicate values. Unfilled cells are ignored. Return false if the verification failed.
VerifyDiaSoFar(reference to grid, diagonal index)
Verifies the Diagonal so far for duplicate values. Unfilled cells are ignored. Return false if the verification failed.
VerifyGridSoFar(reference to grid)
Check entire grid so far. The puzzle does not need to be complete and we ignore cells that have no value in them. Starts by checking rows, then columns, then boxes. Stops checking when a verification has failed. Returns false if the verification failed.
Validity checking on completion
IsGridCompleted(reference to grid)Check if all the cells have values in them. The grid may not necessarily be correctly populated.
VerifyCompletedGrid(reference to grid)
Checks the completed grid, i.e. all cells must have a value in them. The VerifyGridSoFar function could also be used, but this one is faster. Returns false if the verification failed.
Determine Possible Values
The methods use the method of exclusion to determine the possble values of a given cell. Each method is scoped on a different cell group:
PossibleValues4CellRow(cell index, reference to grid, reference to accumulating list of results)Possible values for the row in which the given cell is.
PossibleValues4CellCol(cell index, reference to grid, reference to accumulating list of results)
Possible values for the column in which the given cell is.
PossibleValues4CellBox(cell index, reference to grid, reference to accumulating list of results)
Possible values for the box in which the given cell is.
PossibleValues4CellDia(cell index, reference to grid, reference to accumulating list of results)
Possible values for the diagonal in which the given cell is.
PossibleValues4CellVicinity(cell index, reference to grid, reference to accumulating list of results, include diagonal grid constraint)
Possible values for the vicinity that the given cell is in based on exclusion.
PossibleValues4Grid(reference to grid, reference to hash of results, include diagonal grid constraint)
Possible values for the entire grid. Returns a hash of cells containing a nested list of possible values.
Hint generation
GenerateHints(reference to puzzle grid, reference to list of hints in ascending order of difficulty, include diagonal grid constraint)Generates a series of hint for the current state of the puzzle, ordered from the easiest hints to the most difficult. We define an easy hint as one that has the lowest number of possible values. This is actualy only the case when a cell has only one possible value. If the user does not find the first hint useful, he should be able to select the next hint and so on. This does not mean that the hintedat cell would be easy to solve, simply that this would be the next one to attempt. This procedure needs to be repeated when a value has been entered into the a cell as this changes all dependencies.
This process is used to generate a walkthrough of a puzzle.
Warning: If incorrect values have been filled in, then the hints may be wrong.
GenerateWalkthrough(reference to puzzle grid, reference to solution grid, reference to walkthrough where each solvable cell points to the next suggested cell to be solved include diagonal grid constraint)
Generates a walkthough list consisting of cell indexes using humanbased solving techniques. The walkthough is based on finding the easiest hints on solving the previous cell.
A walkthrough shows a user where he might start the puzzle and after he has completed a value in a cell where he might feasibly attempt to do the next cell. It does so without actually telling you what the values are.
Presentation
CellIndex2GridLegend(cell index, ref to row legend, ref to column legend)Map a cell index to a humanreadable tuple in the form (Row identifyer, Column identiifier) such that the rows are numbered from 1..9 and columns are numbered A..I.
GridLegend2CellIndex(row legend,column legend)
Map a humanreadable tuple (Row identifyer 1..9, Column identiifier A..I) to a cell index. Returns 1 on failure.
SimpleTextOutput(reference to grid, show legend)
Show grid output in a simple single frame (shown on the left)
PrettyTextOutput(reference to grid, show legend)
Show grid output in a nested frame (showm on the right).
ABCDEFGHI
A
B
C
D
E
F
G
H
I
++
++++
1 243816759
++++++++++++++++++
2 615297438
1 
 4  3  8 


 5 

3 978453126
++++++++++++++++++
4 124738695
2  6  1 


 7  4 


5 356149287
++++++++++++++++++
6 789562314
3 


 4  5 


 6 
7 432975861
++++++++++++++++++
8 891624573
++++
9 567381942
++++++++++++++++++
++
4 
 2  4 


 6  9 

++++++++++++++++++
5 
 5  6 






++++++++++++++++++
6 

 9 






++++++++++++++++++
++++
++++++++++++++++++
7 
 3  2  9  7 
 8  6  1 
++++++++++++++++++
8  8 

 6  2 

 7  3 
++++++++++++++++++
9 









++++++++++++++++++
++++
SimplePDFOutput(ref. to grid, ref. to PDFpage object, ref. to PDFfont object, xposition, yposition, width, height, Xconstraint, show legend, puzzle caption)
Create a grid on the current page in the PDF object with a simple single frame in a PDFpage object. Positions are measured from bottom left corner of the page area. If it is a puzzle with an Xconstraint, then the diagonal cells are filled in lightgray. Requires the PFD::CreateSimple package.
#!/usr/bin/perl
use SuDoku;
use PDF::CreateSimple;
SuDoku::Create(1,\@solution,\@puzzle,1);
my $pdfFile = PDF::CreateSimple>new('puzzle.pdf');
SuDoku::PrettyPDFOutput(\@puzzle,$pdfFile,40,700,400,400,1,0);
SuDoku::SimplePDFOutput(\@solution,$pdfFile,40,220,150,150,1,0);
$pdfFile>closeFile;
PrettyPDFOutput(ref. to grid, ref. to PDFpage object, ref. to PDFfont object, xposition, yposition, width, height, Xconstraint, show legend, puzzle caption)
Draws a grid with bold lines around the boxes.
SimpleHTMLOutput(reference to grid, show legend, Xconstraint, puzzle caption)
Creates a simple HTML table containing the grid output.
PrettyHTMLOutput(reference to grid, show legend, include diagonal constraint, caption)
Generate grid content in nested HTML tables.
use SuDoku;
use CGI;
SuDoku::Create(2,\@solution,\@puzzle);
$q = new CGI;
print $q>start_html(title=>'SuDoku');
print $q>table(
$q>Tr(
$q>td(SuDoku::SimpleHTMLOutput(\@solution, 'Solution')),
$q>td(SuDoku::PrettyHTMLOutput(\@puzzle,'Puzzle'))
)
);
print $q>end_html;
Serialization
One grid object can be serialized per file. Unpopulated values are represented as 0's. The size of the grid is determined by the number of serialized cells
SerializeIn(path to file, ref to grid object)Loads a file containing the serialized puzzle or solution to a grid object.
SerializeOut(path to file, ref to grid object)
Saves a puzzle or solution to a file.
TODO
Turn this into a proper CPAN package.
Make this configurable for other puzzle sizes: 16 x 16 puzzles are already around using hexadecimal notation. 4x6 puzzles are useful to show children how it is done and to give them a sense of achievement. 2 x 2 puzzles have been published as a joke  we won't bother about those.
Roundthecorner diagonal mirrored tuple checking. This is even more complex than it sounds.
AUTHOR
This code was written by Gerrit Hoekstra <This email address is being protected from spambots. You need JavaScript enabled to view it.>
The bruteforce puzzlesolving method is from an idea of Edmund von der Burg, who demonstrates how to solve a SuDoku puzzle in 3 lines of Perl, albeit very obfuscated, on http://www.ecclestoad.co.uk/blog/2005/.
If you use this program to provide publicly available output, please give some form of credit in the results. If you can improve the way this script does something, fix it and send a diff or a code snippet please. Your efforts will be noted in subsequent releases (right here where you are now standing). Visit www.cpan.org or www.hoekstra.co.uk for the most recent version.
LICENSE
Copyright (c) 2005 Gerrit Hoekstra. Alle Rechte vorbehalten.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
ENVIRONMENTAL NOTICE
This work was created from 100%recycled electrons. Since you will probably be producing legions of puzzles in paper print, please add a paper recycling notice to all your output and try to use recycled material in the first instance. No animals were hurt during the production of this work, except when I forgot to feed my cats once. The cats and I are on speaking terms again.