Project Type:
Master/Diploma Thesis
Student:
Klemens Kranawetter
Mentor: Bernhard Geiger

Sudoku games are usually classified according to  rather vague  difficulty levels such as "easy", "challenging", "devilish", etc. But what is the underlying metric allowing such a classification? Clearly, while the number of clues give an indication of difficulty, there are certainly games with 21 clues which are more difficult to solve than a game with only 20. The literature therefore suggests to measure the difficulty of a game by the number of guesses a human player would have to take, or by the time he requires to solve a the game. Since the algorithms computing the difficulty depend on a variety of parameters, the estimated difficulty is almost as arbitrary as if a human would have obtained it.
We intend to define a singleletter number which does not depend on any parameters  and use information theory to get one: It is well known that 17 clues uniquely specify the solution of a Sudoku game. Therefore, the entropy of these 17 clues is equal to the entropy of the solution. Adding more clues does not increase the entropy. Adding more clues, however, increases the redundancy. We therefore assume that the difficulty of a game is somehow related to the redundancy of its clues: The higher the latter, the easier is the game. It is your task to evaluate this measure for a set of Sudoku games and verify if the assumption is true.
The project has been finished and the report can be downloaded here.