Many visual settings can be customized and tested without any programming, and some other options can be customized as well. For example, you can create and train a game on a 4x4 board, so many different games can be created. You can also train it to play online. The created game can be used as a widget on your site or posted to the list on this site. Here are some examples: Tic Tac Toe Games

Learning
Learning is done by running the Perl script tt.cgi from the command prompt. After the results of learning are saved and transferred to the web server, anyone can play against the program over the web.

Learning uses a simple update formula:

V(s) = V(s) + alpha*(V(s') - V(s))

where s is the state before the move and s' is the state after the move.

The learned values are saved in the file level1.txt in the format s:V(s). The board position s is represented by a string of 9 digits: 0 for O, 1 for the RL program (X), and 2 for an empty cell.

During learning the program counts the number of wins in every 500 games and prints all the results at the end. That number is always larger at the end than at the beginning, because the program starts with no knowledge of how to play. (See the table in the Observations section.)

Playing
After learning is completed, the file with the values should be copied to the directory containing the script play.cgi. The player always moves first by clicking on a cell; the script marks that position and then makes its own move.

Observations
Increasing the number of training games improves the learning results, but only up to a limit. After 40000 games, the number of wins per 500 games during learning stays almost the same. However, the performance of such an agent against a real opponent is poor. The demo uses an agent trained on 400000 games, and its performance is better. Learning can also be improved by changing alpha. Here is an example of learning results for alpha=0.02, alpha=0.4 and alpha=0.07-0.06*n/N.
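To make the update formula, the board encoding and the s:V(s) file format more concrete, here is a minimal sketch in Python. It is not the original Perl script tt.cgi. Several details are assumptions made for illustration only: the function names, the starting value of 0.5 for positions not yet seen, one s:V(s) entry per line in the file, and the reading of n as the number of games played so far and N as the total number of training games. The update is read in the usual temporal-difference way, moving the value of the earlier position toward the value of the position that followed it.

```python
# Board encoding taken from the text: a string of 9 digits where
# '0' = O, '1' = the RL program (X), '2' = empty cell.
EMPTY_BOARD = "2" * 9

# Assumption: positions that were never seen start at 0.5.
DEFAULT_VALUE = 0.5


def td_update(values, s, s_next, alpha):
    """Apply V(s) = V(s) + alpha * (V(s') - V(s)) in place; return the new V(s).

    s is the earlier position and s_next the position that followed it.
    """
    v_s = values.get(s, DEFAULT_VALUE)
    v_next = values.get(s_next, DEFAULT_VALUE)
    values[s] = v_s + alpha * (v_next - v_s)
    return values[s]


def decaying_alpha(n, N):
    """Schedule from the Observations section: alpha = 0.07 - 0.06 * n / N.

    Assumption: n is the number of games played so far, N the total.
    """
    return 0.07 - 0.06 * n / N


def dump_values(values):
    """Serialize the table as 's:V(s)', one entry per line (layout assumed)."""
    return "\n".join(f"{s}:{v}" for s, v in sorted(values.items()))


def parse_values(text):
    """Parse text produced by dump_values back into a dictionary."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if line:
            s, v = line.split(":", 1)
            values[s] = float(v)
    return values


# Hypothetical example: X completes the top row, and that position is worth 1.0.
values = {"111002202": 1.0}
td_update(values, "112002202", "111002202", alpha=0.4)
print(dump_values(values))
```

With a fixed alpha every game shifts the values by the same fraction. The decaying schedule starts at 0.07 and ends at 0.01, so under the assumed meaning of n and N later games change the stored values less than early ones.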
Perl Source Code and Related Links for Tic Tac Toe Game
Data file (learned values from training to play)
Perl Source Code for Tic-Tac-Toe Game
Working Demo: Tic-Tac-Toe Game
Get Tic-Tac-Toe Widget Now
Comments, suggestions

Links
Applied Math and Computer Science Lab