After downloading, unzip the file and copy the unzipped files to any folder on a Windows system.
Converting weblog data to clickstream format with Clickstream Miner
The first step is to convert the weblog to clickstream format.
From the top menu, choose Clickstream Analysis -> Clickstream Analysis to open the input window. The input form has the following fields:
Weblog File
This is the web server log file that will be converted to clickstream format.
Exclude Robot File
This is a file with the IP addresses of robots that should be excluded from the analysis.
This field is optional. If it is left blank, visits from all IP addresses will be counted.
A visit by a web crawler is easy to detect in the output file: one of the visited pages will be 'robots.txt'.
Output File
This is the file where Clickstream Miner writes its output: the weblog in clickstream format.
Do not include robots (checkbox)
Most robots request the robots.txt file to check whether they are permitted to crawl the website. This request is recorded in the weblog file and can be used to filter out non-human visits. If the checkbox is checked, such visits are skipped and not included in the output file.
In addition to the clickstream text file, Clickstream Miner also creates an HTML file that presents the same clickstream data as a table. The HTML file is easier to read.
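The robot filtering described above can be illustrated with a short Python sketch. This is not Clickstream Miner's own code: the function name group_pages_by_ip, its parameters, and the assumption that the weblog uses the Common Log Format are all hypothetical and chosen only for illustration. It groups requested pages by IP, drops IPs listed in an exclusion list, and optionally drops every IP that requested robots.txt.

```python
import re

# Assumption: the weblog is in Common Log Format, e.g.
# 1.2.3.4 - - [10/Oct/2020:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "\S+ (\S+)[^"]*"')


def group_pages_by_ip(log_lines, excluded_ips=(), skip_robots=False):
    """Group requested pages by client IP (hypothetical helper).

    excluded_ips plays the role of the Exclude Robot File;
    skip_robots plays the role of the "Do not include robots" checkbox.
    """
    excluded = set(excluded_ips)
    visits = {}
    for line in log_lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # ignore lines that do not look like log entries
        ip, page = match.groups()
        if ip in excluded:
            continue  # IP is on the exclusion list
        visits.setdefault(ip, []).append(page)
    if skip_robots:
        # Drop every IP that requested robots.txt at least once.
        visits = {
            ip: pages
            for ip, pages in visits.items()
            if not any(p.split("?")[0].endswith("/robots.txt") for p in pages)
        }
    return visits
```

Calling group_pages_by_ip(lines, skip_robots=True) on the lines of a log file would return only the IPs that never asked for robots.txt. Note that this sketch treats all requests from one IP as a single group; how Clickstream Miner itself splits requests into visits is not covered here.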
User Analysis
The Clickstream Analysis window also has a User Analysis button. Clicking this button opens the User Analysis window, which has the following fields:
page keyword data file - a manually created text file. Each line of this file should contain a page URL and the keywords that match that page. Use a tab to separate keywords. The page URL should appear in the same format as in the weblog.
Here is an example of the file format:
/cgi-bin/res/Naive_bayes_classifier.cgi classification naive bayes
/cgi-bin/clickstream/clickstream_analysis.cgi Clickstream_Miner
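As a rough illustration of how such a file could be read, here is a minimal Python sketch. The function name parse_page_keywords is hypothetical, and the sketch assumes that a tab also separates the page URL from the first keyword; it is not taken from Clickstream Miner itself.

```python
def parse_page_keywords(lines):
    """Map each page URL to its list of keywords (hypothetical helper).

    Assumption: every field on a line, including the leading URL,
    is separated from the next by a tab character.
    """
    page_keywords = {}
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue  # skip blank lines
        # Split on tabs and drop empty fields caused by doubled tabs.
        fields = [f.strip() for f in line.split("\t") if f.strip()]
        url, keywords = fields[0], fields[1:]
        page_keywords.setdefault(url, []).extend(keywords)
    return page_keywords
```

Passing an open file object, for example parse_page_keywords(open("keywords.txt")) with a file name of your choice, yields a dictionary keyed by page URL.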
clickstream weblog file - the text file created by converting the weblog data file to clickstream format.
After you click Run, the program creates the user_analysis.txt file.
This feature lets you mine user interests and build user profiles. For example, during one visit a user might browse Perl books, articles, or other content, and on another visit (or even during the same visit) the same user might look for something about math. User analysis is therefore a good starting point for user modeling.
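The idea of a keyword-based user profile can be sketched in Python as follows. This is only an illustration under stated assumptions: the function build_user_profiles is hypothetical, visits are passed in as (user, pages) pairs already held in memory, and the layout of user_analysis.txt is not reproduced here.

```python
from collections import Counter


def build_user_profiles(visits, page_keywords):
    """Count keywords per user across all visits (hypothetical helper).

    visits: iterable of (user, pages) pairs, one pair per visit.
    page_keywords: dict mapping a page URL to its list of keywords.
    """
    profiles = {}
    for user, pages in visits:
        counter = profiles.setdefault(user, Counter())
        for page in pages:
            # Pages without a keyword entry contribute nothing.
            counter.update(page_keywords.get(page, []))
    return profiles
```

A user who views two Perl pages in one visit and one math page in another would end up with a profile that counts "perl" twice and "math" once, which mirrors the example in the paragraph above.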