Applied Math & Computer Science Lab
Data Analysis, Optimization & Mathematical Modeling, Artificial Intelligence, Neural Net For Everyday Life Applications

Collaborative Filtering for Website Traffic Data - Perl Source Code

While browsing Amazon.com, I always wondered how they implement the feature "Customers Who Bought Items in Your Recent History Also Bought ...". After reading the book [1], I decided to apply collaborative filtering to website traffic data. Weblog data can answer questions such as: which other pages did the users who visited page A also visit? Or: which other links did the users who clicked on link A also click?
So I wrote a Perl script. The program finds similar pages for every page in the weblog. In our context, similar pages are pages that were visited by similar users. The script opens the weblog file, goes through it line by line, extracts the IP address and URL from each line, and keeps the data in memory. The final output of this step is a data table with one row for each page, where each cell in the row holds the number of visits made from a specific IP address. This is a two-dimensional array data[url][ip], and it is the input for the next step - collaborative filtering.
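The table-building step can be sketched roughly as follows. This is an illustration in Python (the language used for the examples in [1]), not the Perl script itself. It assumes the weblog is in Common Log Format; the regular expression, the function name build_visit_table and the sample lines are all made up for the example.

```python
import re
from collections import defaultdict

# Assumption: Common Log Format, e.g.
# ip - - [timestamp] "GET /page.html HTTP/1.1" status size
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:GET|POST|HEAD) (\S+)[^"]*"')


def build_visit_table(lines):
    """Return data[url][ip] = number of visits to url made from ip."""
    data = defaultdict(lambda: defaultdict(int))
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        ip, url = match.groups()
        data[url][ip] += 1
    return data


# Hypothetical sample lines standing in for a real weblog file.
sample_log = [
    '10.0.0.1 - - [01/Jan/2008:10:00:00 +0000] "GET /a.html HTTP/1.1" 200 512',
    '10.0.0.1 - - [01/Jan/2008:10:01:00 +0000] "GET /b.html HTTP/1.1" 200 734',
    '10.0.0.2 - - [01/Jan/2008:10:02:00 +0000] "GET /a.html HTTP/1.1" 200 512',
    '10.0.0.1 - - [01/Jan/2008:10:03:00 +0000] "GET /a.html HTTP/1.1" 200 512',
    'not a log line',
]
table = build_visit_table(sample_log)
```

A nested dictionary plays the role of the two-dimensional array here: only the (url, ip) pairs that actually occur in the log take up memory, which matters because most visitors see only a few pages.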
In this step, for each page, we use the data array to calculate a similarity measure between that page and every other page. At the end, the script prints the results to a file.
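The text above does not say which similarity measure is used, so the sketch below assumes cosine similarity between the per-IP visit counts of two pages; other measures, such as the Pearson correlation discussed in [1], would slot into the same place. It is written in Python for illustration only, and the names cosine_similarity, similar_pages and the sample data are hypothetical.

```python
from math import sqrt


def cosine_similarity(visits_a, visits_b):
    """Cosine of the angle between two {ip: visit_count} vectors."""
    shared_ips = set(visits_a) & set(visits_b)
    dot = sum(visits_a[ip] * visits_b[ip] for ip in shared_ips)
    norm_a = sqrt(sum(count * count for count in visits_a.values()))
    norm_b = sqrt(sum(count * count for count in visits_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_pages(data, top_n=3):
    """For every page, rank all other pages by similarity, best first."""
    result = {}
    for url, visits in data.items():
        scores = [
            (cosine_similarity(visits, other_visits), other_url)
            for other_url, other_visits in data.items()
            if other_url != url
        ]
        # Highest score first; ties are broken alphabetically by URL.
        scores.sort(key=lambda pair: (-pair[0], pair[1]))
        result[url] = scores[:top_n]
    return result


# Hypothetical data[url][ip] table, as produced by the previous step.
data = {
    "/a.html": {"10.0.0.1": 2, "10.0.0.2": 1},
    "/b.html": {"10.0.0.1": 1, "10.0.0.2": 1},
    "/c.html": {"10.0.0.3": 4},
}
ranked = similar_pages(data)
```

In this sample, /a.html and /b.html share both visitors and so score close to 1, while /c.html shares no visitors with either and scores 0. Comparing every page with every other page is quadratic in the number of pages, which is fine for a small site but may need pruning for a large one.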
With this feature on a website, users can find interesting links, pages, or web resources more easily and quickly. And, as the steps above show, it is easy to implement in Perl.

References

1. Toby Segaran, Programming Collective Intelligence, 2007.
2. Collaborative filtering: Perl source code.




Copyright © 2008 by LZ.