Applied Math & Computer Science Lab
Data Analysis, Optimization & Mathematical Modeling, Artificial Intelligence, Neural Net For Everyday Life Applications


Document Classification Using Naive Bayes Classifier with Perl

    Here we implement a Naive Bayes classifier for text files in Perl. The classification problem is the following: we have a set of labeled documents (text files) in a folder. Each file name consists of the class label, an underscore ("_"), an index number, and the extension, for example: classname_0.txt, classname_1.txt, classname_2.txt, and so on. The task is to classify a new document.
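The file naming convention above can be turned into a class label with a single regular expression. The following is a minimal sketch, not the script from this page: the helper name class_label_from_filename is hypothetical, and the choice to treat everything before the last "_<index>" as the label (so labels may themselves contain underscores) is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: return the class label encoded in a training file name,
# e.g. "classname_0.txt" -> "classname". Returns undef if the name does not
# follow the "<label>_<index>.<extension>" pattern.
sub class_label_from_filename {
    my ($filename) = @_;
    # Greedy (.+) keeps everything before the last "_<digits>", so a label
    # that itself contains "_" is preserved (an assumption of this sketch).
    return $filename =~ /^(.+)_(\d+)\.[^.]+$/ ? $1 : undef;
}

for my $name (qw(classname_0.txt classname_1.txt classname_2.txt)) {
    my $label = class_label_from_filename($name);
    print "$name -> $label\n";
}
```

The helper expects a bare file name; a full path would need its directory part removed first, for example with File::Basename.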
  The Perl script opens each file, counts the frequency of each word, and builds a matrix of probabilities for each word and class. This matrix is later used to find the class label that best fits the document to be classified.
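The counting and classification steps described above can be sketched as follows. This is an illustration under stated assumptions rather than the script linked below: the documents are held in memory instead of being read from files, the function names (tokenize, train, classify) are hypothetical, and add-one (Laplace) smoothing with log probabilities is assumed, since the text does not say how zero counts are handled.

```perl
use strict;
use warnings;
use 5.010;    # for the defined-or operator (//)

# Split text into lowercase word tokens.
sub tokenize {
    my ($text) = @_;
    return grep { $_ ne '' } split /\W+/, lc $text;
}

# Build per-class word counts from a list of [label, text] pairs.
sub train {
    my (@docs) = @_;
    my %model = (word_count => {}, total_words => {}, doc_count => {},
                 vocab => {}, n_docs => 0);
    for my $doc (@docs) {
        my ($label, $text) = @$doc;
        $model{doc_count}{$label}++;
        $model{n_docs}++;
        for my $word (tokenize($text)) {
            $model{word_count}{$label}{$word}++;
            $model{total_words}{$label}++;
            $model{vocab}{$word} = 1;
        }
    }
    return \%model;
}

# Return the class label with the highest log posterior score.
sub classify {
    my ($model, $text) = @_;
    my $vocab_size = scalar keys %{ $model->{vocab} };
    my @words = grep { $model->{vocab}{$_} } tokenize($text);  # skip unseen words
    my ($best_label, $best_score);
    for my $label (sort keys %{ $model->{doc_count} }) {
        # Log prior: share of training documents that carry this label.
        my $score = log($model->{doc_count}{$label} / $model->{n_docs});
        my $total = $model->{total_words}{$label} // 0;
        for my $word (@words) {
            my $count = $model->{word_count}{$label}{$word} // 0;
            # Add-one smoothing (assumed) avoids log(0) for unseen pairs.
            $score += log(($count + 1) / ($total + $vocab_size));
        }
        if (!defined $best_score || $score > $best_score) {
            ($best_label, $best_score) = ($label, $score);
        }
    }
    return $best_label;
}

# Toy training data, invented for illustration only.
my $model = train(
    ['sports', 'the team won the football match'],
    ['sports', 'the player scored a goal in the match'],
    ['tech',   'the new laptop has a fast processor'],
    ['tech',   'the software update fixed the bug'],
);
print classify($model, 'the team scored a late goal'), "\n";
print classify($model, 'software bug in the laptop'),  "\n";
```

With this toy data the two calls print sports and tech. Summing logarithms instead of multiplying raw probabilities keeps the score from underflowing on long documents.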



Source Code

1. Naive Bayes Document Classification with Perl Script
2. Naive Bayes Classification with Perl, Data Example