Applied Math & Computer Science Lab
Data Analysis, Optimization & Mathematical Modeling, Artificial Intelligence, Neural Net For Everyday Life Applications

Probabilistic Classification

In probabilistic classification, the class of a new instance is determined from the distance-weighted contributions of the training instances. For each class, the contributions of all training instances belonging to that class are summed, and the class with the highest sum is assigned to the new instance.
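The decision rule described above can be sketched in a few lines of Python. This is an illustrative sketch, not the Perl script discussed on this page: the function name `classify`, the class labels "upper" and "lower", and the Gaussian form of the weight are all assumptions made for the example.

```python
import math

def classify(point, training, sigma=0.1):
    """Return the class with the largest distance-weighted sum at `point`.

    `training` is a list of (coordinates, label) pairs. The Gaussian
    weight exp(-d^2 / (2 * sigma^2)) is an assumed choice of weighting.
    """
    scores = {}
    for coords, label in training:
        # Squared Euclidean distance between the new point and this instance.
        d2 = sum((a - b) ** 2 for a, b in zip(point, coords))
        # Add this instance's contribution to the sum for its own class.
        scores[label] = scores.get(label, 0.0) + math.exp(-d2 / (2 * sigma ** 2))
    # The class with the highest sum wins.
    return max(scores, key=scores.get)

# Four hand-picked training instances (hypothetical data).
training = [((0.2, 0.8), "upper"), ((0.3, 0.9), "upper"),
            ((0.8, 0.2), "lower"), ((0.9, 0.3), "lower")]

print(classify((0.25, 0.75), training))
print(classify((0.7, 0.1), training))
```

Nearby training instances dominate the sum because the weight falls off quickly with distance, so a new point is effectively assigned the class of its local neighborhood.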
As an example, consider the following problem:
Classify the points within a square area (side length = 1). The diagonal from (0,0) to (1,1) divides all points into 2 classes. The training instances are randomly generated, and each one is assigned a class according to the part of the square it belongs to. So for each training instance (x,y), the class depends on which side of the line y = x the point lies: points with y > x form one class, and the remaining points form the other.
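A training set for this problem could be generated as in the sketch below. The function name `make_training_set`, the fixed seed, and the label values 1 and 2 are assumptions for illustration; which side of the diagonal gets which label is arbitrary.

```python
import random

def make_training_set(n, seed=0):
    """Generate n random points in the unit square, labeled by the diagonal."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    data = []
    for _ in range(n):
        x, y = rng.random(), rng.random()
        # Assumed labeling: class 1 above the diagonal y = x, class 2 otherwise.
        label = 1 if y > x else 2
        data.append(((x, y), label))
    return data

training = make_training_set(200)
print(training[:3])
```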


The Perl code below implements this algorithm. The training data is used to classify a number of new instances. At the end, the confusion matrix is printed to the screen to show how many instances were classified correctly and how many incorrectly.
This example uses 2 classes and 2-dimensional instances. However, these are only initial parameters and can easily be changed at the beginning of the program if a different number of classes or dimensions is needed.
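For readers who want to see the whole flow in one place, here is a compact Python sketch of the same steps: generate training data, classify new instances, and print a confusion matrix. It is not a translation of the Perl script; the names `SIGMA`, `CLASSES`, `sample`, and `confusion_matrix`, the sample sizes, and the Gaussian weight are assumptions.

```python
import math
import random

SIGMA = 0.1        # kernel width; an assumed starting value
CLASSES = (1, 2)   # assumed class labels

def true_class(x, y):
    # Assumed labeling: class 1 above the diagonal y = x, class 2 otherwise.
    return 1 if y > x else 2

def sample(n, rng):
    """Draw n random points in the unit square with their true classes."""
    points = [(rng.random(), rng.random()) for _ in range(n)]
    return [(p, true_class(*p)) for p in points]

def classify(point, training, sigma):
    """Pick the class with the largest sum of Gaussian-weighted contributions."""
    scores = dict.fromkeys(CLASSES, 0.0)
    for coords, label in training:
        d2 = sum((a - b) ** 2 for a, b in zip(point, coords))
        scores[label] += math.exp(-d2 / (2 * sigma ** 2))
    return max(scores, key=scores.get)

def confusion_matrix(training, test, sigma):
    """Count test instances by (actual class, predicted class)."""
    matrix = {actual: dict.fromkeys(CLASSES, 0) for actual in CLASSES}
    for point, actual in test:
        matrix[actual][classify(point, training, sigma)] += 1
    return matrix

rng = random.Random(1)
training = sample(300, rng)
test = sample(200, rng)
matrix = confusion_matrix(training, test, SIGMA)

# Rows are actual classes, columns are predicted classes.
print("actual \\ predicted", *CLASSES)
for actual in CLASSES:
    print(actual, *(matrix[actual][p] for p in CLASSES))
```

The diagonal entries of the matrix count correctly classified instances; the off-diagonal entries count the errors, which in this problem come mostly from points close to the dividing diagonal.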
The distance-weighted contribution is calculated by the program as
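A weight of this kind is commonly written as a Gaussian kernel of the distance between the new instance and a training instance. The form below is an assumption for illustration and is not necessarily the exact expression used by the script. Here x is the new instance, x_i is the i-th training instance with class y_i, S_c is the sum for class c, and sigma controls how quickly the weight falls off with distance.

```latex
w_i(\mathbf{x}) = \exp\!\left(-\frac{\lVert \mathbf{x} - \mathbf{x}_i \rVert^2}{2\sigma^2}\right),
\qquad
S_c(\mathbf{x}) = \sum_{i \,:\, y_i = c} w_i(\mathbf{x}),
\qquad
\hat{y}(\mathbf{x}) = \arg\max_{c} S_c(\mathbf{x})
```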


Sigma is the parameter that should be adjusted until the classification error is sufficiently small.
The Perl code for the script can be downloaded from the link below. The script can be used to build classifiers for various data mining purposes.
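One simple way to adjust sigma is to try several values and compare the error rate on a held-out set. The sketch below does this in Python under the same assumptions as before (Gaussian weight, labels 1 and 2); the candidate values and the helper names `error_rate` and `sample` are hypothetical.

```python
import math
import random

def label(x, y):
    # Assumed labeling: class 1 above the diagonal y = x, class 2 otherwise.
    return 1 if y > x else 2

def sample(n, rng):
    points = [(rng.random(), rng.random()) for _ in range(n)]
    return [(p, label(*p)) for p in points]

def classify(point, training, sigma):
    """Pick the class with the largest sum of Gaussian-weighted contributions."""
    scores = {1: 0.0, 2: 0.0}
    for coords, cls in training:
        d2 = sum((a - b) ** 2 for a, b in zip(point, coords))
        scores[cls] += math.exp(-d2 / (2 * sigma ** 2))
    return max(scores, key=scores.get)

def error_rate(training, validation, sigma):
    """Fraction of validation instances that are misclassified."""
    wrong = sum(classify(p, training, sigma) != cls for p, cls in validation)
    return wrong / len(validation)

rng = random.Random(2)
training = sample(300, rng)
validation = sample(200, rng)

candidates = [0.02, 0.05, 0.1, 0.3, 1.0]  # hypothetical values to try
errors = {s: error_rate(training, validation, s) for s in candidates}

for s in candidates:
    print(f"sigma={s:<5} error={errors[s]:.3f}")

best_sigma = min(errors, key=errors.get)
print("best sigma:", best_sigma)
```

A very small sigma makes each prediction depend on only the closest few training instances, while a very large sigma lets distant instances count almost as much as near ones, so the lowest error is usually found somewhere in between.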


