Design and implement a Minimum Risk Bayes Decision Theoretic Classifier. All functions
should be developed as independent modules. For this assignment, assume that
all conditional density functions are multivariate normal and that the
parameters can be estimated by the Maximum Likelihood Method. You must estimate
the means and covariances yourself, but you can use built-in functions (or
supplied functions) to invert the matrices. I want you to implement a complete
risk-function decision using the posterior probabilities from Bayes' Theorem.
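As a starting point, the estimation and posterior steps described above might be organized as small independent functions. This is only a sketch under stated assumptions: the function names are hypothetical, NumPy is assumed to be available, and the data for one class is assumed to arrive as an array with one sample per row. The maximum likelihood covariance divides by N rather than N - 1.

```python
import numpy as np

def ml_mean(X):
    """ML estimate of the mean: the sample average of the rows of X."""
    return X.sum(axis=0) / X.shape[0]

def ml_covariance(X, mean):
    """ML estimate of the covariance: divides by N, not N - 1."""
    centered = X - mean
    return centered.T @ centered / X.shape[0]

def log_gaussian(X, mean, cov):
    """Log of the multivariate normal density at each row of X."""
    d = mean.shape[0]
    cov_inv = np.linalg.inv(cov)            # built-in inversion is permitted
    _, logdet = np.linalg.slogdet(cov)      # log of the determinant
    centered = X - mean
    # Squared Mahalanobis distance of every row to the mean.
    maha = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)
    return -0.5 * (maha + logdet + d * np.log(2.0 * np.pi))

def posteriors(X, means, covs, priors):
    """Posterior P(class j | x) for each row of X via Bayes' Theorem."""
    log_joint = np.column_stack([
        log_gaussian(X, m, c) + np.log(p)
        for m, c, p in zip(means, covs, priors)
    ])
    # Subtract the row maximum before exponentiating for numerical stability.
    log_joint -= log_joint.max(axis=1, keepdims=True)
    joint = np.exp(log_joint)
    return joint / joint.sum(axis=1, keepdims=True)
```

Working in the log domain is a design choice of this sketch, not a requirement of the assignment; it avoids underflow when the densities are very small.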
First, set the loss functions to the "minimum error classification" functions,
i.e., the 0-1 loss function, and assume equally likely prior probabilities
(although your program must be general). DO NOT change the discriminant
functions.
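One way to keep the decision rule general while starting from the 0-1 loss is to compute the conditional risk of every action from the posteriors and a loss matrix, then pick the action with the smallest risk. The sketch below assumes the convention that loss[i, j] is the cost of deciding class i when class j is true; the function names and the example posteriors are hypothetical and not taken from the dataset.

```python
import numpy as np

def conditional_risk(post, loss):
    """R(a_i | x) = sum_j loss[i, j] * P(w_j | x), for each row of post."""
    return post @ loss.T

def min_risk_decision(post, loss):
    """Index of the action with the smallest conditional risk per sample."""
    return np.argmin(conditional_risk(post, loss), axis=1)

def zero_one_loss(n_classes):
    """0-1 loss: no cost for a correct decision, unit cost for any error."""
    return np.ones((n_classes, n_classes)) - np.eye(n_classes)

# Made-up posteriors for two samples, for illustration only.
post = np.array([[0.7, 0.3], [0.4, 0.6]])
decisions = min_risk_decision(post, zero_one_loss(2))
```

With the 0-1 loss the risk of each action is one minus its posterior, so the rule reduces to choosing the class with the largest posterior. Changing the loss matrix later requires no change to these functions.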
Test your algorithm on the Two_Class_FourDGaussians.dat dataset.
Resubstitution: Using the entire dataset, compute the mean vectors and covariance matrices, and compute the resubstitution error.
Cross Validation: Then, use 10-fold cross validation for training/testing. For each pass, output:
Note: The first fold is where you reserve the first 10% for testing and use the last 90% for training. Do not reorder the data.
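The fold layout in the note can be sketched as contiguous index blocks taken in the original order. The helper name is hypothetical, and the sketch assumes the 10% is taken over the whole data array rather than per class, which the text does not specify.

```python
import numpy as np

def fold_indices(n_samples, n_folds=10):
    """Yield (train_idx, test_idx) for contiguous folds, keeping data order."""
    # Integer arithmetic keeps the fold boundaries exact.
    bounds = [(i * n_samples) // n_folds for i in range(n_folds + 1)]
    all_idx = np.arange(n_samples)
    for start, stop in zip(bounds[:-1], bounds[1:]):
        test_idx = all_idx[start:stop]
        train_idx = np.concatenate([all_idx[:start], all_idx[stop:]])
        yield train_idx, test_idx
```

For 20 samples, the first pass tests on indices 0 and 1 and trains on indices 2 through 19; the second pass tests on indices 2 and 3, and so on.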
Finally, compute the average classification rates using the composite results from 10-fold cross validation.
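One reading of "composite results" is to add the per-fold confusion matrices together and compute the rates from the pooled counts. The sketch below follows that reading; it is an assumption, as are the function names, and the two small matrices in the usage lines are made up purely to show the arithmetic.

```python
import numpy as np

def confusion_matrix(true_labels, predicted, n_classes):
    """Rows index the true class, columns index the decided class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(true_labels, predicted):
        cm[t, p] += 1
    return cm

def composite_rates(fold_matrices):
    """Pool the per-fold confusion matrices and compute classification rates."""
    total = np.sum(fold_matrices, axis=0)
    per_class = np.diag(total) / total.sum(axis=1)   # correct rate per true class
    overall = np.trace(total) / total.sum()          # overall correct rate
    return total, per_class, overall

# Hypothetical counts from two folds, for illustration only.
example_folds = [np.array([[4, 1], [0, 5]]), np.array([[5, 0], [2, 3]])]
total, per_class, overall = composite_rates(example_folds)
```

Pooling counts and averaging per-fold rates give the same answer only when all folds have the same size and class balance, so it is worth stating which one is reported.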
Features: Run your program using all features on each dataset.
Then, for the two-class Gaussian dataset, produce the summary information using
only features 1 and 2 and then only features 3 and 4. Compute and show the
corresponding ROC curves. What conclusions can you draw?
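An ROC curve can be traced by sweeping a decision threshold over a per-sample score, for example the posterior probability of the class treated as positive. The sketch below is one possible approach, not a prescribed one: the function names are hypothetical, the score is assumed to be larger for the positive class, and tied scores are treated as separate threshold steps.

```python
import numpy as np

def roc_points(scores, is_positive):
    """Return (fpr, tpr) arrays from sweeping a threshold over the scores."""
    order = np.argsort(-scores, kind="stable")      # highest score first
    pos = is_positive[order].astype(float)
    # Cumulative counts of true and false positives as the threshold drops.
    tp = np.concatenate([[0.0], np.cumsum(pos)])
    fp = np.concatenate([[0.0], np.cumsum(1.0 - pos)])
    return fp / fp[-1], tp / tp[-1]

def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))
```

If the data sits in an array with one feature per column, the two feature subsets correspond to columns 0 and 1 and columns 2 and 3 under zero-based indexing; the same functions can then be run on each subset and the curves compared.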
Finally, experiment with different values in the loss matrix (using all four
features) and discuss your results. Compare them to the results for the 0-1
loss function.
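For the two-class case, a useful way to reason about a loss matrix is the posterior threshold it implies. Comparing the two conditional risks shows that class 1 is chosen when P(w_1 | x) exceeds (l12 - l22) / ((l21 - l11) + (l12 - l22)), where l_ij is the cost of deciding class i when class j is true. The sketch below assumes that convention and assumes each error costs more than the corresponding correct decision; the function name and the second loss matrix are hypothetical.

```python
def posterior_threshold(loss):
    """Threshold t such that class 1 is chosen when P(w_1 | x) > t.

    loss[i][j] is the cost of deciding class i+1 when class j+1 is true.
    Assumes each error costs more than the matching correct decision.
    """
    extra_cost_missing_1 = loss[1][0] - loss[0][0]  # decide 2 when 1 is true
    extra_cost_missing_2 = loss[0][1] - loss[1][1]  # decide 1 when 2 is true
    return extra_cost_missing_2 / (extra_cost_missing_1 + extra_cost_missing_2)

t_zero_one = posterior_threshold([[0, 1], [1, 0]])   # 0-1 loss
t_skewed = posterior_threshold([[0, 1], [5, 0]])     # missing class 1 costs 5
```

Under the 0-1 loss the threshold is 0.5. Making a missed class 1 five times as costly lowers it to 1/6, so class 1 is chosen far more readily and its error rate should fall at the expense of class 2.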
Note: I am especially interested in your analysis.