ECE 8820 Pattern Recognition

Computer Project #1

Fall, 2008

 

Due October 14

 

Design and implement a minimum-risk Bayes decision-theoretic classifier. All functions should be developed as independent modules. For this assignment, assume that all class-conditional density functions are multivariate normal and that their parameters can be estimated by the maximum likelihood method. You must estimate the means and covariances yourself, but you may use built-in (or supplied) functions to invert the matrices. I want you to implement a complete risk-function decision rule using the posterior probabilities from Bayes' theorem.
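As an illustration only, the maximum likelihood estimates and the log of the multivariate normal density might be sketched as below. The function names are hypothetical, not part of any supplied code. The covariance is divided by N (the maximum likelihood form), not N - 1, and the inverse and log-determinant are assumed to come from whatever built-in or supplied routine you use.

```python
import math

def ml_estimate(samples):
    """Maximum likelihood mean vector and covariance matrix.

    samples: list of equal-length feature vectors for ONE class.
    The covariance divides by n (ML estimate), not n - 1.
    """
    n = len(samples)
    d = len(samples[0])
    mean = [sum(x[j] for x in samples) / n for j in range(d)]
    cov = [[sum((x[i] - mean[i]) * (x[j] - mean[j]) for x in samples) / n
            for j in range(d)]
           for i in range(d)]
    return mean, cov

def log_gaussian(x, mean, cov_inv, log_det):
    """Log of the multivariate normal density at x.

    cov_inv and log_det (inverse and log-determinant of the covariance)
    are assumed to be computed elsewhere by a built-in or supplied routine.
    """
    d = len(x)
    diff = [xi - mi for xi, mi in zip(x, mean)]
    # Squared Mahalanobis distance (x - mean)^T cov_inv (x - mean)
    maha = sum(diff[i] * cov_inv[i][j] * diff[j]
               for i in range(d) for j in range(d))
    return -0.5 * (d * math.log(2.0 * math.pi) + log_det + maha)
```

Working with the log density avoids underflow; exponentiate (or subtract a common constant first) before forming posteriors.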

 

First, set the loss functions to the "minimum error classification" functions, i.e., the 0-1 loss function, and assume equally likely prior probabilities (although your program must be general). DO NOT change the discriminant functions.
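A minimal sketch of the general decision step, assuming the convention that loss[i][j] is the cost of deciding class i when the true class is j (the names below are hypothetical). With the 0-1 loss matrix the same code reduces to picking the largest posterior, so nothing in the decision logic needs to change when the losses do.

```python
def posteriors(likelihoods, priors):
    """Bayes' theorem: P(class j | x) from p(x | class j) and P(class j)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)
    return [j / evidence for j in joint]

def zero_one_loss(num_classes):
    """0-1 loss: no cost for a correct decision, unit cost for any error."""
    return [[0 if i == j else 1 for j in range(num_classes)]
            for i in range(num_classes)]

def min_risk_decision(post, loss):
    """Return (decision, risks), where risks[i] = sum_j loss[i][j] * post[j].

    Assumed convention: loss[i][j] is the cost of deciding class i
    when the true class is j.
    """
    risks = [sum(loss[i][j] * post[j] for j in range(len(post)))
             for i in range(len(loss))]
    decision = min(range(len(risks)), key=lambda i: risks[i])
    return decision, risks
```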

 

Test your algorithm on the Two_Class_FourDGaussians.dat dataset.

 

Resubstitution: Using the entire dataset, compute the mean vectors and covariance matrices, and compute the resubstitution error.
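The resubstitution estimate simply trains and tests on the same data. A sketch, assuming hypothetical train and classify callables that wrap your own estimation and decision modules:

```python
def resubstitution_error(samples, labels, train, classify):
    """Percent of samples misclassified when the SAME data trains and tests.

    train(samples, labels) -> model and classify(model, x) -> label are
    placeholders for your own modules; they are assumptions of this sketch.
    """
    model = train(samples, labels)
    wrong = sum(1 for x, y in zip(samples, labels)
                if classify(model, x) != y)
    return 100.0 * wrong / len(samples)
```

Because the test vectors were also used for training, this estimate is generally optimistic compared with cross validation.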

 

Cross Validation: Then, use 10-fold cross validation for training/testing. For each pass, output:

 

  1. The mean vectors and covariance matrices
  2. The confusion matrix (in raw numbers)
  3. The error estimates (in percentages)
  4. The classification of each test vector
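For items 2 and 3, one possible layout (an assumption, not a requirement) puts the true class on the rows and the assigned class on the columns, with classes numbered from 0:

```python
def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Raw counts: matrix[t][p] = number of class-t vectors assigned to class p."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_labels, predicted_labels):
        matrix[t][p] += 1
    return matrix

def error_percent(matrix):
    """Overall error rate in percent: off-diagonal counts over all counts."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 100.0 * (total - correct) / total
```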

 

Note: The first fold is where you reserve the first 10% for testing and use the last 90% for training. Do not reorder the data.
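The fold boundaries can be sketched as contiguous index blocks taken in file order. This sketch assumes the 10% is taken from the file as a whole; if you split within each class instead, apply the same function to each class's indices. The function name is hypothetical.

```python
def fold_indices(n, k=10):
    """Return a list of (train_indices, test_indices) pairs, one per fold.

    Fold 0 tests on the first block of the data and trains on the rest;
    the data order is never changed.
    """
    # Integer arithmetic keeps the blocks contiguous even when k does not divide n.
    bounds = [i * n // k for i in range(k + 1)]
    folds = []
    for f in range(k):
        test = list(range(bounds[f], bounds[f + 1]))
        train = list(range(0, bounds[f])) + list(range(bounds[f + 1], n))
        folds.append((train, test))
    return folds
```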

Finally, compute the average classification rates using the composite results from 10-fold cross validation.
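One way to form the composite results, sketched here under the assumption that each fold produced a confusion matrix with true classes on the rows, is to add the ten matrices and compute the rates from the sum:

```python
def composite_rates(fold_matrices):
    """Sum per-fold confusion matrices and return (total, overall, per_class).

    overall and per_class are correct-classification rates in percent.
    Assumes rows are true classes and every class appears at least once.
    """
    c = len(fold_matrices[0])
    total = [[sum(m[i][j] for m in fold_matrices) for j in range(c)]
             for i in range(c)]
    n = sum(sum(row) for row in total)
    overall = 100.0 * sum(total[i][i] for i in range(c)) / n
    per_class = [100.0 * total[i][i] / sum(total[i]) for i in range(c)]
    return total, overall, per_class
```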

Features: Run your program using all features on each dataset.

Then, for the two-class Gaussian dataset, produce the summary information using only features 1 and 2, and then using only features 3 and 4. Compute and show the corresponding ROC curves. What conclusions can you draw?
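An ROC curve can be traced by sweeping a threshold over a per-vector score. The sketch below assumes the score is something like the posterior probability of class 1 (or a log-likelihood ratio), and that class 1 is treated as the "positive" class; both choices are assumptions, and the function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return ROC points (false positive rate, true positive rate).

    scores: larger means more evidence for class 1 (assumed positive class).
    labels: true classes, 0 or 1. Tied scores are moved in one step.
    """
    pos = sum(1 for y in labels if y == 1)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Lower the threshold past every vector sharing this score.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points

def area_under_curve(points):
    """Trapezoidal area under the ROC points."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))
```

Plotting the points for each feature pair on the same axes makes the comparison between the two feature subsets direct.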

 

Finally, experiment with different values in the loss matrix (using all four features) and discuss your results. Compare them to the results for the 0-1 loss function.
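For two classes, changing the loss matrix is equivalent to moving the threshold on the likelihood ratio. The sketch below assumes loss[i][j] is the cost of deciding class i when the true class is j, and that an error costs more than a correct decision; the function name is hypothetical.

```python
def likelihood_ratio_threshold(loss, priors):
    """Threshold T such that class 0 is chosen when p(x|0) / p(x|1) > T.

    Follows from comparing the two conditional risks:
      T = (loss[0][1] - loss[1][1]) * P(1) / ((loss[1][0] - loss[0][0]) * P(0))
    Assumes loss[1][0] > loss[0][0], so the denominator is positive.
    """
    numerator = (loss[0][1] - loss[1][1]) * priors[1]
    denominator = (loss[1][0] - loss[0][0]) * priors[0]
    return numerator / denominator
```

With the 0-1 loss and equal priors the threshold is 1; raising loss[0][1] raises the threshold, so the classifier needs stronger evidence before it decides class 0.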

 

 

Your report should contain sections on:

  1. The technical description of all techniques utilized
  2. The design of the algorithms (pseudo-code, flowcharts, or some other structured descriptive means)
  3. The results of the algorithms
  4. An analysis of the results. For example, did you obtain what you expected? Were there any surprises? What conclusions can you draw from the experiments?
  5. Well-documented, structured, modular program listings

 

Note: I am especially interested in your analysis.