Data Science, Database, AI Startups and Domain Learning's (Video-Image-Text-Data-Database): Day #87

November 10, 2017

Day #87 - Classification Metrics

Accuracy (the essential classification metric); its weighted variant leads to weighted kappa
Logarithmic loss (depends on soft predictions, i.e. predicted probabilities)
Area under the ROC (receiver operating characteristic) curve (depends only on the ordering of objects; tries every threshold for converting soft predictions to hard labels)
Cohen's kappa (accuracy normalised against a baseline, similar in spirit to R squared)

Notations
N - number of objects
L - number of classes
y - ground truth
ŷ - predictions
[a = b] - indicator function: 1 if a = b, 0 otherwise

Soft labels (soft predictions) are the classifier's scores, e.g. the predicted probability of each class for an object
Hard labels (hard predictions) are a function of the soft labels: argmax over i of f_i(x) for multiclass (the class with the largest score becomes the predicted label), or [f(x) > b] for binary classification, where b is the threshold
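A minimal sketch of turning soft predictions into hard labels, using plain Python. The function names and the default threshold of 0.5 are illustrative assumptions, not something taken from the notes above.

```python
def hard_label_multiclass(scores):
    """Return the index of the largest score (argmax over classes)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def hard_label_binary(score, threshold=0.5):
    """Return 1 if the score is above the threshold b, else 0: [f(x) > b]."""
    return int(score > threshold)


soft_multiclass = [0.2, 0.7, 0.1]   # one score per class
print(hard_label_multiclass(soft_multiclass))
print(hard_label_binary(0.35, threshold=0.5))
```

Changing the threshold changes the hard labels while the soft predictions stay the same, which is why metrics built on hard labels depend on that choice.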

Accuracy Score

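Most frequently used measure of classifier quality
Higher is better
Needs hard predictions
Fraction of correctly classified objects: Accuracy = (1/N) * sum over i of [ŷ_i = y_i], where ŷ_i is the prediction and y_i the ground truth for object i
Hard predictions are taken as the argmax of the soft predictions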
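The accuracy definition can be sketched in a few lines of plain Python. The function name and the sample labels are made up for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of objects whose hard prediction equals the ground truth."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


y_true = [0, 1, 1, 0]
y_pred = [0, 1, 0, 0]   # hard labels, e.g. argmax of soft predictions
print(accuracy(y_true, y_pred))
```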

Logloss

Works with soft predictions
Pushes the classifier to output posterior probabilities
Heavily penalises confident wrong answers
Best constant prediction: the frequency of each class
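A small sketch of binary logloss, assuming labels in {0, 1} and predicted probabilities for class 1. The clipping constant eps is a common practical guard against log(0) and is an assumption here, as are the function name and the toy data. The last lines compare two constant predictions to illustrate the point about class frequencies.

```python
import math


def binary_log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood of the true labels under p_pred."""
    if len(y_true) != len(p_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)          # clip to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)


y = [1, 0, 0, 0]
class_frequency = sum(y) / len(y)              # share of class 1
loss_at_frequency = binary_log_loss(y, [class_frequency] * len(y))
loss_at_half = binary_log_loss(y, [0.5] * len(y))
print(loss_at_frequency, loss_at_half)         # the frequency constant scores lower
```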

Area Under Curve

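For a given threshold, objects scored above it are labelled positive and those below it negative
The metric tries all possible thresholds and aggregates the results
Depends only on the ordering of the objects, not on the absolute values of the scores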
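The "try every threshold" idea can be sketched as follows: each distinct score is used as a threshold, the true positive rate and false positive rate are recorded, and the area under the resulting curve is summed with the trapezoid rule. Function names and data are illustrative assumptions, and the sketch labels an object positive when its score is greater than or equal to the threshold.

```python
def roc_points(y_true, scores):
    """Return (false positive rate, true positive rate) for every threshold."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative object")
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoid_area(points):
    """Area under a piecewise-linear curve given as sorted (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(trapezoid_area(roc_points(y_true, scores)))
```

Rescaling the scores in any way that keeps their order leaves the result unchanged, since only the ranking of objects matters.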

AUC - ROC

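Compute the true positive rate and false positive rate at each threshold
Maximum AUC value is 1 (a random ordering gives about 0.5)
Equals the fraction of correctly ordered pairs, where a pair is one positive and one negative object

AUC = Number of correctly ordered pairs / Total number of pairs
= 1 - (Number of incorrectly ordered pairs / Total number of pairs)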
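The pair-counting view of AUC can be sketched directly: take every (positive, negative) pair and count how often the positive object gets the higher score. Counting ties as half a correct pair is a common convention and an assumption here, as are the function name and the toy data.

```python
def auc_by_pairs(y_true, scores):
    """Fraction of (positive, negative) pairs ranked in the right order."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative object")
    correct = 0.0
    for s_pos in pos:
        for s_neg in neg:
            if s_pos > s_neg:
                correct += 1.0      # correctly ordered pair
            elif s_pos == s_neg:
                correct += 0.5      # tie, counted as half (assumed convention)
    return correct / (len(pos) * len(neg))


y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc_by_pairs(y_true, scores))   # 3 of the 4 pairs are correctly ordered
```

The double loop is quadratic in the number of objects, so this is meant for understanding the definition rather than for large datasets.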

Cohen's Kappa

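Score = 1 - ((1 - accuracy) / (1 - baseline))
The baseline differs for each dataset
Similar to R squared, which normalises MSE by the error of a constant baseline
Here the baseline is the accuracy expected from randomly permuted predictions
Error = 1 - accuracy
Weighted error = sum of the element-wise product of the confusion matrix and the weight matrix, normalised by the number of objects
Weighted kappa = 1 - (weighted error / weighted baseline error)
Useful for medical applications, where the weight matrix can penalise some confusions more than others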
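A sketch of both kappa variants in plain Python, under a few assumptions: class labels are integers 0..n_classes-1, the baseline is the chance agreement computed from the class frequencies of the ground truth and of the predictions, and the weight matrix is quadratic, (i - j)^2. All function names are illustrative.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """1 - (1 - accuracy) / (1 - baseline), baseline = chance agreement."""
    n = len(y_true)
    acc = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    true_freq, pred_freq = Counter(y_true), Counter(y_pred)
    baseline = sum(true_freq[c] * pred_freq[c] for c in true_freq) / (n * n)
    if baseline == 1:
        raise ValueError("kappa is undefined when the baseline accuracy is 1")
    return 1 - (1 - acc) / (1 - baseline)


def quadratic_weight(i, j):
    """Penalty for predicting class j when the truth is class i (assumed)."""
    return (i - j) ** 2


def weighted_kappa(y_true, y_pred, n_classes, weight=quadratic_weight):
    """1 - weighted error / weighted baseline error."""
    n = len(y_true)
    confusion = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        confusion[t][p] += 1
    true_tot = [sum(row) for row in confusion]
    pred_tot = [sum(confusion[i][j] for i in range(n_classes))
                for j in range(n_classes)]
    classes = range(n_classes)
    # Element-wise product of confusion matrix and weight matrix, summed.
    weighted_error = sum(confusion[i][j] * weight(i, j)
                         for i in classes for j in classes) / n
    # Same sum for the confusion matrix expected under chance agreement.
    baseline_error = sum(true_tot[i] * pred_tot[j] * weight(i, j)
                         for i in classes for j in classes) / (n * n)
    if baseline_error == 0:
        raise ValueError("weighted kappa is undefined for zero baseline error")
    return 1 - weighted_error / baseline_error


print(cohen_kappa([0, 0, 1, 1], [0, 0, 1, 0]))
print(weighted_kappa([0, 1, 2, 2], [0, 1, 2, 0], n_classes=3))
```

With a 0/1 weight (0 on the diagonal, 1 elsewhere) the weighted version reduces to plain Cohen's kappa, which is a quick sanity check for any implementation.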

Ref - Link

Happy Learning and Coding!!!
