Data Science, Database, AI Startups and Domain Learning's (Video-Image-Text-Data-Database): Machine Learning Notes - Anomaly Detection

July 21, 2014

Machine Learning Notes - Anomaly Detection - Entropy Computation

This post covers what I learned from a Machine Learning session conducted by my colleague Gopi. It was a really good introduction and a strong motivation to learn more about the topic.

Concepts Discussed

Homogeneity - is my data homogeneous?
Pick the odd one out (Anomaly detection)
Entropy Computation

We worked through a wide variety of examples for finding odd sets and variations. For example, identify the anomalous row in the set below:

1,1,1,2

1,2,2,1

1,2,1,1

1,0,1,2

The last row, the one containing a zero, is the odd one out. Identifying it using entropy computation was very useful.

Entropy Formula

Formula detailed notes from link
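
Assuming the formula meant here is the standard Shannon entropy, which is what the worked rows below use, it can be written as follows. The symbols are my own notation: p_i is the fraction of values in a row equal to the i-th distinct value, and n is the number of distinct values in that row.

```latex
H(X) = -\sum_{i=1}^{n} p_i \log_2 p_i
```

For example, in the row (1,1,1,2) the value 1 has p = 3/4 and the value 2 has p = 1/4.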

For row (1,1,1,2)

= -[((3/4)*log2(3/4)) + ((1/4)*log2(1/4))]

= -[-0.311 - 0.5]

= 0.811

For row (1,2,2,1)

= -[((2/4)*log2(2/4)) + ((2/4)*log2(2/4))]

= -[-0.5 - 0.5]

= 1

For row (1,2,1,1)

= -[((3/4)*log2(3/4)) + ((1/4)*log2(1/4))]

= -[-0.311 - 0.5]

= 0.811

For row (1,0,1,2)

= -[((2/4)*log2(2/4)) + ((1/4)*log2(1/4)) + ((1/4)*log2(1/4))]

= -[-0.5 - 0.5 - 0.5]

= 1.5
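
The per-row computations above can be checked with a few lines of Python. This is only a minimal sketch, assuming the standard Shannon entropy with log base 2; the names `row_entropy`, `rows`, `entropies` and `anomaly_index` are mine and were not part of the session.

```python
from collections import Counter
from math import log2


def row_entropy(row):
    """Shannon entropy (in bits) of the values in one row."""
    counts = Counter(row)   # how often each distinct value occurs
    total = len(row)
    return -sum((c / total) * log2(c / total) for c in counts.values())


rows = [
    (1, 1, 1, 2),
    (1, 2, 2, 1),
    (1, 2, 1, 1),
    (1, 0, 1, 2),
]

entropies = [row_entropy(r) for r in rows]
for r, h in zip(rows, entropies):
    print(r, round(h, 3))

# Index of the row with the highest entropy (the candidate anomaly).
anomaly_index = max(range(len(rows)), key=lambda i: entropies[i])
print("highest-entropy row:", rows[anomaly_index])
```

The row containing the zero has three distinct values instead of two, which is why it comes out with the highest entropy.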

The last row has the highest entropy, so it is the anomaly. Excluding it leaves a more homogeneous data set.

If the data set is homogeneous after removing a particular record, then that record is the anomaly.
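
One way to try out this "remove a record and check homogeneity" idea is sketched below. It rests on my own assumption about what homogeneous means here: I pool the values of all remaining rows and treat lower entropy of the pool as more homogeneous. The helper names (`entropy`, `leave_one_out_entropies`, `candidate`) are hypothetical, and this is not necessarily how it was done in the session.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a flat sequence of values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def leave_one_out_entropies(rows):
    """For each row i, the entropy of all values pooled from the other rows."""
    result = []
    for i in range(len(rows)):
        pooled = [v for j, row in enumerate(rows) if j != i for v in row]
        result.append(entropy(pooled))
    return result


rows = [
    (1, 1, 1, 2),
    (1, 2, 2, 1),
    (1, 2, 1, 1),
    (1, 0, 1, 2),
]

remaining = leave_one_out_entropies(rows)

# Assumption: the record whose removal leaves the lowest pooled entropy
# (the most homogeneous remainder) is the anomaly candidate.
candidate = min(range(len(rows)), key=lambda i: remaining[i])
print("anomaly candidate:", rows[candidate])
```

On the four example rows this picks the row containing the zero, because removing it leaves only the values 1 and 2 in the pool.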

More Concepts Introduced

Conditional Probability
ID3 Algorithm
Measure Entropy
Decision Tree
Random Forest
Bagging Technique

Happy Learning!!!
