# Category: Machine Learning

## K Means Clustering

Clustering is one of the most common exploratory data analysis techniques used to get an intuition about the structure of the data. K-means clustering is one of the simplest and most popular unsupervised machine learning algorithms. Given a dataset, K-means clustering identifies which data points belong to each…
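As a rough illustration of the idea, here is a minimal pure-Python sketch of the standard K-means loop (Lloyd's algorithm): assign each point to its nearest centroid, then move each centroid to the mean of its points. The function name `kmeans`, its parameters, and the toy data are assumptions made for this sketch, not code from the post.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Illustrative K-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Assumption: start from k distinct data points chosen at random.
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
            )
            for point in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels


# Toy data (made up): two well-separated groups of three points each.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(data, k=2)
```

On this toy data the first three points should end up sharing one label and the last three the other. Which group gets label 0 depends on the random initialization.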

## Support Vector Machines

A support vector machine (SVM) is a supervised machine learning model that can be used for both classification and regression. SVMs have been used extensively for complex classification problems such as image recognition and voice detection. The SVM algorithm outputs an optimal hyperplane that best separates the classes. The hyperplane is a boundary that…
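To make the role of the hyperplane concrete, the sketch below classifies points by the sign of `w · x + b`. The weights `w` and offset `b` are hand-picked, hypothetical values for illustration. They are not the result of training an SVM, which would choose them to maximize the margin.

```python
def decision_value(w, b, x):
    """Signed score w . x + b; its sign says which side of the hyperplane x is on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1


# Hypothetical hyperplane x1 + x2 - 3 = 0 in two dimensions (not learned from data).
w, b = (1.0, 1.0), -3.0

label_far = predict(w, b, (3.0, 3.0))   # 3 + 3 - 3 = 3, so +1
label_near = predict(w, b, (0.0, 0.0))  # 0 + 0 - 3 = -3, so -1
```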

## Random Forest

Random Forest is a supervised classification algorithm. It is a tree-based algorithm composed of several decision trees. The difference between the Random Forest algorithm and the decision tree algorithm is that in Random Forest, the processes of finding the root node and splitting the feature nodes run randomly…

## Gini Index

In a decision tree, the major challenge is identifying which attribute to use at the root node and at each subsequent level. This process is known as attribute selection. There are two popular attribute selection measures: Gini Index and Information Gain. Gini Index calculates the probability of a specific feature that is…
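For reference, the Gini impurity of a set of labels is commonly defined as one minus the sum of squared class proportions. The sketch below computes it, along with the size-weighted impurity of a candidate split. The helper names `gini_impurity` and `weighted_gini` are assumptions for this example.

```python
from collections import Counter


def gini_impurity(labels):
    """1 - sum(p_c ** 2) over the classes c present in labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_gini(groups):
    """Impurity of a split: each group's Gini weighted by its share of the samples."""
    total = sum(len(group) for group in groups)
    return sum(len(group) / total * gini_impurity(group) for group in groups)


mixed = gini_impurity(["a", "a", "b", "b"])       # 1 - (0.25 + 0.25) = 0.5
pure = gini_impurity(["a", "a", "a", "a"])        # 1 - 1 = 0.0
split = weighted_gini([["a", "a"], ["b", "b"]])   # both halves pure, so 0.0
```

When comparing candidate attributes, the split with the lowest weighted impurity would be preferred.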

## Decision Trees

The decision tree algorithm is one of the most widely used algorithms in machine learning. It is a supervised learning algorithm. A decision tree uses a tree-like model to make predictions. It resembles an upside-down tree. A decision tree builds classification or regression models in the form of…

## Ridge and Lasso Regression

Regularization is a process used to create an optimally complex model. A model should be as simple as possible. Ridge and Lasso regression are simple techniques that reduce model complexity and prevent the over-fitting that may result from simple linear regression. Linear regression is the simplest supervised machine learning…
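To show how the two penalties shrink a coefficient, here is a small sketch for the simplest case: one feature and no intercept, where both problems have a closed-form solution. The objectives in the comments, the function names, and the toy data are assumptions for illustration. Real problems with many features need a numerical solver.

```python
def ridge_weight(x, y, lam):
    """Minimize sum((y - w*x)**2) + lam * w**2 for a single weight w."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)


def lasso_weight(x, y, lam):
    """Minimize 0.5 * sum((y - w*x)**2) + lam * abs(w) via soft-thresholding."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    if sxy > lam:
        return (sxy - lam) / sxx
    if sxy < -lam:
        return (sxy + lam) / sxx
    return 0.0  # a large enough penalty sets the weight exactly to zero


# Toy data (made up) lying exactly on y = 2x.
x, y = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]

w_ols = ridge_weight(x, y, lam=0.0)     # no penalty: 28 / 14 = 2.0
w_ridge = ridge_weight(x, y, lam=14.0)  # 28 / (14 + 14) = 1.0
w_lasso = lasso_weight(x, y, lam=14.0)  # (28 - 14) / 14 = 1.0
w_zero = lasso_weight(x, y, lam=28.0)   # penalty reaches sum(x*y), so 0.0
```

The example reflects the usual contrast: ridge shrinks the weight toward zero without reaching it, while lasso can set it exactly to zero.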

## Naive Bayes vs Logistic Regression

Naive Bayes is a linear classifier that uses Bayes' theorem and a strong independence assumption among features. Given a data set with n features represented by $F_1, \dots, F_n$, Naive Bayes states that the probability of output $Y$ from features $F_i$ is $P(Y \mid F_1, \dots, F_n) \propto P(Y) \prod_{i=1}^{n} P(F_i \mid Y)$. Bayes' theorem states that $P(Y \mid F) = \frac{P(F \mid Y)\, P(Y)}{P(F)}$. Logistic regression is a linear classification method that learns the probability…

## Bias-Variance Tradeoff

Bias and variance are two sources of prediction error. There is a trade-off between a model's ability to minimize bias and its ability to minimize variance, and understanding this trade-off helps in selecting a good value for the regularization constant. Bias is the difference between the average prediction of the target variable and the correct value which we…

## Naive Bayes

Naive Bayes is a very popular supervised classification algorithm. It is called "Naive" because it assumes that each feature is independent of the other features. It is nearly impossible to find such data sets in real life. Bayes' theorem is the basis of the Naive Bayes algorithm…
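The independence assumption can be sketched in a few lines: the score for each class is its prior multiplied by the per-feature likelihoods, and the scores are then normalized into posteriors. The function name and all probabilities below are made-up values for illustration. In practice they would be estimated from training data.

```python
from math import prod


def naive_bayes_posterior(priors, likelihoods, observed):
    """Posterior over classes, assuming features are independent given the class.

    priors:      {class: P(class)}
    likelihoods: {class: {feature: P(feature | class)}}
    observed:    list of features present in the example
    """
    scores = {
        cls: priors[cls] * prod(likelihoods[cls][f] for f in observed)
        for cls in priors
    }
    total = sum(scores.values())
    return {cls: score / total for cls, score in scores.items()}


# Hypothetical numbers, not estimated from any real data set.
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": {"offer": 0.5, "meeting": 0.1},
    "ham": {"offer": 0.1, "meeting": 0.4},
}

posterior = naive_bayes_posterior(priors, likelihoods, ["offer"])
# spam score: 0.4 * 0.5 = 0.20; ham score: 0.6 * 0.1 = 0.06
```

With these numbers the message containing "offer" is scored as more likely spam than ham, since 0.20 is larger than 0.06 before normalization.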