Regularization is a process used to keep a model at the right level of complexity: as simple as possible while still fitting the data well. Ridge and Lasso regression are two simple techniques that reduce model complexity and prevent over-fitting, which can occur with plain (unregularized) linear regression.
Linear regression is one of the simplest supervised machine learning algorithms. Ridge and Lasso are two variants of linear regression that add regularization.
Consider a simple linear regression, y = mx + c,
where m is the slope and c is the intercept. Our aim is to choose m and c so as to minimize the cost function, which measures how far the model's predictions are from the observed values.
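As a minimal sketch of this idea, the snippet below fits m and c to a small made-up dataset. The text does not name a specific cost function, so mean squared error is assumed here; the function names mse_cost and fit_line and the data values are illustrative only.

```python
def mse_cost(m, c, xs, ys):
    """Mean squared error of the line y = m*x + c on the data (assumed cost function)."""
    n = len(xs)
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys)) / n


def fit_line(xs, ys):
    """Closed-form least-squares estimates of the slope m and intercept c."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    m = sxy / sxx
    c = y_mean - m * x_mean
    return m, c


# Made-up data lying roughly on the line y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

m, c = fit_line(xs, ys)
print("slope:", m, "intercept:", c)
print("cost at fitted line:", mse_cost(m, c, xs, ys))
print("cost at a guessed line (m=2, c=1):", mse_cost(2.0, 1.0, xs, ys))
```

The fitted line should give a cost no larger than any guessed pair of values, since the closed-form estimates minimize the squared error on this data.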
A regression model that uses the L1 regularization technique is called Lasso Regression, and a model that uses L2 regularization is called Ridge Regression.
The key difference between the two is the penalty term. In ridge regression, the cost function is altered by adding a penalty proportional to the sum of the squared coefficients, scaled by a tuning parameter (usually written λ) that controls how strongly large coefficients are penalized.
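The effect of the ridge penalty can be sketched for the one-feature case. This is an illustration under stated assumptions, not a reference implementation: the cost is taken to be the sum of squared errors plus lam * m**2, the intercept is left unpenalized (a common convention), and the names ridge_cost, fit_ridge_line and lam are hypothetical.

```python
def ridge_cost(m, c, xs, ys, lam):
    """Sum of squared errors plus the L2 penalty lam * m**2 (intercept not penalized)."""
    sse = sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys))
    return sse + lam * m ** 2


def fit_ridge_line(xs, ys, lam):
    """Closed-form minimizer of ridge_cost for a single feature."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    m = sxy / (sxx + lam)  # lam in the denominator shrinks the slope toward zero
    c = y_mean - m * x_mean
    return m, c


# Made-up data lying roughly on the line y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

slopes = []
for lam in [0.0, 1.0, 10.0, 100.0]:
    m, c = fit_ridge_line(xs, ys, lam)
    slopes.append(m)
    print("lam =", lam, "slope =", m, "cost =", ridge_cost(m, c, xs, ys, lam))
```

With lam = 0 this reduces to ordinary least squares. As lam grows the slope keeps shrinking, but for any finite lam it stays nonzero.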
In Lasso regression, the penalty uses the absolute values of the coefficients instead of their squares. This penalty can drive some coefficients to exactly zero, which amounts to feature selection, whereas ridge regression only shrinks coefficients toward zero without making them exactly zero. So Lasso regression helps both to reduce over-fitting and to select features.
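The contrast between the two penalties can be sketched with a small coordinate-descent solver. This is a simplified illustration, not how any particular library implements it: it assumes centered data (so no intercept is fitted), a cost of the form sum of squared errors plus lam times the penalty, and made-up data in which the second feature has only a weak effect. The names soft_threshold, coordinate_descent and lam are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 if it falls within the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def coordinate_descent(X, y, lam, penalty, n_iters=50):
    """Minimize sum((y - Xw)^2) plus a penalty, updating one coefficient at a time.

    penalty="l1" adds lam * sum(|w_j|)   (Lasso)
    penalty="l2" adds lam * sum(w_j**2)  (ridge)
    Assumes centered data, so no intercept is fitted.
    """
    n_features = len(X[0])
    w = [0.0] * n_features
    for _ in range(n_iters):
        for j in range(n_features):
            rho = 0.0  # feature j dotted with the residual that ignores feature j
            z = 0.0    # sum of squares of feature j
            for row, target in zip(X, y):
                pred_without_j = sum(w[k] * row[k] for k in range(n_features) if k != j)
                rho += row[j] * (target - pred_without_j)
                z += row[j] ** 2
            if penalty == "l1":
                w[j] = soft_threshold(rho, lam / 2.0) / z
            elif penalty == "l2":
                w[j] = rho / (z + lam)
            else:
                raise ValueError("penalty must be 'l1' or 'l2'")
    return w


# Made-up centered data: strong effect from feature 1, weak effect from feature 2
X = [[-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0], [1.0, 1.0]]
y = [3.0 * x1 + 0.2 * x2 for x1, x2 in X]

lasso_w = coordinate_descent(X, y, lam=2.0, penalty="l1")
ridge_w = coordinate_descent(X, y, lam=2.0, penalty="l2")
print("lasso coefficients:", lasso_w)
print("ridge coefficients:", ridge_w)
```

On this data the Lasso coefficient for the weak feature comes out as exactly 0.0, while ridge leaves it small but nonzero (about 0.13), which is the feature-selection behaviour described above.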
These regression techniques are best suited to problems with a large number of features. Traditional methods for handling over-fitting and performing feature selection, such as cross-validation and stepwise regression, work well when the set of features is small.