Before we discuss Linear Regression for ML, let us talk a little about Statistics. Many of us might have learnt linear regression in Statistics classes. Are ML and Statistics the same? No, they are not! ML builds heavily on Statistics, that’s all!
Both Machine Learning and statistical modelling learn from data. We can say they are different approaches to predictive modelling. A statistical model is a mathematical function which describes the relationship between a dependent variable and its independent variables. By the way, what is Statistics?
Statistics is a branch of Mathematics which deals with data. More precisely, it deals with the collection, analysis, interpretation, and presentation of data. The two main branches of Statistics are:
Descriptive Statistics – It summarizes and describes the data at hand, whether that is a sample or a whole population, using numerical measures, graphs or tables. It uses measures of central tendency (mean, median and mode) and measures of dispersion (such as range, variance, standard deviation and quartile deviation). Descriptive statistics do not involve generalizing beyond the data at hand.
Inferential Statistics – It makes inferences and predictions about the population based on sample data. Probability distributions, hypothesis testing, correlation testing, and regression analysis are among the tools used in Inferential Statistics.
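To make the descriptive measures concrete, here is a minimal sketch using Python's built-in `statistics` module. The `marks` list is made-up data for illustration, and I have assumed the population formulas (`pvariance`, `pstdev`) since we are only describing the data at hand, not estimating anything beyond it.

```python
import statistics

# Hypothetical data: marks scored by nine students (made up for illustration).
marks = [52, 61, 61, 68, 70, 74, 79, 85, 98]

# Measures of central tendency
mean_marks = statistics.mean(marks)
median_marks = statistics.median(marks)
mode_marks = statistics.mode(marks)

# Measures of dispersion
value_range = max(marks) - min(marks)
variance = statistics.pvariance(marks)   # population variance of this data set
std_dev = statistics.pstdev(marks)       # square root of the variance above

# Quartiles split the sorted data into four parts; the quartile deviation
# is half the distance between the first and third quartiles.
q1, q2, q3 = statistics.quantiles(marks, n=4)
quartile_deviation = (q3 - q1) / 2

print("mean:", mean_marks, "median:", median_marks, "mode:", mode_marks)
print("range:", value_range, "variance:", variance, "std dev:", std_dev)
print("quartile deviation:", quartile_deviation)
```

Note that different tools compute quartiles in slightly different ways, so the quartile deviation can vary a little from one package to another.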
A population is the complete group of individuals, items or observations that have something in common and that we want to study. A sample is a subset of a population selected to represent it. The most commonly used sample is a simple random sample, in which every member of the population has an equal chance of being selected. A characteristic of a population is known as a parameter (for example, the population mean) and that of a sample is known as a statistic (for example, the sample mean).
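The difference between a parameter and a statistic is easy to see in a small simulation. This is only a sketch: the population of heights is simulated, and the numbers 170, 10, 10,000 and 200 are arbitrary choices of mine.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: simulated heights (in cm) of 10,000 people.
population = [random.gauss(170, 10) for _ in range(10_000)]

# Simple random sample: every member has an equal chance of being picked,
# and members are drawn without replacement.
sample = random.sample(population, k=200)

population_mean = statistics.mean(population)  # a parameter
sample_mean = statistics.mean(sample)          # a statistic

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two means will be close but not identical. Inferential Statistics is about reasoning from the statistic back to the parameter.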
A probability distribution is a mathematical function which gives the probability of occurrence of the different possible outcomes of an experiment. There are many different probability distributions: the normal, chi-square, binomial, and Poisson distributions, to name a few. Different probability distributions serve different purposes.
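As a small illustration, take the binomial distribution, which gives the probability of getting k successes in n independent trials. The sketch below assumes an experiment of my own choosing, 10 tosses of a fair coin, and `binomial_pmf` is a helper name I made up.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials,
    each succeeding with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Assumed experiment: 10 tosses of a fair coin, counting heads.
n, p = 10, 0.5
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

for k, prob in enumerate(pmf):
    print(f"P(heads = {k:2d}) = {prob:.4f}")

# The probabilities of all possible outcomes add up to 1.
print("total:", sum(pmf))
```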
The most commonly used probability distribution is the normal (or Gaussian) distribution, also sometimes called the bell curve. The curve is symmetric around its center, that is, around the mean, µ, which is also the median and the mode of the distribution. Half of the total probability lies to the left of this point and half to the right, and the total area under the curve is 1.
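These properties can be checked with `NormalDist` from Python's `statistics` module. This is a minimal sketch assuming a standard normal distribution (mean 0, standard deviation 1); the offset `d` is an arbitrary value picked for the symmetry check.

```python
from statistics import NormalDist

# Assumed parameters: a standard normal distribution.
mu, sigma = 0.0, 1.0
dist = NormalDist(mu=mu, sigma=sigma)

# Area under the curve to the left of the mean.
left_half = dist.cdf(mu)

# Symmetry: the curve has the same height at equal distances from the mean.
d = 1.3
height_left = dist.pdf(mu - d)
height_right = dist.pdf(mu + d)

# Share of the total area within one standard deviation of the mean.
within_one_sd = dist.cdf(mu + sigma) - dist.cdf(mu - sigma)

print("area left of mean:", left_half)
print("heights at mu - d and mu + d:", height_left, height_right)
print("area within one standard deviation:", round(within_one_sd, 4))
```

The last number is the well-known "about 68%" of the bell curve that falls within one standard deviation of the mean.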
So far I have only scratched the surface of Statistics. If you are really curious about what is below the surface, then just dive into it…it’s worth the effort.
Happy diving! 🙂