2020 – Page 2 – Overfit AI

Assumptions of Linear Regression Model

October 16, 2020 August 19, 2023 Machine Learning

There are five major assumptions: 1. Linear relationship: There should be a linear and additive relationship between the dependent variable (Y) and the independent variable(s) (X: x1, x2, x3, …). A linear relationship means that the change in the response Y due to a one-unit change in x1 is constant, regardless of…
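The "constant change per unit" idea can be sketched in a few lines. This is only an illustration: the `predict` function and its coefficients are hypothetical and do not come from the post. It shows that in a linear, additive model, raising x1 by one unit shifts the prediction by the same amount wherever you start.

```python
# Hypothetical linear, additive model: y = 1 + 2*x1 + 3*x2.
# The coefficients are made up for illustration only.
INTERCEPT = 1.0
BETA_X1 = 2.0
BETA_X2 = 3.0

def predict(x1, x2):
    """Prediction from the hypothetical linear, additive model."""
    return INTERCEPT + BETA_X1 * x1 + BETA_X2 * x2

# Effect of a one-unit increase in x1 at several starting points.
for x1, x2 in [(0, 0), (5, 0), (0, 10), (7, -3)]:
    change = predict(x1 + 1, x2) - predict(x1, x2)
    print(f"x1={x1}, x2={x2}: change in y = {change}")
```

Every line of output reports the same change, equal to the x1 coefficient. A model with an interaction term such as x1*x2 would not have this property.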

Confidence Intervals

August 1, 2020 August 19, 2023 Statistics

A confidence interval is a range of values that we are fairly sure contains the true value of a population parameter. It is calculated from the sample data and gives an interval estimate, as opposed to a point estimate. The confidence level, often expressed as a percentage (e.g., 95% or 99%), quantifies the level…
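As a rough sketch of how an interval estimate is built from sample data, the snippet below computes a confidence interval for a mean using the normal approximation. The function name `mean_confidence_interval` and the sample values are illustrative assumptions, not taken from the post. For small samples a t-based critical value is usually preferred.

```python
import math
import statistics

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean.

    Illustrative helper (not from the post); assumes the sample is large
    enough for the normal approximation to be reasonable.
    """
    n = len(sample)
    mean = statistics.fmean(sample)  # point estimate
    # Standard error of the mean, using the sample standard deviation.
    std_err = statistics.stdev(sample) / math.sqrt(n)
    # Two-sided critical value from the standard normal distribution.
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * std_err, mean + z * std_err

# Made-up measurements for illustration.
sample = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0, 5.1, 4.9]
low, high = mean_confidence_interval(sample, confidence=0.95)
print(f"95% CI for the mean: ({low:.3f}, {high:.3f})")
```

Raising the confidence level (say, from 95% to 99%) widens the interval, since a larger critical value is needed to cover the true mean more often.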

Key Statistical Tests

July 19, 2020 August 4, 2025 Statistics

In the world of data science, statistical tests play a crucial role in drawing meaningful insights from data, making informed decisions, and validating hypotheses. Let’s explore five essential statistical tests: the Z-test, t-test, chi-squared test, ANOVA, and the lesser-known but powerful Fisher’s Exact Test. 1. Z-test: Unleash the Power of…

Hypothesis Testing

July 16, 2020 February 1, 2026 Statistics

Hypothesis testing is a method statisticians use to make decisions or inferences about populations based on sample data. It’s a structured, methodical way to put our claims to the test, demanding evidence…

Heteroscedasticity

June 19, 2020 August 3, 2023 Supervised Learning

A collection of random variables is said to be heteroscedastic when different subpopulations have different variability (for example, different standard deviations). One of the basic assumptions of linear regression is that the errors are homoscedastic, i.e., heteroscedasticity is not present in the data. Due to the violation of assumptions, the Ordinary Least Squares (OLS) estimators…

A/B Testing

May 2, 2020 August 19, 2023 Statistics

A/B testing is a basic randomized controlled experiment. It is a way to compare two versions of a variable to find out which performs better in a controlled environment. A/B testing is also known as bucket testing or split-run testing. Suppose we want to add some functionalities to an existing product. A/B testing…

Central Limit Theorem

April 30, 2020 August 16, 2023 Statistics

The central limit theorem (CLT) is a foundation of inferential statistics. Just by collecting a subset of data from a population and using statistics, we can draw conclusions about that population. The CLT says that the mean of the sampling distribution of the sample mean is equal to the population mean, irrespective of the distribution…
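A small simulation can make the claim about sample means concrete. The sketch below is an illustration under assumed settings: the exponential population, the sample size, and the number of repetitions are arbitrary choices, not values from the post. It draws many samples from a skewed population and compares the average of the sample means with the population mean.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Illustrative settings (assumptions, not from the post).
POPULATION_MEAN = 2.0  # mean of a skewed exponential population
SAMPLE_SIZE = 50       # observations per sample
NUM_SAMPLES = 5000     # number of repeated samples

sample_means = []
for _ in range(NUM_SAMPLES):
    # expovariate takes the rate, which is 1 / mean.
    sample = [random.expovariate(1 / POPULATION_MEAN) for _ in range(SAMPLE_SIZE)]
    sample_means.append(statistics.fmean(sample))

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
# For an exponential population the standard deviation equals the mean,
# so the theoretical standard error is mean / sqrt(n).
theoretical_se = POPULATION_MEAN / math.sqrt(SAMPLE_SIZE)

print(f"mean of sample means: {mean_of_means:.3f} (population mean {POPULATION_MEAN})")
print(f"sd of sample means:   {sd_of_means:.3f} (theory {theoretical_se:.3f})")
```

Even though the population is strongly skewed, the sample means cluster around the population mean, and their spread shrinks roughly like one over the square root of the sample size.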

Q-Q plot – Importance in Linear Regression.

April 12, 2020 September 5, 2020 Supervised Learning

A quantile-quantile (Q-Q) plot is a graphical tool for determining whether two data sets come from populations with a common distribution, such as a normal, exponential, or uniform distribution. This helps in linear regression when the training and test data sets are received separately and then we…

Hypothesis Testing in Linear Regression

March 20, 2020 November 10, 2020 Supervised Learning

Hypothesis testing can be carried out in linear regression for the following purposes: To check whether a predictor is significant for predicting the target variable. Two common methods for this are: By the use of p-values: If the p-value of a variable is greater than a certain limit…

Year: 2020