**Naive Bayes** is a linear classifier that applies Bayes' theorem together with a strong conditional independence assumption among features. Given a data set with n features represented by

$$F_1, F_2, \ldots, F_n$$

Naive Bayes states that the probability of the output $Y$ given the features $F_i$ is

$$P(Y \mid F_1, \ldots, F_n) \propto P(Y) \prod_{i=1}^{n} P(F_i \mid Y)$$

This follows from Bayes' theorem, which states that:

$$P(Y \mid F_1, \ldots, F_n) = \frac{P(Y)\, P(F_1, \ldots, F_n \mid Y)}{P(F_1, \ldots, F_n)}$$

together with the assumption that the features $F_i$ are conditionally independent given $Y$.

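The Naive Bayes computation described above can be sketched in a few lines. This is a minimal illustration on a hypothetical spam-filtering example; the priors, likelihoods, and feature names are invented for the sake of the demo.

```python
# A minimal sketch of the Naive Bayes product rule on a toy spam example.
# All counts and probabilities below are hypothetical, chosen for illustration.

priors = {"spam": 0.4, "ham": 0.6}            # P(Y)
# P(F_i | Y) for two word features: "offer" and "meeting"
likelihoods = {
    "spam": {"offer": 0.7, "meeting": 0.1},
    "ham":  {"offer": 0.1, "meeting": 0.6},
}

def posterior(features):
    # Unnormalised scores: P(Y) * prod_i P(F_i | Y)
    scores = {}
    for label in priors:
        score = priors[label]
        for f in features:
            score *= likelihoods[label][f]
        scores[label] = score
    total = sum(scores.values())              # evidence P(F_1, ..., F_n)
    return {label: s / total for label, s in scores.items()}

print(posterior(["offer"]))    # "spam" gets the higher posterior here
```

Note that the evidence term in the denominator is the same for every class, which is why the proportionality form of the formula is enough for classification.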
**Logistic regression** is a linear classification method that learns the probability of a sample belonging to a given class. It tries to find the decision boundary that best separates the classes. It is mainly used when the outcome is binary; multinomial logistic regression extends it to outcomes with more than two classes.
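As a concrete sketch of the prediction side of logistic regression: the model applies the sigmoid function to a linear combination of the features, and the decision boundary is the set of points where that probability equals 0.5. The weights below are hypothetical, standing in for parameters already learned from data.

```python
import math

# A minimal sketch of binary logistic regression prediction.
# The weights and bias are hypothetical "learned" parameters.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, weights, bias):
    # P(y = 1 | x) = sigmoid(w . x + b); the linear decision boundary
    # is where w . x + b = 0, i.e. where the probability is 0.5.
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return sigmoid(z)

weights, bias = [1.5, -2.0], 0.5   # hypothetical learned parameters
print(predict_proba([2.0, 1.0], weights, bias))   # ~0.82
```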

Comparison between the two algorithms:

**1. Model assumptions**

- Naive Bayes assumes all the features are conditionally independent given the class.
- Logistic regression splits feature space linearly and typically works reasonably well even if some of the variables are correlated.
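A toy illustration of why the independence assumption matters: if the same evidence appears twice (a perfectly correlated, duplicated feature), Naive Bayes multiplies its likelihood in twice and becomes overconfident. The numbers here are hypothetical.

```python
# Hypothetical numbers showing Naive Bayes double-counting correlated evidence.
prior = {"pos": 0.5, "neg": 0.5}
p_feat = {"pos": 0.8, "neg": 0.2}   # P(feature | class)

def posterior_pos(times_counted):
    # Counting the same feature more than once raises its likelihood to a power,
    # which is exactly what happens with duplicated or correlated features.
    s_pos = prior["pos"] * p_feat["pos"] ** times_counted
    s_neg = prior["neg"] * p_feat["neg"] ** times_counted
    return s_pos / (s_pos + s_neg)

print(posterior_pos(1))   # 0.8
print(posterior_pos(2))   # ~0.94: same evidence, counted twice
```

Logistic regression, by contrast, can assign correlated features smaller weights during training, so the duplicated evidence is not blindly double-counted.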

**2. Learning mechanism**

- Naive Bayes is a generative model: it models the joint distribution of the features X and the target Y, and then predicts the posterior probability P(y|x), i.e. the probability of the target y given the observed features x.
- Logistic regression is a discriminative model: it models the posterior probability P(y|x) directly, learning the input-to-output mapping by minimising the error.
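The two learning mechanisms can be contrasted side by side. Below is a minimal sketch on a hypothetical toy data set with two binary features: the generative route estimates P(y) and P(f|y) by counting, while the discriminative route fits weights for P(y|x) by gradient descent on the log-loss. Neither is a production implementation.

```python
import math

# Hypothetical toy data: two binary features, binary label.
X = [(1, 0), (1, 1), (0, 1), (0, 0), (1, 0), (0, 1)]
y = [1, 1, 0, 0, 1, 0]

def nb_fit(X, y):
    # Generative route: estimate P(y) and P(f_j = 1 | y) by counting,
    # with Laplace smoothing to avoid zero probabilities.
    n1 = sum(y)
    n0 = len(y) - n1
    prior1 = n1 / len(y)
    p1 = [(sum(x[j] for x, t in zip(X, y) if t == 1) + 1) / (n1 + 2) for j in range(2)]
    p0 = [(sum(x[j] for x, t in zip(X, y) if t == 0) + 1) / (n0 + 2) for j in range(2)]
    return prior1, p1, p0

def nb_posterior1(x, prior1, p1, p0):
    # Bayes' rule: combine prior and per-feature likelihoods, then normalise.
    s1, s0 = prior1, 1 - prior1
    for j in range(2):
        s1 *= p1[j] if x[j] else 1 - p1[j]
        s0 *= p0[j] if x[j] else 1 - p0[j]
    return s1 / (s1 + s0)

def lr_fit(X, y, lr=0.5, steps=500):
    # Discriminative route: model P(y | x) directly and fit the weights by
    # minimising log-loss with stochastic gradient descent.
    w, b = [0.0, 0.0], 0.0
    for _ in range(steps):
        for x, t in zip(X, y):
            p = 1 / (1 + math.exp(-(w[0] * x[0] + w[1] * x[1] + b)))
            err = p - t
            w = [w[0] - lr * err * x[0], w[1] - lr * err * x[1]]
            b -= lr * err
    return w, b

prior1, p1, p0 = nb_fit(X, y)
w, b = lr_fit(X, y)
lr_prob1 = lambda x: 1 / (1 + math.exp(-(w[0] * x[0] + w[1] * x[1] + b)))
# On this toy data both models agree that (1, 0) looks like class 1.
```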

**3. Approach to be followed to improve model results**

- Naive Bayes: when the training data size is small relative to the number of features, information about prior probabilities helps improve the results.
- Logistic regression: when the training data size is small relative to the number of features, adding regularisation such as an L1 (Lasso) or L2 (Ridge) penalty can help reduce overfitting and produce a more generalised model.
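The effect of an L2 (Ridge-style) penalty can be sketched directly: the penalty adds a `lam * w` term to the gradient, which pulls the weights towards zero. The data and the `lam` value below are hypothetical, chosen only to show that the penalised fit ends up with smaller weights.

```python
import math

# A minimal sketch of L2-regularised logistic regression trained by
# full-batch gradient descent. Data and lam are hypothetical.

X = [(1.0, 0.9), (0.9, 1.0), (-1.0, -0.8), (-0.9, -1.1)]
y = [1, 1, 0, 0]

def fit(X, y, lam, lr=0.1, steps=2000):
    w, b = [0.0, 0.0], 0.0
    n = len(X)
    for _ in range(steps):
        gw = [lam * w[0], lam * w[1]]   # gradient of the L2 penalty: lam * w
        gb = 0.0
        for x, t in zip(X, y):
            p = 1 / (1 + math.exp(-(w[0] * x[0] + w[1] * x[1] + b)))
            err = (p - t) / n
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w = [w[0] - lr * gw[0], w[1] - lr * gw[1]]
        b -= lr * gb
    return w, b

w_plain, _ = fit(X, y, lam=0.0)
w_reg, _ = fit(X, y, lam=0.5)
# The penalised fit has smaller weights, i.e. a less extreme, smoother model.
```

On separable data like this, the unregularised weights keep growing with more training steps, while the penalty keeps the regularised weights bounded, which is exactly the overfitting-control effect described above.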