2022-11-16

Repository for implementation of statistics concepts for Data Science in Python

3 mins read The field of statistics is becoming increasingly important in the world of data science and machine learning. I have recently […]
2022-08-24

Performing A/B test in Python example – A case study from Udacity Data Scientist Nano Degree

11 mins read This is a simple walkthrough of an A/B test case study developed and used by Udacity. It is part of […]
2022-08-02

Measure the correlation between numerical and categorical variables and the correlation between two categorical variables in Python: Chi-Square and ANOVA

27 mins read Data analysis is an essential part of any research or business endeavor, and one of the most fundamental techniques is […]
2022-08-01

A simple tutorial on Importance Sampling and Monte Carlo with Python code

16 mins read In this post, I’m going to explain importance sampling. Importance sampling is an approximation method instead of a […]
2022-07-30

What is Reservoir Sampling in Stream Processing?

4 mins read Reservoir sampling is a fascinating algorithm that is especially useful when you have to deal with streaming data, which is […]
2022-07-23

A guide to Bootstrapping for Statistical Inference – Confidence Interval and Hypothesis Testing

14 mins read Inferential Statistics is the process of examining the observed data (sample) in order to make conclusions about the properties/parameters […]
2022-05-28

Understanding interaction effects in regression analysis

22 mins read In regression, an interaction effect exists when the effect of an independent variable on a dependent variable changes, depending on […]
2022-05-24

A guide on Maximum likelihood and Bayesian inference for parameter estimation

28 mins read In this post, I’ll explain what the maximum likelihood and Bayesian inference methods for parameter estimation are and go […]
2022-05-19

Understanding the basics of Bayesian Inference with Python Code

10 mins read Why did someone have to invent Bayesian inference? In one sentence: to update the probability as we gather more data. The […]
2022-03-17

Methods for sampling from complex distributions

8 mins read This writeup includes descriptions from a recent paper on algorithmic sampling, to describe in simpler terms the motivation and approach for […]
2022-03-11

A tutorial on Bayesian Statistics and Bayesian Machine Learning basics with Python Code

31 mins read Conditional probability and Bayes’ theorem are fundamental ideas in statistics that even laymen have heard of. Bayes’ theorem also […]
2022-02-20

Bayesian view of linear regression – Maximum Likelihood Estimation (MLE) and Maximum A Posteriori (MAP)

16 mins read Linear regression is commonly the first machine learning problem that newcomers to the field take an interest in. For […]
2022-02-13

Handling skewness in features by applying transformation in Python

13 mins read In this tutorial, you will learn how to deal with your data when it is not following the normal distribution. One […]
2022-01-18

Why does LASSO regression (L1 regularization) shrink coefficients to zero but not the Ridge?

11 mins read We read almost everywhere that Lasso regression encourages zero coefficients and hence provides a great tool for variable selection as well, but it […]
2021-11-01

ARIMA and SARIMA for Real-World Time Series Forecasting in Python

15 mins read Time series and forecasting have been some of the key problems in statistics and Data Science. Data becomes a time […]
2021-10-19

Difference between Probability Density and Probability

5 mins read The probability density at x can be greater than one, but then how can it integrate to one? It’s a […]
2021-10-19

What is Conjugate Prior?

5 mins read What is Prior? Prior probability is the probability of an event before we see the data. In Bayesian Inference, the prior […]
2021-10-17

Important probability distributions for Data Science with Python code

33 mins read For an aspiring data scientist, statistics is a must-learn subject. It can process complex and challenging problems in the real […]
2021-07-03

Understanding and discovering multicollinearity in regression analysis with Python code

9 mins read In this post, I will explain the concept of collinearity and multicollinearity and why it is important to understand them […]
2021-06-21

Data Science and Machine Learning Cheat Sheets

5 mins read Click on the links to get the high-resolution cheat sheets. Algebra Linear Algebra Calculus Probability Statistics Python R Machine Learning […]