2019-11-12

Machine Learning From Scratch Series: K-means Clustering: K-Nearest Neighbors (KNN) Algorithm

8 mins read

Introduction: A famous quote states that “You are the average of the five people you spend the most time with.” Although […]

2019-07-02

Seaborn charts for plotting categorical features

12 mins read

Data: In this post, we will use one of Seaborn’s conveniently available datasets about the Titanic, which I’m sure many […]

2019-05-31

An explanation of z-distribution (standard normal distribution)

14 mins read

The standard normal distribution, also called the z-distribution, is a special normal distribution where the mean is 0 and the standard deviation is 1. Any normal distribution can […]

2019-05-23

Tutorial on Crosstab Operations (pivot_table and crosstab methods) in Pandas

8 mins read

Introduction: Pandas offers several options for grouping and summarizing data, but this variety of options can be a blessing and […]

2017-09-08

Understanding L1 and L2 as Loss Function and Regularization

6 mins read

While practicing machine learning, you may have come across the mysterious choice between L1 and L2. Usually, the two […]

2017-09-08

Different missing data mechanisms

3 mins read

Missing data mechanisms concern the relationship between missing data and the values of variables in the data matrix. Given this focus, […]

2017-06-14

Implementations of Mutual Information (MI) and Entropy in Python

8 mins read

In probability theory and information theory, the mutual information (MI) of two random variables is a measure of the mutual […]