Data Representation in NumPy

12 mins read The NumPy package is the workhorse of data analysis, machine learning, and scientific computing in the Python ecosystem. It vastly simplifies manipulating […]

Image classification example with Gradio and Keras

12 mins read Image classification is a subset of machine learning that categorizes a group of images into labeled classes. We train an […]

What is the Bias-Variance Trade-off?

9 mins read Whenever you are using a statistical, econometric, or machine learning model, no matter how simple the model is, you should […]

Common loss functions for training deep neural networks in PyTorch

17 mins read Neural networks can do a lot of different tasks. Whether it’s classifying data, like grouping pictures of animals into cats […]

Illustrated calculation of cross-entropy for binary, multi-class, and multi-label classification

8 mins read Cross-entropy is a commonly used loss function for classification tasks. Let’s see why and where to use it. We’ll start with […]

A complete tutorial on evaluation metrics for imbalanced classification

38 mins read A classifier is only as good as the metric used to evaluate it. If you choose the wrong metric to […]

Exploratory Data Analysis (EDA) example: Road safety dataset case study

20 mins read Getting a good feel for a new dataset is not always easy and takes time. However, a good and broad […]

Pandas data selection using .loc and .iloc

8 mins read When it comes to selecting data from a DataFrame, Pandas loc and iloc are two top favorites. They are fast, easy to read, […]

Understanding hypothesis testing with Covid-19 case study (Z-test and t-test)

13 mins read Introduction: The coronavirus pandemic has made a statistician out of us all. We are constantly checking the numbers, making our […]

Styling Pandas DataFrames using Style API

10 mins read Python’s Pandas library allows you to present tabular data in a similar way as Excel. What’s not so similar is […]

Understanding the probabilistic interpretation of linear regression

6 mins read Linear regression is about finding a linear model that best fits a given dataset. For example, in a simple linear […]

Understanding Beta Distribution

9 mins read When to use the Beta distribution: The Beta distribution is a probability distribution on probabilities. For example, we can use it to model […]

The intuition behind Shapley Values

10 mins read The first time I heard about Shapley values was when I was reading up on model interpretability. I came across […]

Walkthrough of an exploratory analysis for classification problems

20 mins read In this post, I’ll outline how to perform an exploratory analysis for a binary classification problem. I am going to […]

Dealing with imbalanced data in machine learning

8 mins read Imbalanced classes are a common problem in machine learning classification where there is a disproportionate ratio of observations in each […]

List of useful tutorials for Exploratory Data Analysis (EDA)

< 1 min
https://towardsdatascience.com/exploratory-data-analysis-8fc1cb20fd15
https://medium.com/omarelgabrys-blog/statistics-probability-exploratory-data-analysis-714f361b43d1
https://www.kaggle.com/ekami66/detailed-exploratory-data-analysis-with-python
https://www.kaggle.com/dvigneshwer/kernele7f4dbb964/notebook
Visualizing the distribution of a dataset (seaborn 0.10.0 documentation): https://seaborn.pydata.org/tutorial/distributions.html
https://www.kaggle.com/kashnitsky/topic-1-exploratory-data-analysis-with-pandas
https://iq.opengenus.org/exploratory-data-analysis-python/
Plotting with categorical data […]

Types of Data & Measurement Scales: Nominal, Ordinal, Interval and Ratio

6 mins read There are four measurement scales: nominal, ordinal, interval, and ratio. These are simply ways to categorize different types of variables […]

How to split data in decision tree nodes?

17 mins read The problem: We need to recommend apps to users according to what they’re likely to download. Recommendation systems are one […]

Machine Learning From Scratch Series: Gradient Descent

9 mins read Gradient Descent is an iterative algorithm that is used to minimize a function by finding the optimal parameters. Gradient Descent can […]

Logistic Regression Implementation From Scratch in Python

4 mins read The objective of this tutorial is to implement our own Logistic Regression from scratch. This is going to be different […]