2021-06-05

Python Assignment Expressions with walrus operator use cases

5 mins read Assignment expressions allow variable assignments to occur inside of larger expressions. While assignment expressions are never strictly necessary to write […]
2021-05-30

A tutorial on Context Managers in Python

9 mins read Python’s context managers are great for resource management and stopping the propagation of leaked abstractions. You’ve probably used them while […]
2021-05-04

Set up and run Jupyter Notebook from a remote server over SSH

5 mins read In my research, I usually work with remote servers to run deep learning models inside machines more powerful than my […]
2021-04-28

Python SciPy sparse matrices explained

8 mins read What is a Sparse Matrix? Imagine you have a two-dimensional data set with 10 rows and 10 columns such that […]
2021-03-12

Bayesian Linear Regression using PyMC3

8 mins read Introduction In statistics, Bayesian linear regression is an approach to linear regression in which the statistical analysis is undertaken within […]
2021-03-02

ARIMA for time series forecasting in Python

11 mins read Making out-of-sample forecasts can be confusing when getting started with time series data. The statsmodels Python API provides functions for […]
2021-02-19

Pivot, Melt, Stack, and Unstack methods in Pandas

5 mins read Data does not come in a usable format by default; a data science professional has to spend 70–80% of their […]
2021-02-13

Python testing tutorial using pytest

18 mins read Testing your code brings a wide variety of benefits. It increases your confidence that the code behaves as you expect and […]
2020-12-18

How to determine epsilon and MinPts parameters of DBSCAN clustering

9 mins read Every data mining task has the problem of parameters. Every parameter influences the algorithm in specific ways. DBSCAN (Density-Based Spatial […]
2020-11-14

Machine Learning From Scratch Series: Linear Regression with Gradient Descent

10 mins read In the following sections, we are going to implement linear regression in a step-by-step fashion using just Python and NumPy. We will […]
2020-11-13

Machine Learning From Scratch Series: Logistic Regression

10 mins read In this article, we are going to implement the most commonly used classification algorithm, called Logistic Regression. First, we will […]
2020-11-08

Data Representation in NumPy

12 mins read The NumPy package is the workhorse of data analysis, machine learning, and scientific computing in the Python ecosystem. It vastly simplifies manipulating […]
2020-07-24

Image classification example with Gradio and Keras

12 mins read Image classification is a subset of machine learning that categorizes a group of images into labeled classes. We train an […]
2020-07-13

Common loss functions for training deep neural networks in PyTorch

17 mins read Neural networks can do a lot of different tasks. Whether it’s classifying data, like grouping pictures of animals into cats […]
2020-07-12

A complete tutorial on evaluation metrics for imbalanced classification

38 mins read A classifier is only as good as the metric used to evaluate it. If you choose the wrong metric to […]
2020-07-01

Exploratory Data Analysis (EDA) example: Road safety dataset case study

20 mins read Getting a good feeling about a new dataset is not always easy and takes time. However, a good and broad […]
2020-06-24

Pandas data selection using .loc and .iloc

8 mins read When it comes to selecting data from a DataFrame, Pandas loc and iloc are two top favorites. They are fast, easy to read, […]
2020-04-25

Styling Pandas DataFrames using Style API

10 mins read Python’s Pandas library allows you to present tabular data in a similar way as Excel. What’s not so similar is […]
2020-02-21

Walkthrough of an exploratory analysis for classification problems

20 mins read In this post, I’ll outline how to perform an exploratory analysis for a binary classification problem. I am going to […]
2020-02-05

Dealing with imbalanced data in machine learning

8 mins read Imbalanced classes are a common problem in machine learning classification where there is a disproportionate ratio of observations in each […]