2021-06-24

Data selection (indexing and slicing) in Pandas MultiIndex DataFrames

6 mins read A MultiIndex (also known as a hierarchical index) DataFrame allows you to have multiple columns acting as a row identifier and multiple […]
2021-06-21

Data Science and Machine Learning Cheat Sheets

5 mins read Click on the links to get the high-resolution cheat sheets: Algebra, Linear Algebra, Calculus, Probability, Statistics, Python, R, Machine Learning […]
2021-06-12

Introduction to advanced candlesticks in finance: tick bars, dollar bars, volume bars, and imbalance bars

56 mins read In this article, we will explore why traditional time-based candlesticks are an inefficient method to aggregate price data, especially under […]
2021-05-26

5 steps to start becoming a Machine Learning Engineer

16 mins read Step 1: Adjusting Your Mindset. Whenever I lead my workshops, I always get a lot of questions afterward from developers […]
2021-04-28

Python Scipy sparse matrices explained

8 mins read What is a Sparse Matrix? Imagine you have a two-dimensional data set with 10 rows and 10 columns such that […]
2021-04-17

Understanding intuition behind Markov Chain Monte Carlo Methods (MCMC)

15 mins read For many of us, Bayesian statistics is voodoo magic at best or completely subjective nonsense at worst. Among the trademarks […]
2021-03-23

Review of important offline evaluation metrics for recommendation systems

28 mins read We are in an era of personalization. The user wants personalized content and businesses are capitalizing on the same. Recommendation […]
2021-03-12

Bayesian Linear Regression using PyMC3

8 mins read Introduction: In statistics, Bayesian linear regression is an approach to linear regression in which the statistical analysis is undertaken within […]
2021-03-02

ARIMA for time series forecasting in Python

11 mins read Making out-of-sample forecasts can be confusing when getting started with time series data. The statsmodels Python API provides functions for […]
2021-02-25

Identifying time series AR, MA, ARMA, or ARIMA Models using ACF and PACF plots

4 mins read In time series analysis, the Autocorrelation Function (ACF) and the partial autocorrelation function (PACF) plots are essential in providing the […]
2021-02-19

Pivot, Melt, Stack, and Unstack methods in Pandas

5 mins read Data does not come in a usable format by default; a data science professional has to spend 70–80% of their […]
2021-02-08

Probability Density Estimation: Maximum Likelihood Estimation (MLE), Maximum A Posteriori (MAP), and Bayesian inference

14 mins read Maximum Likelihood Estimation (MLE) and Maximum A Posteriori (MAP) estimation are methods of estimating parameters of statistical models. Despite a […]
2021-02-04

Implicit Recommender Systems with Alternating Least Squares

13 mins read In today’s post, we will explain a certain algorithm for matrix factorization models for recommender systems which goes by the […]
2020-12-18

How to determine epsilon and MinPts parameters of DBSCAN clustering

9 mins read Every data mining task has the problem of parameters. Every parameter influences the algorithm in specific ways. DBSCAN (Density-Based Spatial […]
2020-11-24

A review of Deep learning based recommendation systems

20 mins read Introduction: The number of research publications on deep learning-based recommendation systems has increased exponentially in recent years. In […]
2020-11-20

Steps to set up PyTorch with GPU for NVIDIA GTX 960m (Asus VivoBook n552vw) in Ubuntu

3 mins read In this post, I’m gonna describe the steps I used to utilize GPU for the PyTorch Deep Learning framework on […]
2020-11-18

Basics of Convolutional Neural Networks (CNN) from Deep Learning specialization

8 mins read These notes are taken from the first two weeks of the Convolutional Neural Networks course (part of Deep Learning specialization) by Andrew Ng […]
2020-11-14

Machine Learning From Scratch Series: Linear Regression with Gradient Descent

10 mins read In the following sections, we are going to implement linear regression in a step-by-step fashion using just Python and NumPy. We will […]
2020-11-13

Machine Learning From Scratch Series: Logistic Regression

10 mins read In this article, we are going to implement the most commonly used Classification algorithm called Logistic Regression. First, we will […]
2020-11-09

Restricted Boltzmann Machines (RBMs) Simply Explained

16 mins read Table of Contents: Definition & Structure; Reconstructions; Probability Distributions; Code Sample: Stacked RBMs; Parameters & k; Continuous RBMs; Next Steps […]