2021-12-04

Sampling from a multivariate Gaussian (Normal) distribution with Python code

3 mins read The multivariate Gaussian distribution is a fundamental concept in statistics and machine learning that finds applications in various fields, including data […]
2021-11-20

Understanding Expectation-Maximization (EM) algorithm with an example in Python

7 mins read Suppose we have some data sampled from two different groups, red and blue: Here, we can see which data point […]
2021-11-15

Using pre-commit and Makefile for Python code development workflow

5 mins read Introduction When developing Python code, we are constantly adding and committing changes. However, nothing stops us from committing low-quality code, e.g. code […]
2021-11-15

Machine Learning From Scratch Series: Naive Bayes and Gaussian Naive Bayes

16 mins read Introduction The Naïve Bayes algorithm is a supervised classification algorithm based on Bayes' theorem with a strong (naïve) independence assumption among features. In machine learning and data […]
2021-11-12

Making data pipelines in Pandas using .pipe() method

13 mins read Real-life data is usually messy. It requires a lot of preprocessing to be ready for use. Pandas being one of […]
2021-11-02

ARCH and GARCH models for Time Series Prediction in Python

11 mins read A change in the variance or volatility over time can cause problems when modeling time series with classical methods like […]
2021-11-02

Finding and removing seasonality in Time-Series Data with Python

17 mins read Seasonality in Time Series Time series data may contain seasonal variation. Seasonal variation, or seasonality, refers to cycles that repeat regularly […]
2021-11-01

ARIMA and SARIMA for Real-World Time Series Forecasting in Python

15 mins read Time series and forecasting have been some of the key problems in statistics and Data Science. Data becomes a time […]
2021-11-01

A review of techniques for Time Series prediction

43 mins read Working with time series data? Here’s a guide for you. In this article, you will learn how to compare and […]
2021-10-29

Difference between CMD and ENTRYPOINT Commands in Dockerfile

14 mins read Introduction Containers are designed for running specific tasks and processes, not for hosting operating systems. You create a container to serve […]
2021-10-17

Important probability distributions for Data Science with Python code

33 mins read For an aspiring data scientist, statistics is a must-learn subject. It can process complex and challenging problems in the real […]
2021-10-08

Fundamentals of statistics for Data Scientists and Analysts with Python Code

36 mins read As Karl Pearson, a British mathematician, once stated, statistics is the grammar of science, and this holds especially for Computer and Information […]
2021-10-02

Python command-line interface with Click library

7 mins read This Python click tutorial shows how to create command-line interfaces with the click module. Python click The Python click module is used to create […]
2021-10-01

REINFORCE Algorithm explained in Policy-Gradient based methods with Python Code

16 mins read Policy gradients Policy gradients are a family of algorithms for solving reinforcement learning problems by directly optimizing the policy in […]
2021-09-29

Comparing Python Command-Line Parsing Libraries: Argparse, Docopt, and Click

23 mins read This article uses the following versions of the libraries: (Ignore invoke for now, it’s a special surprise for later!) Command-Line Example The […]
2021-09-12

Best storage formats to save Pandas dataframes

6 mins read When working on data analysis projects, I usually use Jupyter notebooks and the great pandas library to process and move my data around. It […]
2021-09-09

SumTree data structure for Prioritized Experience Replay (PER) explained with Python Code

14 mins read Weighted sampling from a list-like collection is an important activity in many applications. Weighted sampling involves selecting samples randomly from […]
2021-08-22

Understanding Attention Mechanism in Sequence 2 Sequence Machine Translation

39 mins read Introduction Recurrent Neural Networks (or more precisely LSTM/GRU) have been found to be very effective in solving complex sequence-related problems […]
2021-08-03

How to use black, flake8, isort, and pre-commit framework to format Python codes

12 mins read black: The Uncompromising Code Formatter With black you can format Python code from 2.7 all the way to 3.8 (as of version […]
2021-08-03

Understanding Model Calibration and Brier Score

12 mins read Do you ever encounter a storm when the probability of rain in your weather app is below 10%? Well, this […]