A review on information theory concepts for machine learning: Entropy, Cross-Entropy, KL divergence, Information gain, and Mutual Information

58 mins read Information Theory: Information theory is a field of study concerned with quantifying information for communication. It is a subfield of mathematics […]

Understanding ROC and Precision-Recall curves

25 mins read It can be more flexible to predict probabilities of an observation belonging to each class in a classification problem rather […]

Bulk Boto3 (bulkboto3): Python package for fast and parallel transferring a bulk of files to S3 based on boto3!

5 mins read Table of Contents: Introduction; About bulkboto3; Getting Started; Prerequisites; Installation; Usage; Contributing; Conclusion. Introduction: “How to transfer a bulk of […]

A tutorial on data science project experimentation with Jupyter, Papermill, and MLflow

7 mins read Your company (e.g., an e-commerce platform across several countries) is starting a new project on fraud detection. You begin by […]

Steps to package and publish Python codes to PyPI (pip)

6 mins read You wrote a new Python package that solves a specific problem and it’s now time to share it with the […]

Styling Pandas dataframes using Styler

7 mins read What is styling and why care? The basic idea behind styling is that a user will want to modify the way […]

Different Python package import patterns using __init__.py file

10 mins read I have had a few conversations lately about Python packaging, particularly around structuring the import statements to access the various modules of […]

Feature Importance calculation using Random Forest

5 mins read The feature importance (variable importance) describes which features are relevant. It can help with a better understanding of the solved […]

Mel Spectrogram Explained with Python Code

6 mins read Signals: A signal is a variation in a certain quantity over time. For audio, the quantity that varies is air pressure. How […]

Categorical data type in Pandas

8 mins read You may have categorical data in your dataset. Categorical data is a type of data with two or more categories. If […]

NumPy Broadcasting tutorial

13 mins read In operations between NumPy arrays (ndarray), the arrays' shapes are automatically made compatible by broadcasting. This article describes the following […]

PySpark equivalent methods for Pandas dataframes

8 mins read Pandas is the go-to library for every data scientist. It is essential for every person who wishes to manipulate data […]

A complete guide to writing custom Datasets and DataLoader in PyTorch

19 mins read Table of Contents: An Introduction To PyTorch Dataset and DataLoader; Why Write Good Data Loaders and Datasets?; The Basic PyTorch Dataset Structure; Implementing […]

Setup Celery with Redis for Django Tutorial

9 mins read When you work on data-intensive applications, long-running tasks can seriously slow down your users. Modern users expect pages to load […]

Understanding TF-IDF with Python example

6 mins read Term Frequency – Inverse Document Frequency (TF-IDF) is a widely used statistical method in natural language processing and information retrieval. […]

Understanding Pandas and NumPy views vs copies to handle SettingWithCopyWarning

33 mins read Table of Contents: Prerequisites; Example of a SettingWithCopyWarning; Views and Copies in NumPy and Pandas; Understanding Views and Copies in […]

Classical Time Series Forecasting Models in Python

11 mins read Machine learning methods can be used for time series classification and forecasting. Before exploring machine learning methods for time […]

Autocorrelation and Partial Autocorrelation explained with Python code

10 mins read What is correlation? In statistics, correlation or dependence refers to any statistical association between two random variables or bivariate data, whether causal […]

Understanding 1D, 2D, and 3D convolutional layers in deep neural networks

21 mins read In deep learning, convolutional layers have been major building blocks in many deep neural networks. The design was inspired by […]

Hyperparameter optimization techniques in machine learning with Python code

10 mins read In every Machine Learning project, it is possible and recommended to search the hyperparameter space to get the best performance […]