February 21, 2020

Walkthrough of an exploratory analysis for classification problems

In this post I outline how to perform an exploratory analysis for a binary classification problem. I am going to […]
February 5, 2020

Dealing with Imbalanced Data

https://towardsdatascience.com/methods-for-dealing-with-imbalanced-data-5b761be45a18 Imbalanced classes are a common problem in machine learning classification where there is a disproportionate ratio of observations […]
February 3, 2020

Exploratory Data Analysis

https://towardsdatascience.com/exploratory-data-analysis-8fc1cb20fd15 https://medium.com/omarelgabrys-blog/statistics-probability-exploratory-data-analysis-714f361b43d1 https://www.kaggle.com/ekami66/detailed-exploratory-data-analysis-with-python https://www.kaggle.com/dvigneshwer/kernele7f4dbb964/notebook Visualizing the distribution of a dataset — seaborn 0.10.0 documentation https://seaborn.pydata.org/tutorial/distributions.html https://www.kaggle.com/kashnitsky/topic-1-exploratory-data-analysis-with-pandas https://iq.opengenus.org/exploratory-data-analysis-python/ Plotting with categorical data […]
February 3, 2020

Data Levels of Measurement

There are four measurement scales: nominal, ordinal, interval and ratio. These are simply ways to categorize different types of variables […]
September 8, 2017

L1 and L2 as Loss Function and Regularization

While practicing machine learning, you may have come upon a choice of the mysterious L1 vs L2. Usually the two […]
September 8, 2017

Missing data methods

https://www.iriseekhout.com/missing-data/missing-data-mechanisms/mcar/ Missing Completely at Random: Missing completely at random (MCAR) is the only missing data mechanism that can actually […]