Data Science

Regressor as Classifier

... it works better than you think.

596 words/3 min read

A classifier predicts a categorical target variable while a regressor predicts a continuous response variable. Can we fit a regressor as a classifier? Let’s find out.

Are Outliers Always a Problem?

no...sometimes we need to treat them with respect

436 words/3 min read

Outliers are usually treated as noise to be discarded. Here is an example where that is not true.

Calibrate Your Classifier

you might wonder how you were doing without it all along

705 words/4 min read

Would you buy oranges weighed on an uncalibrated weighing scale? Then why would you trust an uncalibrated classifier?

ROC Curve Step by Step

...with precision-recall curve thrown in

1062 words/5 min read

The ROC curve is an important metric to compare classifiers. Learn how to draw one step-by-step.

Linear Regression From Scratch

Don't just toe the line, move it!

1597 words/8 min read

Imagine you are buying a car and you want to know its mileage. You don’t want to rely on user reviews or the manufacturer’s mileage claims. The only option left is to predict the mileage yourself. So, if you are a curious data scientist, why not give it a try?

Learning Curve

396 words/2 min read

Imagine you have four classifiers with similar accuracies. Are they really similar? Plotting a learning curve might reveal a hidden side to these classifiers.

Understand the Confusion Matrix

Blow away the confusion

1101 words/6 min read

You’ve built a classification model and come across a new concept called the confusion matrix. However daunting it may seem, no evaluation of a classification model is complete without one.

Likelihood ratio: keeping your classifier honest

Is a respectable accuracy score enough?

491 words/3 min read

Imagine training a classifier on a dataset, only to find that your friend can guess the target label almost as well, without even looking at the data. Is your classifier any good then?

Naive Bayes With Quantile Discretization

Discretization saves the day!

208 words/1 min read

Often, classification datasets have a mix of continuous and categorical data. The continuous data typically have problems such as outliers, noise and lack of a defined distribution.