☛ Statisticians torture their data to fit the model. ☛ Data scientists torture their models to fit the data. ----------------- #datascience #machinelearning
Management tortures data scientists to fit their preconceptions.
Nowadays, or when working with real world data, you have to do both, if we can even call it torture. This is where the principle of Occam’s Razor comes into play: start with the simplest explanation. I usually begin with statistical models to understand the data and then build on that knowledge using a machine learning model or an ensemble. Cheers.
Carl McBride Ellis, PhD, ☛ Statisticians torture their data to fit the model. ☛ Data scientists torture their models to fit the data. Humorous and clever comparison. When I was an undergraduate Chemistry major way before personal computers, a grad student had posted 10 tips for conducting research on his lab door. One related to your observation was: "First plot the curve and then add the data points." My favorite tip was: "Don't just pray for miracles, depend on them."
Is it terrible to add “AI practitioners torture everyone?”
Both are extreme points of view. In the second case, you get overfitting; in the first, we could also think of a kind of overfitting, but in the opposite direction. It’s difficult to reach equilibrium, avoiding situations that could be described as “overfitting methodologies”.
And both should likely look up from their data and their models and observe the real world and get context.
This world is full of pain.
You gotta do both low bias low variance
It's (mostly) not my fault that AI will probably kill us all.
5dAnd if you torture them hard enough they will reveal their secrets and you might learn something valuable, but you torture them too much and they will tell you anything you want to hear. (Apologies for the macabre comment, but the analogy was too good to pass it 😅)