How to Train a Decision Tree Classifier… In SQL | by Dario Radečić | Apr, 2024

SQL can now replace Python for most supervised ML tasks. Should you make the switch?

Dario Radečić
Towards Data Science
by Resource Database on Unsplash

When it comes to , I’m an avid fan of attacking where it lives. 90%+ of the time, that’s going to be a , assuming we’re talking about supervised machine learning.

Python is amazing, but pulling dozens of GB of data whenever you want to train a model is a huge , especially if you need to retrain them frequently. Eliminating data movement makes a lot of sense. SQL is your friend.

For this article, I’ll use an always-free Oracle Database 21c provisioned on Oracle . I’m not sure if you can translate the logic to other database . Oracle works like a charm, and the database you provision won’t you a dime — ever.

I’ll leave the Python vs. Oracle for machine learning on huge dataset for some other time. Today, it’s all about getting back to basics.

I’ll use the following dataset today:

  • Fisher, R.A. (1936). The use of multiple measurements in taxonomic problems. of California, Irvine, School of Information and Computer Sciences. Retrieved…

Source link