What are the advantages of deep neural networks on tabular data?


I have been reading about boosted trees on tabular data. Can you tell me how they differ from deep learning, and whether deep learning has any advantages?

Thank you.

What is the difference between gradient boosted trees and deep learning?

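Deep learning models are artificial neural networks, whereas boosted trees are an ensemble of decision trees. An ensemble in this context is a set of multiple models whose outputs are combined to make a single prediction. In gradient boosting, the trees are trained one after another, each fitted to the errors left by the trees before it.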
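To make the ensemble idea concrete, here is a minimal sketch of boosting in plain Python. It is a toy illustration, not how XGBoost is implemented: it assumes a single numeric feature, squared-error loss, and one-split trees ("stumps"). All function names and the toy data are made up for this example.

```python
def fit_stump(xs, residuals):
    """Find the single threshold split that minimises squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue  # a split must put points on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_value, right_value)


def predict_stump(stump, x):
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_boosted_stumps(xs, ys, n_trees=20, learning_rate=0.3):
    """Train stumps sequentially, each one on the current residuals."""
    base = sum(ys) / len(ys)  # start from the mean of the targets
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict(model, x):
    """Ensemble prediction: base value plus the scaled sum of all stumps."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mean_squared_error(ys, preds):
    return sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)


# Toy data (an assumption for illustration): y = x^2 on ten points.
xs = [float(x) for x in range(10)]
ys = [x * x for x in xs]

model = fit_boosted_stumps(xs, ys)
mse_baseline = mean_squared_error(ys, [model[0]] * len(xs))
mse_boosted = mean_squared_error(ys, [predict(model, x) for x in xs])
print(f"baseline MSE: {mse_baseline:.2f}, boosted MSE: {mse_boosted:.2f}")
```

Each individual stump is a very weak model, but their sum fits the training data much better than the mean alone, which is the core idea behind boosted trees.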

Deep learning models can be trained on the Peltarion Platform; boosted trees currently cannot. If you are familiar with Python, I would recommend the XGBoost package for training boosted trees.

Does deep learning have some advantage over gradient boosted trees?

Sadly there is no clear conclusion on whether boosted trees are better or worse than deep learning models in general. Which method to opt for depends very much on the amount of available data and how difficult the task is.

Historically, the consensus has leaned in favour of traditional machine learning methods such as boosted trees. However, more recent deep learning models such as TabNet have outperformed boosted trees on multiple datasets.

In our internal investigation into this at Peltarion, we found that there are many cases where deep learning is preferable. One consistent pattern in our experiments was that the more data (rows) is available, the larger the benefit of using deep learning models.

If it is feasible, and even small improvements in performance matter for your case, I would recommend trying both approaches, since which method is preferable is very case specific.
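One way to try both approaches side by side is to score them with the same cross-validation splits. The sketch below assumes scikit-learn is installed and uses stand-ins: GradientBoostingClassifier in place of XGBoost and a small MLPClassifier in place of a full deep learning model such as TabNet. The synthetic dataset and all settings are placeholders.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic tabular data (an assumption); swap in your own features and target.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)

candidates = {
    "boosted_trees": GradientBoostingClassifier(random_state=0),
    # Neural networks usually need scaled inputs, so scaling is part of the pipeline.
    "neural_network": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=500, random_state=0),
    ),
}

# Mean accuracy over the same 5 cross-validation folds for each candidate.
scores = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in candidates.items()
}
for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Whichever model scores higher here says nothing about your data; the point is only that both are evaluated under identical conditions before you pick one.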
