Very simply: a plain decision tree, grown deep enough to fit its training data, usually overfits (and therefore performs very badly out of sample). So the important part isn't the tree but the boosting: how you go from an ensemble of weak learners to something that works.
And this boosting generalises to any learner; you can apply it to regression too. Again, the boosting part is really the key. The innovation isn't a new technique either; it is just the aggressive application of computing power to these problems.
They are the same concept under the hood, but a GBDT is an ensemble model: it uses a number of trees in tandem, each one grown to improve the performance of the overall model.
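To make "trees in tandem" a bit more concrete, here is a rough sketch of boosting for regression with squared loss, where each new weak learner is fitted to the residuals of the ensemble so far. It is a toy, not how any particular library does it: the names `fit_stump`, `boost`, `learning_rate` and `n_rounds` are made up for illustration, the weak learner is a one-split tree on a single feature, and the data is an arbitrary example. Any other function with the same shape could be passed as `fit_weak`, which is the "generalises to any learner" point.

```python
from typing import Callable, List, Sequence

Learner = Callable[[float], float]


def fit_stump(xs: Sequence[float], targets: Sequence[float]) -> Learner:
    """Fit a depth-1 regression tree: one threshold, two leaf means."""
    best = None
    values = sorted(set(xs))
    # Candidate thresholds are midpoints between neighbouring distinct x values.
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [t for x, t in zip(xs, targets) if x <= thr]
        right = [t for x, t in zip(xs, targets) if x > thr]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, thr, left_mean, right_mean)
    if best is None:
        # All x values identical: fall back to a constant prediction.
        mean = sum(targets) / len(targets)
        return lambda x: mean
    _, thr, left_mean, right_mean = best
    return lambda x: left_mean if x <= thr else right_mean


def boost(xs: Sequence[float], ys: Sequence[float],
          fit_weak: Callable[[Sequence[float], Sequence[float]], Learner] = fit_stump,
          n_rounds: int = 50, learning_rate: float = 0.1) -> Learner:
    """Start from the mean, then repeatedly fit a weak learner to the residuals."""
    base = sum(ys) / len(ys)
    learners: List[Learner] = []
    preds = [base] * len(xs)
    for _ in range(n_rounds):
        # For squared loss the negative gradient is just the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        h = fit_weak(xs, residuals)
        learners.append(h)
        preds = [p + learning_rate * h(x) for p, x in zip(preds, xs)]
    return lambda x: base + learning_rate * sum(h(x) for h in learners)


def mse(model: Learner, xs: Sequence[float], ys: Sequence[float]) -> float:
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Arbitrary toy data: y = x^2 on ten points.
xs = [float(i) for i in range(10)]
ys = [x * x for x in xs]

one_round = boost(xs, ys, n_rounds=1)
fifty_rounds = boost(xs, ys, n_rounds=50)
```

Comparing `mse(one_round, xs, ys)` with `mse(fifty_rounds, xs, ys)` shows the training error dropping as rounds are added. That only tells you the ensemble fits the training data better; whether it holds up out of sample is a separate question that needs held-out data.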