GD vs. SGD

What are the differences between gradient descent and stochastic gradient descent? When would you use one over the other?

Junior

Machine Learning


Gradient descent (GD) and stochastic gradient descent (SGD) are optimization algorithms used to minimize a function, typically a model's loss (error) function, by repeatedly updating the parameters in the direction of the negative gradient.
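One common way to write the two update rules, with parameters theta, learning rate eta, per-example loss ell, and n training examples, is sketched below; the only difference is how the gradient is estimated at each step.

```latex
% Batch gradient descent: one update uses the exact gradient over all n examples
\theta \leftarrow \theta - \eta \,\nabla_\theta J(\theta),
\qquad J(\theta) = \frac{1}{n}\sum_{i=1}^{n} \ell(\theta; x_i, y_i)

% Stochastic gradient descent: one update uses a single randomly drawn example i
\theta \leftarrow \theta - \eta \,\nabla_\theta \ell(\theta; x_i, y_i)
```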

The primary differences between the two are the following:

Gradient Descent (GD)

GD (often called batch gradient descent) computes the gradient of the loss over the entire training set before making a single parameter update. Each update is therefore exact, and the optimization path is smooth and deterministic, but every step requires a full pass over the data, which becomes expensive or infeasible for large datasets.

Stochastic Gradient Descent (SGD)

SGD estimates the gradient from a single randomly chosen training example (or, in common usage, a small random batch) and updates the parameters immediately. Updates are much cheaper and more frequent, and the added noise can help escape shallow local minima and saddle points, but the loss decreases noisily and convergence typically requires a decaying learning rate and more tuning.

When to use one over the other:

Use GD when the dataset is small enough that full-dataset gradients are cheap and when smooth, deterministic convergence is desirable, for example on small convex problems. Use SGD when the dataset is large or arrives as a stream (online learning), when memory is limited, or when frequent, cheap updates matter more than exact gradients, which is the typical situation in deep learning.

Moreover, variations such as mini-batch gradient descent, which balances the benefits of both GD and SGD by computing each update on a small random subset of the data, are the most common choice in practice. The choice between these algorithms ultimately depends on computational resources, dataset size, and the characteristics of the specific problem.
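To make the distinction concrete, here is a minimal NumPy sketch comparing the three update strategies on a toy linear-regression problem; the function names (batch_gd, sgd, minibatch_gd), learning rates, and synthetic data are illustrative choices, not part of any particular library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear-regression data: y = 1 + 3*x plus noise.
n = 1000
X = np.c_[np.ones(n), rng.uniform(-1, 1, n)]   # bias column + one feature
y = X @ np.array([1.0, 3.0]) + 0.1 * rng.standard_normal(n)

def gradient(theta, Xb, yb):
    """Gradient of the mean squared error on the batch (Xb, yb)."""
    return 2.0 / len(yb) * Xb.T @ (Xb @ theta - yb)

def batch_gd(lr=0.1, epochs=200):
    # GD: one update per pass over the FULL dataset.
    theta = np.zeros(2)
    for _ in range(epochs):
        theta -= lr * gradient(theta, X, y)
    return theta

def sgd(lr=0.01, epochs=20):
    # SGD: one update per individual example, visited in shuffled order.
    theta = np.zeros(2)
    for _ in range(epochs):
        for i in rng.permutation(n):
            theta -= lr * gradient(theta, X[i:i+1], y[i:i+1])
    return theta

def minibatch_gd(lr=0.05, epochs=50, batch_size=32):
    # Mini-batch GD: one update per small random subset of the data.
    theta = np.zeros(2)
    for _ in range(epochs):
        idx = rng.permutation(n)
        for start in range(0, n, batch_size):
            b = idx[start:start + batch_size]
            theta -= lr * gradient(theta, X[b], y[b])
    return theta

print("batch GD:   ", batch_gd())      # all three should approach [1, 3]
print("SGD:        ", sgd())
print("mini-batch: ", minibatch_gd())
```

The only thing that changes across the three functions is which rows are passed to gradient() per update; the update rule itself is identical, which is exactly the trade-off the answer describes: cost per step versus noise in each step.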