Stochastic Gradient Descent (SGD) is a widely used optimization algorithm for training machine learning models. The learning rate (step size), a crucial hyperparameter in SGD, directly controls the magnitude of parameter updates. To improve the convergence speed and performance of models, researchers have proposed a variety of advanced learning rate scheduling methods. The objective of this talk is to review advanced learning rate schedules for SGD, including step decay,
cyclical step sizes, and adaptive learning rates. By comprehensively comparing these scheduling methods, we can observe their respective advantages and their applicability to different problems and datasets. Choosing an appropriate learning rate schedule is a crucial step in optimizing the model training process, leading to improved performance and faster convergence.
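As a concrete illustration of the first two schedule families mentioned above, here is a minimal sketch of a step-decay schedule and a triangular cyclical schedule. The hyperparameter values (`lr0`, `gamma`, `step_size`, `base_lr`, `max_lr`, `half_cycle`) are illustrative choices, not values from the talk:

```python
def step_decay(epoch, lr0=0.1, gamma=0.5, step_size=10):
    """Step decay: multiply the initial rate lr0 by gamma
    once every step_size epochs (illustrative hyperparameters)."""
    return lr0 * gamma ** (epoch // step_size)

def cyclical(epoch, base_lr=0.001, max_lr=0.1, half_cycle=5):
    """Triangular cyclical schedule: the rate rises linearly from
    base_lr to max_lr over half_cycle epochs, then falls back,
    repeating with period 2 * half_cycle."""
    cycle_pos = epoch % (2 * half_cycle)
    if cycle_pos < half_cycle:
        frac = cycle_pos / half_cycle
    else:
        frac = 2 - cycle_pos / half_cycle
    return base_lr + (max_lr - base_lr) * frac
```

Adaptive methods such as AdaGrad or Adam instead set a per-parameter rate from accumulated gradient statistics, so they are typically used as drop-in optimizers rather than as a schedule function like the two above.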