Survey and Review, SIAM Review

Survey and Review
SIAM Review (IF 10.8), Pub Date: 2022-08-04, DOI: 10.1137/22n975500
J. M. Sanz-Serna

SIAM Review, Volume 64, Issue 3, Page 515-515, August 2022.
This issue contains two Survey and Review papers. The first, by Qinmeng Zou and Frédéric Magoulès, is “Delayed Gradient Methods for Symmetric and Positive Definite Linear Systems.” Gradient methods are the oldest and simplest algorithms to minimize a real objective function $f(x)$. The $(n+1)$st approximation to the minimizer is defined as $x_{n+1} = x_n - \alpha_n g_n$, where $g_n$ is the gradient $\nabla f(x_n)$ and $\alpha_n>0$ is a steplength that depends on the specific method being used. Old and simple as the idea may be, gradient algorithms are of much current interest in the literature; for instance, they played a major role in the influential survey devoted to optimization in machine learning published in Volume 60, Issue 2 of this journal. The paper by Zou and Magoulès focuses on the quadratic case $f(x) = (1/2)x^TAx-b^Tx$ ($A$ symmetric and positive definite), where finding the minimizer is of course equivalent to solving the linear system $Ax=b$ and the gradient $g_n$ coincides with the residual $Ax_n-b$. Well-known strategies to determine the steplength include steepest descent, where $\alpha_n$ is chosen so as to minimize $f(x_{n+1})$, and minimal gradient (or minimal residual), where one rather minimizes the length of $g_{n+1}$. In both of these strategies, the value of $\alpha_n$ depends only on $g_n$. The term “delayed” in the title of the article refers to methods where the recipe to determine $\alpha_n$ includes information from past gradients $g_{n-1}, g_{n-2}, \dots$ and/or past stepsizes $\alpha_{n-1}, \alpha_{n-2}, \dots$. The numerical experiments reported clearly indicate that such delayed strategies may give rise to algorithms that are competitive with conjugate gradient methods in large ill-conditioned problems. The paper presents a neat summary of the recent results in this area and of the techniques used to derive them.
Mark Van der Boor, Sem C. Borst, Johan S. H. Van Leeuwaarden, and Debankur Mukherjee are the authors of the second paper, “Scalable Load Balancing in Networked Systems: A Survey of Recent Advances.” The problem under consideration is as follows. A dispatcher receives clients that arrive randomly, and her job is to direct them to one of $N\gg 1$ servers. The time required by each client to be served is also random, so that a queue (of random length) of waiting clients will be formed at each server. How should the dispatcher proceed to expedite the service? As one would expect in these days of cloud networks and data systems with massive numbers of individual centers, the problem is currently receiving much attention in the literature. A strategy that suggests itself is the so-called “join the shortest queue” (JSQ), where on their arrival clients are directed to the server having the shortest queue. While JSQ has been proved to possess several favorable properties, it may not be the best option, due to communication overheads: each time a client arrives, the dispatcher has to communicate with all servers to find the lengths of their queues. The paper analyzes, in the limit $N\rightarrow \infty$, many alternative strategies. Nonspecialists will have little difficulty in reading the easily accessible first few sections and may be interested in discovering how even small tweaks in the algorithms may result in substantial improvements of their performance.
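
To make the dispatcher problem concrete, here is a minimal simulation sketch of JSQ. It rests on assumptions that the text above does not state: clients arrive as a Poisson process of rate `load * n_servers`, service times are exponential with rate 1, and uniformly random assignment is used as a baseline for comparison. The names `jsq`, `uniform_random` and `simulate` are hypothetical and chosen only for this illustration.

```python
import random


def jsq(queues, rng):
    """Join the shortest queue: needs the lengths of all N queues.

    Ties between equally short queues are broken uniformly at random.
    """
    shortest = min(queues)
    return rng.choice([i for i, q in enumerate(queues) if q == shortest])


def uniform_random(queues, rng):
    """Baseline (an assumption of this sketch): ignore the queue lengths."""
    return rng.randrange(len(queues))


def simulate(policy, n_servers=20, load=0.9, n_events=100_000, seed=0):
    """Return the time-averaged number of clients per server.

    The system is simulated as a continuous-time Markov chain:
    arrivals occur at total rate load * n_servers, and each busy
    server completes its current client at rate 1.
    """
    rng = random.Random(seed)
    queues = [0] * n_servers          # clients at each server (incl. in service)
    arrival_rate = load * n_servers
    clock = 0.0
    area = 0.0                        # integral of total clients over time
    total_clients = 0
    for _ in range(n_events):
        busy = [i for i, q in enumerate(queues) if q > 0]
        total_rate = arrival_rate + len(busy)
        dt = rng.expovariate(total_rate)   # time until the next event
        area += total_clients * dt
        clock += dt
        if rng.random() < arrival_rate / total_rate:
            queues[policy(queues, rng)] += 1   # arrival, routed by the policy
            total_clients += 1
        else:
            queues[rng.choice(busy)] -= 1      # departure from a busy server
            total_clients -= 1
    return area / (clock * n_servers)
```

Calling `simulate(jsq)` and `simulate(uniform_random)` with the same parameters gives a rough comparison of the two rules. Note that `jsq` reads every entry of `queues` on each arrival, which is the communication overhead mentioned above.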

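For the first paper's setting, the iteration $x_{n+1} = x_n - \alpha_n g_n$ in the quadratic case can be sketched as follows. For steepest descent the minimizing steplength is $\alpha_n = g_n^Tg_n / g_n^TAg_n$, and for minimal gradient it is $\alpha_n = g_n^TAg_n / (Ag_n)^T(Ag_n)$. As one example of a delayed rule, the sketch uses the Barzilai-Borwein steplength. That choice, the function name `gradient_method`, and its parameters are assumptions made for illustration. They are not taken from the editorial or from the surveyed paper.

```python
import numpy as np


def gradient_method(A, b, x0, rule="sd", tol=1e-8, max_iter=10_000):
    """Solve A x = b (A symmetric positive definite) by a gradient method.

    rule = "sd": steepest descent, alpha_n minimizes f(x_{n+1})
    rule = "mg": minimal gradient, alpha_n minimizes ||g_{n+1}||
    rule = "bb": Barzilai-Borwein, a delayed rule that uses x_{n-1}, g_{n-1}
                 (the first step falls back to steepest descent)

    Returns the approximate solution and the number of iterations used.
    """
    x = np.asarray(x0, dtype=float).copy()
    g = A @ x - b                      # gradient = residual A x_n - b
    g_norm0 = np.linalg.norm(g)
    x_prev = g_prev = None
    for n in range(max_iter):
        # Stop once the residual has shrunk by the factor tol.
        if np.linalg.norm(g) <= tol * g_norm0:
            return x, n
        if rule == "bb" and x_prev is not None:
            s = x - x_prev             # s = x_n - x_{n-1}
            y = g - g_prev             # y = g_n - g_{n-1} = A s
            alpha = (s @ s) / (s @ y)
        else:
            Ag = A @ g
            if rule == "mg":
                alpha = (g @ Ag) / (Ag @ Ag)
            else:
                alpha = (g @ g) / (g @ Ag)
        x_prev, g_prev = x, g
        x = x - alpha * g              # x_{n+1} = x_n - alpha_n g_n
        g = A @ x - b
    return x, max_iter
```

With `rule="bb"`, the steplength used at step $n$ equals the steepest descent steplength computed from $g_{n-1}$, so the rule is steepest descent with a one-step delay. The gradient is recomputed as `A @ x - b` at every step for clarity rather than updated recursively.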

Updated: 2022-08-04
