Evolutionary multi-level acyclic graph partitioning
Journal of Heuristics (IF 2.7), Pub Date: 2020-07-15, DOI: 10.1007/s10732-020-09448-8
Orlando Moreira, Merten Popp, Christian Schulz

Directed graphs are widely used to model data flow and execution dependencies in streaming applications. This makes it possible to use graph partitioning algorithms to parallelize execution on multiprocessor architectures under hardware resource constraints. However, due to program memory restrictions in embedded multiprocessor systems, applications need to be divided into parts without cyclic dependencies. We found that this can be done by a subsequent second graph partitioning step with an additional acyclicity constraint. We have four main contributions. First, we show that this more constrained version of the graph partitioning problem is NP-complete and present linear-time heuristics. We then integrate them into an existing multi-level graph partitioning framework to better handle large graphs. This achieves a 9% reduction of the edge cut compared to the previous single-level algorithm. Based on this, we engineer an evolutionary algorithm to further reduce the cut, achieving a 30% reduction on average compared to the state of the art. Finally, we integrate the partitioning heuristics into a graph compiler for an embedded multiprocessor architecture and show that this can reduce the amount of communication for a real-world imaging application and thereby accelerate it by an average of 11%. We show that, using the heuristics, the compiler can emit optimized code for vastly different hardware platforms. In addition, we demonstrate how a custom fitness function for the evolutionary algorithm can be used to optimize other objectives, such as load balancing, if the communication volume is not the dominant concern on a given hardware platform.
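
To make the acyclicity constraint concrete, the following is a minimal Python sketch, not the authors' heuristics. It assumes a directed graph given as a list of edges and a partition given as a mapping from node to block ID. The function names `edge_cut` and `quotient_is_acyclic` are hypothetical, chosen for illustration. The sketch computes the edge cut and checks, with Kahn's topological sort on the quotient graph (one node per block), that no cyclic dependency exists between blocks.

```python
from collections import defaultdict, deque


def edge_cut(edges, block):
    """Count edges whose endpoints lie in different blocks."""
    return sum(1 for u, v in edges if block[u] != block[v])


def quotient_is_acyclic(edges, block):
    """Return True if the graph induced on the blocks has no directed cycle."""
    blocks = set(block.values())
    succ = defaultdict(set)
    for u, v in edges:
        if block[u] != block[v]:
            succ[block[u]].add(block[v])

    # In-degree of every block in the quotient graph.
    indeg = {b: 0 for b in blocks}
    for targets in succ.values():
        for c in targets:
            indeg[c] += 1

    # Kahn's algorithm: repeatedly remove blocks without incoming edges.
    queue = deque(b for b in blocks if indeg[b] == 0)
    removed = 0
    while queue:
        b = queue.popleft()
        removed += 1
        for c in succ.get(b, ()):
            indeg[c] -= 1
            if indeg[c] == 0:
                queue.append(c)
    return removed == len(blocks)


# Illustrative data only: a small DAG and two candidate 2-way partitions.
edges = [("a", "b"), ("b", "c"), ("a", "c")]
good = {"a": 0, "b": 0, "c": 1}  # block 0 -> block 1 only
bad = {"a": 0, "b": 1, "c": 0}   # block 0 -> block 1 -> block 0

print(edge_cut(edges, good), quotient_is_acyclic(edges, good))
print(edge_cut(edges, bad), quotient_is_acyclic(edges, bad))
```

Both partitions cut two edges, but only the first satisfies the constraint. An unconstrained partitioner that minimizes the cut alone would not distinguish between them.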

Updated: 2020-07-15