SDPA: An Optimizer for Program Analysis of Data-Parallel Applications
2018 IEEE 20th International Conference on High Performance …, 2018•ieeexplore.ieee.org
Data-parallel applications have become prevalent with the rapid development of big data technologies. The performance of these applications is one of the most important concerns, and program analysis is a commonly used approach to program optimization. However, because the big data programming model involves many complex operations, such as data partitioning, distribution, and parallelization, each data-parallel application depends on a large amount of complex framework code, even if the application itself is simple. Analyzing the application directly may therefore fail to produce useful results even after a large amount of time. Hence, we present an optimizer, called SDPA, to accelerate and simplify the program analysis of data-parallel applications. SDPA exploits an important feature of data-parallel applications: their execution proceeds in distinct stages. We implement SDPA to accelerate program analysis on Spark. Extensive experiments evaluate the performance and coverage of SDPA on several classical benchmark applications and on real-world applications selected from MLlib. The results show that SDPA can 1) decrease the preprocessing time by 96.4% to 98.8%, 2) reduce the analysis time by approximately 99.8% compared with analyzing entire data-parallel applications directly, and 3) cover the majority of real-world applications.