
SDPA: An Optimizer for Program Analysis of Data-Parallel Applications

F Wang, X Shi, D Yu, Z Ke, H Jin… - 2018 IEEE 20th …, 2018 - ieeexplore.ieee.org

F Wang, X Shi, D Yu, Z Ke, H Jin, S Wu

2018 IEEE 20th International Conference on High Performance …, 2018•ieeexplore.ieee.org

Data-parallel applications have become prevalent with the rapid development of big data technologies. Performance is one of the most important metrics for these applications, and program analysis is a commonly used approach to program optimization. However, the big data programming model involves many complex operations, such as data partitioning, distribution, and parallelization, so even a simple data-parallel application depends on a large amount of complex framework code. Analyzing such an application directly is likely to consume a large amount of time without producing useful results. We therefore present an optimizer, called SDPA, that accelerates and simplifies the program analysis of data-parallel applications. SDPA exploits an important feature of data-parallel applications: their execution proceeds in distinct stages. We implement SDPA to accelerate program analysis for Spark. Extensive experiments evaluate the performance and coverage of SDPA on several classical benchmark applications and on real-world applications selected from MLlib. The results show that SDPA can 1) decrease preprocessing time by 96.4% to 98.8%, 2) reduce analysis time by approximately 99.8% compared with analyzing the entire data-parallel application directly, and 3) cover the majority of real-world applications.
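To put the reported percentages in perspective, a reduction in time can be restated as a speedup factor: a reduction of r% leaves (100 - r)% of the original time, so the speedup is 100 / (100 - r). The sketch below is plain arithmetic on the figures quoted in the abstract; the function name `speedup_from_reduction` is a hypothetical helper, not something taken from the paper.

```python
def speedup_from_reduction(reduction_pct: float) -> float:
    """Convert a percentage reduction in running time into a speedup factor.

    Hypothetical helper for illustration only; not part of SDPA.
    """
    if not 0 <= reduction_pct < 100:
        raise ValueError("reduction_pct must be in the range [0, 100)")
    remaining_fraction = 1.0 - reduction_pct / 100.0  # share of time still spent
    return 1.0 / remaining_fraction


# Figures quoted in the abstract.
reported = [
    ("preprocessing (lower bound)", 96.4),
    ("preprocessing (upper bound)", 98.8),
    ("analysis", 99.8),
]

for label, pct in reported:
    factor = speedup_from_reduction(pct)
    print(f"{label}: {pct}% less time is about {factor:.1f}x faster")
```

Read this way, the preprocessing reduction corresponds to roughly a 28x to 83x speedup, and the analysis-time reduction to roughly 500x, relative to analyzing the whole application directly.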

