Similarity Aware Shuffling for the Distributed Execution of SQL Window Functions - Distributed Applications and Interoperable Systems (DAIS 2017)
Conference Papers Year : 2017

Similarity Aware Shuffling for the Distributed Execution of SQL Window Functions

Abstract

Window functions are extremely useful and have become increasingly popular, allowing ranking, cumulative sums and other analytic aggregations to be computed over a highly flexible and configurable sliding window. This powerful expressiveness naturally comes at the expense of heavy computational requirements which, so far, have been addressed through optimizations around centralized approaches by work from both industry and academia. Distribution and parallelization have the potential to improve performance, but introduce several challenges associated with data distribution that may harm data locality. In this paper, we show how data similarity can be employed across partitions during the distributed execution of these operators to improve data co-locality between instances of a Distributed Query Engine and the associated data storage nodes. Our contribution attains average network gains of 3 times, and these gains are expected to scale as the number of instances increases. In the scenario with 8 nodes, we were able to attain bandwidth and time savings of 7.3 times and 2.61 times, respectively.
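The abstract does not spell out the shuffling algorithm, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It computes a cumulative-sum window function per partition and compares how many rows must cross the network when each partition is processed on a hash-chosen node versus on the storage node that already holds most of its rows. The toy data and all function names (`cumulative_sum_by_partition`, `hash_owners`, `locality_aware_owners`, `rows_moved`) are hypothetical.

```python
import zlib
from collections import Counter, defaultdict
from itertools import accumulate

# Hypothetical toy data: (storage_node, partition_key, value).
ROWS = [
    (0, "a", 10), (0, "a", 20), (0, "b", 5),
    (1, "b", 7), (1, "b", 1), (1, "a", 3),
]
NUM_NODES = 2


def cumulative_sum_by_partition(rows):
    """Roughly SUM(value) OVER (PARTITION BY key ORDER BY value
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)."""
    groups = defaultdict(list)
    for _node, key, value in rows:
        groups[key].append(value)
    return {key: list(accumulate(sorted(vals))) for key, vals in groups.items()}


def hash_owners(rows, num_nodes):
    """Data-oblivious placement: the owner depends only on a hash of the key."""
    keys = {key for _node, key, _value in rows}
    return {key: zlib.crc32(key.encode()) % num_nodes for key in keys}


def locality_aware_owners(rows):
    """Assumed heuristic: process each partition where most of its rows live."""
    counts = defaultdict(Counter)
    for node, key, _value in rows:
        counts[key][node] += 1
    return {key: c.most_common(1)[0][0] for key, c in counts.items()}


def rows_moved(rows, owner_of):
    """Rows stored on a node other than the one that processes their partition."""
    return sum(1 for node, key, _value in rows if owner_of[key] != node)


windows = cumulative_sum_by_partition(ROWS)
aware = locality_aware_owners(ROWS)
oblivious = hash_owners(ROWS, NUM_NODES)
print(windows)
print("locality-aware rows moved:", rows_moved(ROWS, aware))
print("hash-based rows moved:", rows_moved(ROWS, oblivious))
```

Because the locality-aware choice picks, for every partition, the node holding the largest share of its rows, it never moves more rows than any other one-owner-per-partition placement. The similarity-based technique in the paper may differ substantially from this counting heuristic.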
Main file: 450046_1_En_1_Chapter.pdf (808.1 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01800128, version 1 (25-05-2018)

Cite

Fábio Coelho, Miguel Matos, José Pereira, Rui Oliveira. Similarity Aware Shuffling for the Distributed Execution of SQL Window Functions. 17th IFIP International Conference on Distributed Applications and Interoperable Systems (DAIS), Jun 2017, Neuchâtel, Switzerland. pp.3-18, ⟨10.1007/978-3-319-59665-5_1⟩. ⟨hal-01800128⟩