An Architecture for Data Warehousing in Big Data Environments

Bruno Martinho; Maribel Yasmina Santos

doi:10.1007/978-3-319-49944-4_18

Conference Papers Year : 2016


Bruno Martinho


Universidade do Minho = University of Minho [Braga]

Maribel Yasmina Santos


Universidade do Minho = University of Minho [Braga]

Abstract

Recent advances in information technology have increased the capacity to collect and store data, a trend commonly referred to as Big Data. This context raises many challenges, Data Warehousing among them. The main purpose of this work is to propose an architecture for Data Warehousing in Big Data environments that takes as input a data source stored in a traditional Data Warehouse and transforms it into a Data Warehouse in Hive. Before the architecture was proposed and implemented, a benchmark was conducted to measure the processing times of Hive and Impala and to understand how these technologies could be integrated in an architecture where Hive plays the role of the Data Warehouse and Impala is the driving force for data analysis and visualization. The proposed architecture was then implemented using the Hadoop ecosystem, Talend and Tableau, and validated with a data set of more than 100 million records, obtaining satisfactory processing times.
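The benchmarking step mentioned in the abstract (comparing processing times of two query engines on the same queries) could be sketched roughly as follows. This is an illustrative harness only, not the authors' benchmark: the function names `time_query`, `compare_engines` and `fastest_engine` are hypothetical, and each engine is represented by a plain Python callable that accepts a SQL string. In a real setup those callables would wrap a Hive or Impala client connection, which is assumed here and not shown.

```python
import statistics
import time
from typing import Callable, Dict, List

# Hypothetical type: any callable that executes a SQL string on some engine.
QueryRunner = Callable[[str], object]


def time_query(run_query: QueryRunner, sql: str, repeats: int = 3) -> List[float]:
    """Execute `sql` `repeats` times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query(sql)
        timings.append(time.perf_counter() - start)
    return timings


def compare_engines(
    engines: Dict[str, QueryRunner],
    queries: Dict[str, str],
    repeats: int = 3,
) -> Dict[str, Dict[str, float]]:
    """Return {query_name: {engine_name: median_seconds}} for every query and engine."""
    results: Dict[str, Dict[str, float]] = {}
    for query_name, sql in queries.items():
        results[query_name] = {
            engine_name: statistics.median(time_query(run, sql, repeats))
            for engine_name, run in engines.items()
        }
    return results


def fastest_engine(results: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """For each query, name the engine with the lowest median time."""
    return {q: min(times, key=times.get) for q, times in results.items()}


if __name__ == "__main__":
    # Stand-in runners that only sleep; replace them with real client calls.
    engines = {
        "engine_a": lambda sql: time.sleep(0.02),
        "engine_b": lambda sql: time.sleep(0.005),
    }
    # Placeholder query against a hypothetical table name.
    queries = {"row_count": "SELECT COUNT(*) FROM some_fact_table"}
    medians = compare_engines(engines, queries)
    print(medians)
    print(fastest_engine(medians))
```

Using the median over several runs reduces the influence of a single slow run, such as a cold cache, on the comparison. The number of repeats and the choice of median are assumptions of this sketch.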

Main file

432749_1_En_18_Chapter.pdf (940.88 KB)

Origin : Files produced by the author(s)
Licence : CC BY 4.0 - Attribution

https://inria.hal.science/hal-01630532

Submitted on : Tuesday, November 7, 2017, 5:26:51 PM

Last modification on : Friday, December 1, 2023, 12:30:10 PM

Long-term archiving on : Thursday, February 8, 2018, 2:45:50 PM

Dates and versions

hal-01630532 , version 1 (07-11-2017)

Identifiers

HAL Id : hal-01630532 , version 1
DOI : 10.1007/978-3-319-49944-4_18

Cite

Bruno Martinho, Maribel Yasmina Santos. An Architecture for Data Warehousing in Big Data Environments. 10th International Conference on Research and Practical Issues of Enterprise Information Systems (CONFENIS), Dec 2016, Vienna, Austria. pp.237-250, ⟨10.1007/978-3-319-49944-4_18⟩. ⟨hal-01630532⟩
