%0 Conference Proceedings
%T MR-RBAT: Anonymizing Large Transaction Datasets Using MapReduce
%+ Cardiff University
%A Memon, Neelam
%A Shao, Jianhua
%Z Part 1: Data Anonymization and Computation
%< peer-reviewed
%( Lecture Notes in Computer Science
%B 29th IFIP Annual Conference on Data and Applications Security and Privacy (DBSEC)
%C Fairfax, VA, United States
%Y Pierangela Samarati
%I Springer International Publishing
%3 Data and Applications Security and Privacy XXIX
%V LNCS-9149
%P 3-18
%8 2015-07-13
%D 2015
%R 10.1007/978-3-319-20810-7_1
%Z Computer Science [cs]; Conference papers
%X Privacy is a concern when publishing transaction data for applications such as marketing research and biomedical studies. While methods for anonymizing transaction data exist, they are designed to run on a single machine and hence do not scale to large datasets. Recently, MapReduce has emerged as a highly scalable platform for data-intensive applications. In this paper, we consider how MapReduce may be used to provide scalability in transaction anonymization. More specifically, we consider how RBAT may be parallelized using MapReduce. RBAT is a sequential method that has some desirable features for transaction anonymization, but its highly iterative nature makes its parallelization challenging. A direct implementation of RBAT on MapReduce using data partitioning alone can result in significant overhead, which can offset the gains from parallel processing. We propose MR-RBAT, which employs two parameters to control parallelization overhead. Our experimental results show that MR-RBAT can scale linearly to large datasets and can retain good data utility.
%G English
%Z TC 11
%Z WG 11.3
%2 https://inria.hal.science/hal-01745812/document
%2 https://inria.hal.science/hal-01745812/file/340025_1_En_1_Chapter.pdf
%L hal-01745812
%U https://inria.hal.science/hal-01745812
%~ IFIP-LNCS
%~ IFIP
%~ IFIP-TC
%~ IFIP-WG
%~ IFIP-TC11
%~ IFIP-WG11-3
%~ IFIP-DBSEC
%~ IFIP-LNCS-9149