%0 Conference Proceedings
%T Youtube Revisited: On the Importance of Correct Measurement Methodology
%+ Helsingin yliopisto = Helsingfors universitet = University of Helsinki
%A Karkulahti, Ossi
%A Kangasharju, Jussi
%Z Part 1: Measurement Tools and Methods
%< peer-reviewed
%( Lecture Notes in Computer Science
%B 7th Workshop on Traffic Monitoring and Analysis (TMA)
%C Barcelona, Spain
%Y Moritz Steiner
%Y Pere Barlet-Ros
%Y Olivier Bonaventure
%3 Traffic Monitoring and Analysis
%V LNCS-9053
%P 17-30
%8 2015-04-21
%D 2015
%R 10.1007/978-3-319-17172-2_2
%Z Computer Science [cs]
%Z Computer Science [cs]/Networking and Internet Architecture [cs.NI]
%Z Conference papers
%X Measurements of large systems typically rely on sampling to keep the measurement effort practical. For example, YouTube’s video popularity has been measured by crawling either related videos or videos belonging to certain categories, or by using a list of, e.g., the most recent videos as the data source. In this paper we demonstrate that all these methods lead to a biased sample of data when compared to a random sample. We demonstrate the bias by comparing the differently sampled data sets in terms of commonly used metrics, such as video popularity, age, length, or category. The results show that different sampling methods lead to significantly different values in the metrics, thus potentially leading to very different conclusions about the system under study. The goal of the paper is not to provide yet-another-set-of-numbers for YouTube; instead we seek to emphasize the importance of using correct measurement methodologies and understanding the inherent weaknesses of different methodologies.
%G English
%Z TC 6
%Z WG 6.6
%2 https://hal.science/hal-01411177/document
%2 https://hal.science/hal-01411177/file/336978_1_En_2_Chapter.pdf
%L hal-01411177
%U https://hal.science/hal-01411177
%~ IFIP-LNCS
%~ IFIP
%~ IFIP-TC
%~ IFIP-TC6
%~ IFIP-TMA
%~ IFIP-WG6-6
%~ IFIP-LNCS-9053