%0 Conference Proceedings %T Online Event Correlations Analysis in System Logs of Large-Scale Cluster Systems %+ Institute of Computing Technology %+ China Mobile [Hong Kong] %A Zhou, Wei %A Zhan, Jianfeng %A Meng, Dan %A Zhang, Zhihong %< peer-reviewed %( Lecture Notes in Computer Science %B IFIP International Conference on Network and Parallel Computing (NPC) %C Zhengzhou, China %Y Chen Ding; Zhiyuan Shao; Ran Zheng %I Springer %3 Network and Parallel Computing %V LNCS-6289 %P 262-276 %8 2010-09-13 %D 2010 %R 10.1007/978-3-642-15672-4_23 %K System logs %K online log analysis %K event correlations %K online event prediction %Z Computer Science [cs]/Digital Libraries [cs.DL]Conference papers %X It has long been recognized that failure events are correlated, not independent. Previous research efforts have shown that the correlation analysis of system logs is helpful for resource allocation, job scheduling, and proactive management. However, previous log analysis methods analyze historical logs offline and thus fail to capture dynamic changes in system errors and failures. In this paper, we propose an online log analysis approach to mine event correlations in system logs of large-scale cluster systems. Our contributions are three-fold: first, we analyze the event correlations of system logs of a 260-node production Hadoop cluster system, and the results show that the correlation rules of logs change dramatically in different periods; second, we present an online log analysis algorithm, Apriori-SO; third, based on the online event correlations mining, we present an online event prediction method that can predict diverse failure events in great detail. 
The experimental results on a 260-node production Hadoop cluster system show that our online log analysis algorithm can analyze the log streams to obtain event correlation rules in soft real time, and our online event prediction method can achieve higher precision and recall than the offline log analysis approach. %G English %2 https://inria.hal.science/hal-01054978/document %2 https://inria.hal.science/hal-01054978/file/Online_Event_Correlations_Analysis_in_System_logs_of_Large-scale_Cluster_Systems.pdf %L hal-01054978 %U https://inria.hal.science/hal-01054978 %~ IFIP-LNCS %~ IFIP %~ IFIP-LNCS-6289 %~ IFIP-NPC %~ IFIP-2010