Abstract
The amount of data on the Internet is growing sharply. Some data, such as log files, are enormous and contain valuable hidden patterns. In other words, a log file is a set of recorded events that carries information useful for improving web server performance, stabilizing and controlling server loads, and speeding up responses to users. However, analyzing massive data takes a long time and requires powerful hardware. Also, the performance of sequential pattern mining methods is usually unsatisfactory for such data. This paper proposes a novel and advanced parallel method, built on the Apache Spark platform, for finding patterns in log files, such as frequent patterns (e.g., URL, IP, Status Code), how users accessed files, the number of errors, and the most common errors. Experimental results demonstrate that the proposed method's run time on three datasets is significantly less than that of four rival pattern mining methods.
Original language | English |
---|---|
Title of host publication | Proceedings of the 7th International Conference on Web Research (ICWR), 19-20 May 2021, Tehran, Iran |
Publisher | IEEE |
Pages | 114-118 |
Number of pages | 5 |
ISBN (Print) | 9781665404266 |
Publication status | Published - 2021 |
Event | International Conference on Web Research - Duration: 19 May 2021 → … |
Conference
Conference | International Conference on Web Research |
---|---|
Period | 19/05/21 → … |