Article: The Ontology-Based Event Mining Tools for Monitoring Global Processes by Abrosimova, Shalyaeva, and Lyadova

Abstract

In this section, the article defines the Event Mining Platform, a platform that analyzes information derived from internet sources and organizes the data into the event log format so that process mining can be applied. The platform's monitoring system is characterized by a multi-faceted ontology (a combination of several types of ontology). The section also states the main goal of the article: giving users the knowledge to independently extend the domain under study with minimal limitations.

Introduction

The internet is a source of all types of information, so the data extracted and processed from it has no organizational, geographical, temporal, or disciplinary limitations. The section highlights an approach that builds visual models of processes retrieved from the internet by applying process mining methodologies, which process structured data from event logs. This is made possible by event log formation mechanisms that process the unstructured data retrieved from the internet. Ontology is at the core of this whole task.

Consequently, this part provides the thesis statement of the article: the development of an effective solution for retrieving data based on the problem’s ontology and user requirements.

Methods and Tools for Data Extraction

Various mechanisms for data extraction exist, and selecting the optimal method and tool is important for improving data access. The mechanisms presented in the article include:

| Mechanism/Tool | Description |
| --- | --- |
| Web-packers | Use HTML markup to extract the required data. The method requires determining the data structure and then extracting the data directly. |
| DOM-tree | Allows data to be extracted directly from the source. |
| XPath | Provides a means of analyzing the DOM-tree and uses XML syntax to describe element locations. |
| Regular expressions | Applicable to extracting limited data such as email addresses. |
| GATE | Allows web pages with HTML markup to be imported. |
| RapidMiner | Important in machine learning and data mining processes, for instance ETL processes and predictive analysis. |
| Tomita-parser | Extracts facts from texts. It uses a GLR algorithm to parse texts against grammars. |
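
Two of the mechanisms above, regular expressions and XPath-style queries over a DOM-tree, can be illustrated with a minimal Python sketch. The page fragment, the class names, and the function names are assumptions made for illustration and do not come from the article. The sketch also assumes well-formed markup, which real web pages often are not.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment used only for illustration.
PAGE = """<html><body>
  <div class="event"><span class="title">Summit opens</span><span class="date">2018-10-17</span></div>
  <div class="event"><span class="title">Treaty signed</span><span class="date">2018-10-19</span></div>
  <p>Contact: press@example.org or info@example.org</p>
</body></html>"""

# A deliberately simple email pattern; it does not cover every valid address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text):
    """Regular-expression extraction of limited data (email addresses)."""
    return EMAIL_RE.findall(text)


def extract_events(markup):
    """DOM-tree extraction using the XPath subset supported by ElementTree."""
    root = ET.fromstring(markup)
    events = []
    for node in root.findall(".//div[@class='event']"):
        title = node.findtext("span[@class='title']")
        date = node.findtext("span[@class='date']")
        events.append((title, date))
    return events


if __name__ == "__main__":
    print(extract_emails(PAGE))
    print(extract_events(PAGE))
```

The regular expression works on raw text and needs no knowledge of the page structure, while the XPath-style query depends on the data structure being determined beforehand, which matches the trade-off described in the table.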

Ontology-Based Event Mining Platform

The process of event mining occurs in several stages:

  1. Problem ontology development, where a user can incorporate new relations, concepts and event types.
  2. Sources ontology setting. A description of the sources of information can be provided in this stage.
  3. Starting request formation.
  4. Data search. Information retrieval methods and tools are used in this stage.
  5. Extraction of events and facts from the results of stage 4.
  6. Advanced search. The user adds events and information linked to the starting events based on the concepts and relationships in the ontology.
  7. Representation of the retrieved information and extracted facts in a log format.
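
The final stage, representing extracted events in a log format, can be sketched in Python as follows. The article's summary does not specify the log columns, so the fields used here (case identifier, activity, timestamp, source) are an assumption based on the layout commonly expected by process mining tools, and the names `Event` and `to_event_log` are hypothetical.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    case_id: str    # assumed: identifier of the process instance the event belongs to
    activity: str   # assumed: name of the event type
    timestamp: str  # ISO 8601 string, so lexical order equals chronological order
    source: str     # assumed: where the event was found


def to_event_log(events):
    """Serialize events as CSV, ordered by case and then by time."""
    ordered = sorted(events, key=lambda e: (e.case_id, e.timestamp))
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["case_id", "activity", "timestamp", "source"])
    for e in ordered:
        writer.writerow([e.case_id, e.activity, e.timestamp, e.source])
    return buf.getvalue()


if __name__ == "__main__":
    sample = [
        Event("c1", "Treaty signed", "2018-10-19T12:00:00", "news.example"),
        Event("c1", "Summit opens", "2018-10-17T09:00:00", "news.example"),
    ]
    print(to_event_log(sample))
```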

Algorithm for Searching and Structuring the Information on Events

Searching for and structuring event information is time-consuming and raises further problems, such as duplicate information and the assignment of main concepts to the wrong semantic content. These issues can be addressed by using an ontology. As indicated in the article, extending the ontology involves automating the process based on the search results. This is a difficult task that requires an algorithm for optimization, and the article presents the algorithm used by the system.
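
How an ontology can help with duplicate information and concept assignment can be illustrated with a small Python sketch. This is not the algorithm from the article. The mini-ontology, the term lists, and the rule that two reports with the same concept and date are duplicates are all assumptions chosen to keep the example short.

```python
import re

# Hypothetical mini-ontology: each concept lists surface terms that denote it.
ONTOLOGY = {
    "Election": {"election", "vote", "ballot"},
    "Protest": {"protest", "rally", "demonstration"},
}


def assign_concept(text, ontology=ONTOLOGY):
    """Return the concept whose terms overlap most with the text, or None."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    best, best_hits = None, 0
    for concept in sorted(ontology):  # sorted for a deterministic tie-break
        hits = len(words & ontology[concept])
        if hits > best_hits:
            best, best_hits = concept, hits
    return best


def deduplicate(records):
    """Keep one record per (concept, date); unclassified records are all kept."""
    seen = set()
    unique = []
    for text, date in records:
        concept = assign_concept(text)
        key = (concept, date)
        if concept is not None and key in seen:
            continue  # assumed rule: same concept on the same date is a duplicate
        seen.add(key)
        unique.append((concept, date, text))
    return unique


if __name__ == "__main__":
    reports = [
        ("Thousands rally in the capital", "2018-05-01"),
        ("Mass demonstration in capital city", "2018-05-01"),
        ("Voters cast ballot", "2018-05-02"),
    ]
    for row in deduplicate(reports):
        print(row)
```

The two differently worded protest reports collapse into one record because both map to the same concept, which plain string comparison would not detect.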

Example of Information Retrieval and Data Structuring at Events Analysis

The paper presents an experiment with RapidMiner (RMonto extension). The stages of the experiment were determined first, followed by the procedure for data retrieval.

The experiment also slightly modified the algorithm discussed earlier to suit the demands of RapidMiner. The new algorithm entailed the following steps:

  1. Loading the ontology and performing reasoning.
  2. Data retrieval by SPARQL-query.
  3. Conversion of the results from the queries to a word list.
  4. Web-page loading.

The resulting structured data was in a tabular format, though the article reports difficulties with the table created.

Conclusion

The most difficult stage when using processing tools is transforming the ontology into the needed format. The existing tools lack various functionalities and are mostly focused on the tasks they already handle well.

Disadvantage of the Paper

The paper provides a single experiment based on a single tool, so it cannot be concluded that the available tools in general present difficulties in transforming an ontology into the right format. The difficulty may be specific to RapidMiner or to the extension used in the experiment. Extensive experiments with other tools and extensions are needed to justify the paper's claim.

Work Cited

Abrosimova, Polina, Irina Shalyaeva, and Lyudmila Lyadova. "The Ontology-Based Event Mining Tools for Monitoring Global Processes." 2018 IEEE 12th International Conference on Application of Information and Communication Technologies (AICT). IEEE, 2018.