Technology portfolio — business process mining, digital twins and knowledge graphs

Enefit IT
6 min read · Mar 18, 2021

Written by: Kristjan Eljand | Technology Scout

In the last two years, we have been systematically gathering and evaluating new digital technologies. This article is the first introduction to our digital technology portfolio, giving our opinion on business process mining, digital twins, and knowledge graphs.

Our digital technology portfolio consists of techniques and concepts that can be implemented by software engineers or data scientists. Each technology is assessed from the perspective of business value (horizontal axis on the chart below) and technological maturity (vertical axis). This time, we describe our view on business process mining, digital twins, and knowledge graphs: three closely connected technologies.

Figure 2. Digital Technology Portfolio of E-Lab. Each point on the chart represents one technology. The expected business value is shown on the horizontal axis and the technological maturity on the vertical axis. Green: technologies that are worth testing and demonstrating. Dark blue: wait until the tech becomes more mature or until the expected business value increases.

Business Process Mining

Business Process Mining (BPM) is a family of techniques that support the analysis of business processes based on event logs. In other words, BPM takes historical data about a business process and uses it to visualize the actual workflow, calculate process statistics, analyze the effectiveness of process steps or participants, and predict the outcome of ongoing processes.
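To make the event-log idea concrete, here is a minimal sketch in Python with pandas. The log, its column names, and the timestamps are hypothetical examples of what a process system might export:

    import pandas as pd

    # Hypothetical event log: one row per recorded process step.
    log = pd.DataFrame({
        "case_id":   ["A", "A", "A", "B", "B", "B"],
        "activity":  ["Register", "Approve", "Pay"] * 2,
        "timestamp": pd.to_datetime([
            "2021-03-01 09:00", "2021-03-01 11:30", "2021-03-02 10:00",
            "2021-03-01 10:00", "2021-03-03 09:00", "2021-03-04 16:00",
        ]),
    })

    # End-to-end duration of each process case (first to last event).
    durations = (
        log.groupby("case_id")["timestamp"]
           .agg(["min", "max"])
           .assign(duration=lambda d: d["max"] - d["min"])
    )
    print(durations["duration"])

    # How many times is each activity carried out?
    print(log["activity"].value_counts())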

In addition, modern BPM tools allow you to define the parameters of the process and run various simulations to see what parts of the process are or could become problematic. The animation below is an example of business process simulation with Apromore.

Figure 3. Simulating the process flow with Apromore

The use-cases of BPM can be grouped as follows:

1. Automatic process discovery — takes a process event log as an input and produces an as-is model of the business process (a sketch follows this list). It has several advantages compared to manual business process modelling: A. It shows how processes are “actually” carried out, not how they are “designed” to be carried out. B. It makes automatic updates of the process map possible. C. It can maintain an unlimited number of processes.

2. Descriptive process mining — analyzes the effect of different process steps and participants on the duration and the result of the process. Descriptive mining answers questions like “what is the average duration of each process step?” or “how many times is a certain activity carried out?”.

3. Process conformance checking — compares the desired process model with the actual process flow and outputs a list of violations. Conformance checking answers questions like “which process cases don’t meet the quality criteria?”.

4. Process variant analysis — compares different versions of the process flow with each other and answers questions like “what distinguishes fast/successful process flows from slow/unsuccessful ones?”.

5. Predictive process mining — predicts the process outcome and answers questions like “what is the probability that my ongoing process will be successful?”.

6. What-if analysis — simulates the process flow based on predetermined parameters or historical patterns and answers questions like “what if the amount of incoming work doubles next month?” or “how much can we increase the speed of the process by adding two additional employees?”.
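As an illustration of use-cases 1 and 3, the sketch below uses the open-source pm4py library. The CSV file and its column names are hypothetical placeholders for whatever your source system exports:

    import pandas as pd
    import pm4py  # open-source process mining library (pip install pm4py)

    # Load a raw event log and tell pm4py which columns identify the
    # case, the activity, and the timestamp.
    df = pd.read_csv("events.csv")
    log = pm4py.format_dataframe(
        df, case_id="case_id", activity_key="activity", timestamp_key="timestamp"
    )

    # 1. Automatic process discovery: the directly-follows graph is an
    # as-is model mined straight from the log.
    dfg, start_activities, end_activities = pm4py.discover_dfg(log)
    pm4py.view_dfg(dfg, start_activities, end_activities)

    # 3. Conformance checking: replay the log on a discovered (or
    # designed) Petri net and collect the deviations.
    net, initial_marking, final_marking = pm4py.discover_petri_net_inductive(log)
    diagnostics = pm4py.conformance_diagnostics_token_based_replay(
        log, net, initial_marking, final_marking
    )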

Evaluation

BPM has one of the highest technological maturity scores in our current technology portfolio: its performance has been validated and the tools are generally available. The expected business value is mediocre, and the reason is simple: many business processes don’t generate event logs of high enough quality to enable BPM. That said, we suggest testing the technology on real use-cases to understand its value to the company.

Figure 4. The position of Business Process Mining in our Technology Portfolio

Digital Twin

A Digital Twin is a digital copy of a physical object (a device, process, person, or place). In many ways, it is simply about gathering near real-time data about an object or process with enough granularity to see its behavioral patterns. We argue that it only makes sense to treat digital twins as a separate concept if they have the following features:

1. Firstly, it needs a near real-time relation with the physical object through the data that the object generates.

2. A digital twin needs to be represented in relation to other digital twins (a change in one twin should automatically affect the other twins).

3. The digital twin network needs to allow simulations.

If those three criteria are fulfilled, we can say that we have a good digital representation of an object or process that we can use for statistical analysis or “what-if” simulations (a toy sketch follows the list below). This representation can then be used for:

  • workflow optimization — optimizing the underlying process by trying out various options.
  • stress testing — testing the limits of a system before it breaks.
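As a toy sketch of these two uses, consider two linked twins where a change in one propagates to the other. The classes, parameters, and update rule below are purely hypothetical illustrations, not a real digital twin framework:

    from dataclasses import dataclass

    @dataclass
    class PumpTwin:
        flow_rate: float  # m3/h, updated from near real-time sensor data

    @dataclass
    class TankTwin:
        level: float      # m3, current fill level
        capacity: float   # m3

        def step(self, pump: PumpTwin, hours: float) -> bool:
            # A change in the pump twin propagates into the tank twin.
            self.level += pump.flow_rate * hours
            return self.level <= self.capacity  # False means overflow

    # Stress test: how much extra inflow can the tank absorb before it breaks?
    for multiplier in (1.0, 2.0, 5.0):
        scenario = TankTwin(level=50.0, capacity=100.0)
        ok = scenario.step(PumpTwin(flow_rate=10.0 * multiplier), hours=4)
        print(f"{multiplier}x inflow for 4h -> {'OK' if ok else 'overflow'}")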

NB: When we discover a business process with BPM and are able to simulate its workflow, we can say that we have a good digital twin of that process.

Evaluation

We believe that digital twins have high business value potential: an effective digital twin allows decisions to be made not on historical data but on simulated data. Thus, it promises better decisions at a lower cost (of failure).

That said, the technological maturity is low. Today, the key problem of digital twins is not how to set them up but how to create/train the models that determine how the parameters of digital twin X should change when the parameters of digital twin Y change. If we model this relationship incorrectly, we won’t get any meaningful results. Our suggestion is to proceed with caution: wait until the technology gets more mature or test it on non-critical use-cases.

Knowledge graphs

A knowledge graph is data that describes entities and their relationships in a graph-structured data model. I know from personal experience that this definition is not easy to grasp, so let me clarify it by comparing the traditional table-based representation with the graph representation.

Imagine that we have some data about our suppliers, the products they sell, and the purchases we have made from those suppliers. In a relational database, we would store the supplier information in one table, the items they offer in another, and the purchases we have made in a third.

A graph database stores the same sort of data, but it is also able to store the relationships between the entities, so we can see those relationships without having to run JOIN queries.
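As a small sketch of the same example in Python with the networkx library (all node names are made up), the relationships become first-class data that we can traverse directly:

    import networkx as nx

    g = nx.DiGraph()

    # Entities become nodes...
    g.add_node("ACME", kind="supplier")
    g.add_node("Transformer", kind="product")
    g.add_node("PO-001", kind="purchase")

    # ...and relationships become edges instead of foreign keys.
    g.add_edge("ACME", "Transformer", relation="SELLS")
    g.add_edge("PO-001", "Transformer", relation="CONTAINS")
    g.add_edge("PO-001", "ACME", relation="ORDERED_FROM")

    # Traverse the relationships directly, no JOINs needed.
    for purchase, target, data in g.out_edges("PO-001", data=True):
        print(purchase, data["relation"], target)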

A knowledge graph is aimed at managing data that has many entity types and is highly interconnected. Such data can also be represented in a traditional SQL database, but challenges arise in modelling and querying it. It is reasonable to consider a graph-based data representation if:

  • the data is highly interrelated,
  • the data schema needs to be flexible or we don’t know what the correct schema is,
  • we need a data structure that is closer to the way people think.

NB: The data of digital twins (objects and their relations) is also regularly represented as a knowledge graph. Likewise, a business process, with its interconnected tasks and participants, can be represented as a knowledge graph.

Evaluation

In our opinion, knowledge graphs have mediocre business value in the energy sector, while their technological maturity is above average. The main problem with the graph representation of data is that we haven’t seen a high-value use-case for it (yet :)).

So, this was the first peek into our digital technology portfolio. I hope everybody found something interesting and useful.
