
# Process 3 key attributes how to
One of the big advantages of process mining is that it starts with the data that is already there, and usually it starts very simple. There is no need to first set up a data collection framework. Instead, you can use data that accumulates as a byproduct of the increasing automation and digitization of your business processes. These data are collected right now by the various IT systems you already have in place to support your business.

Sometimes people are worried that they do not have the right data, but in practice this is rarely the case. There are so many workflow systems, CRM systems, ERP systems, delivery notes, request, complaint, ticketing, or order systems, etc., that most organizations have lots of data.

The starting point for process mining is a so-called event log. But what exactly is an event log? Where do event logs come from? And how do you know whether your data satisfies the requirements to apply process mining? This is what this introductory chapter is about.

The core idea of process mining is to analyze data from a process perspective. You want to answer questions such as “What does my As-is process currently look like?”, “Are there waste and unnecessary steps that could be eliminated?”, “Where are the bottlenecks?”, and “Are there deviations from the rules and prescribed processes?”. To be able to do that, process mining approaches data with a mental model that maps the data to a process view.

To understand what this means, let us first take a look at another mental model: the mental model for classification techniques in data mining. Imagine that you have a widget factory and you want to understand which kinds of customers are buying your widgets. On the left in Figure 1, you see a very simple example of a data set. There are columns for the attributes Name, Salary, Sex, Age, and Buy widget. Each row forms one instance in the data set. An instance is a learning example that can be used for learning the classification rules.

Figure 1: Data mining example: The classification target class needs to be configured.

Before the classification algorithm can be started, one needs to determine which of the columns is the target class. Because we want to find out who is buying the widgets, we would make the Buy widget column the classification target. A data mining tool would then be able to construct a decision tree like the one depicted on the right in Figure 1. The result shows that only males with a high salary are buying the widgets. If we wanted to derive rules for another attribute, for example, to predict how old the customers who buy our widgets typically are, then the Age column would be the classification target.

For process mining, we have a slightly different mental model, because we look at the data from a process perspective. In Figure 2, you see a simplified example data set from a call center process. In contrast to the data mining example above, an individual row does not represent a complete process instance, but just an event. Because a data set that is used for process mining consists of events, this kind of data is often referred to as an event log.

In an event log, an activity forms one step in your process. For example, a document authoring process may consist of the steps ‘Create’, ‘Update’, ‘Submit’, ‘Approve’, ‘Request rework’, ‘Revise’, ‘Publish’, ‘Discard’ (performed by different people such as authors and editors). Some of these steps might occur more than once for a single case while not all of them need to happen every time. There should be names for different process steps or status changes that were performed in the process. If you have only one entry (one row) for each case, then your data is not detailed enough. Your data needs to be on the transactional level (you should have access to the history of each case) and should not be aggregated to the case level.

Events can sometimes record not only activities you care about, but also less interesting technical information. Look for events which describe the interesting activities for your process from the business perspective. While you can also filter out less relevant events later in the analysis, it is important to make sure that the relevant process steps are captured in your data.

If you don’t have a sequentialized log file, you need timestamps to determine the order of the activities in your process. Sometimes, you have a start and a complete timestamp for each activity in the process. This allows you to analyze the processing time of an activity (the time someone actively spent on performing that task), also called execution time or activity handling time. Refer to Including Multiple Timestamp Columns to learn how to include multiple timestamps in Disco.
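
As a rough illustration of what an event log on the transactional level looks like, the following Python sketch builds a tiny, made-up log for a document authoring process. It groups the rows by case, orders each case history by timestamp, and flags data in which every case has only one row. The column names (`case_id`, `activity`, `timestamp`), the sample rows, and the helper functions are assumptions made for this example; they are not taken from Disco or from the figures.

```python
from collections import defaultdict
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M"

# Hypothetical event log: one row per event, deliberately not sorted by time.
EVENTS = [
    {"case_id": "doc-1", "activity": "Update", "timestamp": "2024-01-02 09:30"},
    {"case_id": "doc-1", "activity": "Create", "timestamp": "2024-01-01 10:00"},
    {"case_id": "doc-1", "activity": "Submit", "timestamp": "2024-01-03 14:15"},
    {"case_id": "doc-2", "activity": "Create", "timestamp": "2024-01-02 08:00"},
    {"case_id": "doc-2", "activity": "Discard", "timestamp": "2024-01-02 16:45"},
    {"case_id": "doc-3", "activity": "Create", "timestamp": "2024-01-04 11:20"},
]


def group_by_case(events):
    """Collect the events of each case and sort every case history by timestamp."""
    cases = defaultdict(list)
    for event in events:
        cases[event["case_id"]].append(event)
    for history in cases.values():
        history.sort(key=lambda e: datetime.strptime(e["timestamp"], TIME_FORMAT))
    return dict(cases)


def looks_aggregated(cases):
    """True if every case has exactly one row, i.e. the data has no case history."""
    return all(len(history) == 1 for history in cases.values())


def trace(history):
    """Return the ordered list of activity names for one case."""
    return [event["activity"] for event in history]


cases = group_by_case(EVENTS)
for case_id, history in cases.items():
    print(case_id, " -> ".join(trace(history)))
print("Aggregated to case level:", looks_aggregated(cases))
```

Sorting by timestamp is only needed when the rows are not already in sequence. A single case with one event (like `doc-3` here) is unremarkable; the check only raises a flag when no case has a history at all.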

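When both a start and a complete timestamp are recorded for an activity, its processing time is the difference between the two. A minimal Python sketch follows, assuming hypothetical column names `start` and `complete` and made-up sample rows. It illustrates the idea only and does not describe how Disco computes these values internally.

```python
from datetime import datetime, timedelta

TIME_FORMAT = "%Y-%m-%d %H:%M"

# Hypothetical events that carry both a start and a complete timestamp.
EVENTS = [
    {"case_id": "doc-1", "activity": "Create",
     "start": "2024-01-01 10:00", "complete": "2024-01-01 10:45"},
    {"case_id": "doc-1", "activity": "Approve",
     "start": "2024-01-03 09:00", "complete": "2024-01-03 09:10"},
    {"case_id": "doc-2", "activity": "Create",
     "start": "2024-01-02 08:00", "complete": "2024-01-02 09:15"},
]


def processing_time(event):
    """Time actively spent on one activity: complete minus start."""
    start = datetime.strptime(event["start"], TIME_FORMAT)
    complete = datetime.strptime(event["complete"], TIME_FORMAT)
    return complete - start


def mean_processing_time(events, activity):
    """Average processing time of one activity, or None if it never occurs."""
    durations = [processing_time(e) for e in events if e["activity"] == activity]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)


for event in EVENTS:
    print(event["case_id"], event["activity"], processing_time(event))
print("Mean 'Create' time:", mean_processing_time(EVENTS, "Create"))
```

With only a single timestamp per event, this calculation is not possible; the timestamps can then still be used to order the activities within each case.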