The mapping of a log defines which variables in the data are mapped onto the specific characteristics of the log.
For an eventlog, these characteristics are: case_id, activity_id, activity_instance_id, lifecycle_id, timestamp, and resource.
For an activitylog, they are: case_id, activity_id, timestamps, and resource.
More information on these characteristics can be found here. Each of these
can be modified to approach event logs from a different angle. This can
be done using the eventlog() or activitylog() functions, the auxiliary
set_-functions, or an existing mapping.
library(bupaR)
eventlog()/activitylog()
The eventlog() and activitylog() functions are not only used to
instantiate a log object, but can also be used to modify one, by passing
an existing log object as input and setting only the identifiers one
wants to change.
For example, consider the traffic_fines data. We could change the
case_id argument to the vehicleclass column as follows (this is a purely
hypothetical example). You will see that the number of cases has changed
after this modification.
traffic_fines %>%
  eventlog(case_id = "vehicleclass")
## # Log of 34724 events consisting of:
## 4 traces
## 4 cases
## 34724 instances of 11 activities
## 16 resources
## Events occurred from 2006-06-17 until 2012-03-26
##
## # Variables were mapped as follows:
## Case identifier: vehicleclass
## Activity identifier: activity
## Resource identifier: resource
## Activity instance identifier: activity_instance_id
## Timestamp: timestamp
## Lifecycle transition: lifecycle
##
## # A tibble: 34,724 × 18
## case_id activity lifecycle resource timestamp amount article
## <chr> <fct> <fct> <fct> <dttm> <chr> <dbl>
## 1 A1 Create Fine complete 561 2006-07-24 00:00:00 35.0 157
## 2 A1 Send Fine complete <NA> 2006-12-05 00:00:00 <NA> NA
## 3 A100 Create Fine complete 561 2006-08-02 00:00:00 35.0 157
## 4 A100 Send Fine complete <NA> 2006-12-12 00:00:00 <NA> NA
## 5 A100 Insert Fine No… complete <NA> 2007-01-15 00:00:00 <NA> NA
## 6 A100 Add penalty complete <NA> 2007-03-16 00:00:00 71.5 NA
## 7 A100 Send for Credi… complete <NA> 2009-03-30 00:00:00 <NA> NA
## 8 A10000 Create Fine complete 561 2007-03-09 00:00:00 36.0 157
## 9 A10000 Send Fine complete <NA> 2007-07-17 00:00:00 <NA> NA
## 10 A10000 Insert Fine No… complete <NA> 2007-08-02 00:00:00 <NA> NA
## # ℹ 34,714 more rows
## # ℹ 11 more variables: dismissal <chr>, expense <chr>, lastsent <chr>,
## # matricola <dbl>, notificationtype <chr>, paymentamount <dbl>, points <dbl>,
## # totalpaymentamount <chr>, vehicleclass <chr>, activity_instance_id <chr>,
## # .order <int>
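To see both uses of eventlog() on data small enough to check by hand, the following sketch first instantiates a log from a toy data frame and then remaps its case identifier. The data frame and all of its column names (order, desk, task, instance, status, time, worker) are hypothetical and are not part of bupaR or its example data.

```r
library(bupaR)

# Hypothetical toy data: three orders, handled at two desks.
toy <- data.frame(
  order    = c("o1", "o1", "o2", "o2", "o3", "o3"),
  desk     = c("d1", "d1", "d1", "d1", "d2", "d2"),
  task     = c("receive", "ship", "receive", "ship", "receive", "ship"),
  instance = c("i1", "i2", "i3", "i4", "i5", "i6"),
  status   = "complete",
  time     = as.POSIXct("2024-01-01 09:00:00", tz = "UTC") + 3600 * (0:5),
  worker   = c("ann", "bob", "ann", "bob", "ann", "bob"),
  stringsAsFactors = FALSE
)

# Instantiate: every identifier is specified explicitly.
orders_log <- eventlog(
  toy,
  case_id              = "order",
  activity_id          = "task",
  activity_instance_id = "instance",
  lifecycle_id         = "status",
  timestamp            = "time",
  resource_id          = "worker"
)

# Modify: pass the log itself and set only the identifier to change.
desks_log <- eventlog(orders_log, case_id = "desk")

n_cases(orders_log)  # one case per distinct order
n_cases(desks_log)   # one case per distinct desk
```

The second call leaves the other identifiers as they were, so only the grouping of events into cases changes.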
set_-functions
If we only want to change one of the elements, as in the example
above, the corresponding set_-function (here set_case_id()) provides a
very convenient way to do so. The same change as before can be done as
follows:
traffic_fines %>%
  set_case_id("vehicleclass")
## # Log of 34724 events consisting of:
## 4 traces
## 4 cases
## 34724 instances of 11 activities
## 16 resources
## Events occurred from 2006-06-17 until 2012-03-26
##
## # Variables were mapped as follows:
## Case identifier: vehicleclass
## Activity identifier: activity
## Resource identifier: resource
## Activity instance identifier: activity_instance_id
## Timestamp: timestamp
## Lifecycle transition: lifecycle
##
## # A tibble: 34,724 × 18
## case_id activity lifecycle resource timestamp amount article
## <chr> <fct> <fct> <fct> <dttm> <chr> <dbl>
## 1 A1 Create Fine complete 561 2006-07-24 00:00:00 35.0 157
## 2 A1 Send Fine complete <NA> 2006-12-05 00:00:00 <NA> NA
## 3 A100 Create Fine complete 561 2006-08-02 00:00:00 35.0 157
## 4 A100 Send Fine complete <NA> 2006-12-12 00:00:00 <NA> NA
## 5 A100 Insert Fine No… complete <NA> 2007-01-15 00:00:00 <NA> NA
## 6 A100 Add penalty complete <NA> 2007-03-16 00:00:00 71.5 NA
## 7 A100 Send for Credi… complete <NA> 2009-03-30 00:00:00 <NA> NA
## 8 A10000 Create Fine complete 561 2007-03-09 00:00:00 36.0 157
## 9 A10000 Send Fine complete <NA> 2007-07-17 00:00:00 <NA> NA
## 10 A10000 Insert Fine No… complete <NA> 2007-08-02 00:00:00 <NA> NA
## # ℹ 34,714 more rows
## # ℹ 11 more variables: dismissal <chr>, expense <chr>, lastsent <chr>,
## # matricola <dbl>, notificationtype <chr>, paymentamount <dbl>, points <dbl>,
## # totalpaymentamount <chr>, vehicleclass <chr>, activity_instance_id <chr>,
## # .order <int>
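The other identifiers have setters of the same shape. As a minimal sketch, the example below changes the resource identifier of a toy log with set_resource_id() and reads the mapped column names back with the case_id() and resource_id() accessors. The toy data and its column names are hypothetical.

```r
library(bupaR)

# Hypothetical toy data: two orders, with a worker and a desk per event.
toy <- data.frame(
  order    = c("o1", "o1", "o2", "o2"),
  desk     = c("d1", "d2", "d1", "d2"),
  task     = c("receive", "ship", "receive", "ship"),
  instance = c("i1", "i2", "i3", "i4"),
  status   = "complete",
  time     = as.POSIXct("2024-01-01 09:00:00", tz = "UTC") + 3600 * (0:3),
  worker   = c("ann", "bob", "ann", "bob"),
  stringsAsFactors = FALSE
)

log <- eventlog(
  toy,
  case_id              = "order",
  activity_id          = "task",
  activity_instance_id = "instance",
  lifecycle_id         = "status",
  timestamp            = "time",
  resource_id          = "worker"
)

# Treat the desk, rather than the worker, as the resource.
log_by_desk <- set_resource_id(log, "desk")

resource_id(log)          # name of the column mapped as resource before
resource_id(log_by_desk)  # name of the column mapped as resource after
case_id(log_by_desk)      # the case identifier is left untouched
```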
It is also possible to extract the mapping of a log at a certain point
in time using mapping().
mapping_fines <- mapping(traffic_fines)
mapping_fines
## Case identifier: case_id
## Activity identifier: activity
## Resource identifier: resource
## Activity instance identifier: activity_instance_id
## Timestamp: timestamp
## Lifecycle transition: lifecycle
We can adjust the mapping incrementally by using the approaches described above.
traffic_fines %>%
  set_case_id("vehicleclass") %>%
  set_activity_id("notificationtype") -> traffic_fines
Later, we can always undo these changes and “reset” the original
mapping using re_map().
traffic_fines %>%
  re_map(mapping_fines)
## # Log of 34724 events consisting of:
## 44 traces
## 10000 cases
## 34724 instances of 11 activities
## 16 resources
## Events occurred from 2006-06-17 until 2012-03-26
##
## # Variables were mapped as follows:
## Case identifier: case_id
## Activity identifier: activity
## Resource identifier: resource
## Activity instance identifier: activity_instance_id
## Timestamp: timestamp
## Lifecycle transition: lifecycle
##
## # A tibble: 34,724 × 18
## case_id activity lifecycle resource timestamp amount article
## <chr> <fct> <fct> <fct> <dttm> <chr> <dbl>
## 1 A1 Create Fine complete 561 2006-07-24 00:00:00 35.0 157
## 2 A1 Send Fine complete <NA> 2006-12-05 00:00:00 <NA> NA
## 3 A100 Create Fine complete 561 2006-08-02 00:00:00 35.0 157
## 4 A100 Send Fine complete <NA> 2006-12-12 00:00:00 <NA> NA
## 5 A100 Insert Fine No… complete <NA> 2007-01-15 00:00:00 <NA> NA
## 6 A100 Add penalty complete <NA> 2007-03-16 00:00:00 71.5 NA
## 7 A100 Send for Credi… complete <NA> 2009-03-30 00:00:00 <NA> NA
## 8 A10000 Create Fine complete 561 2007-03-09 00:00:00 36.0 157
## 9 A10000 Send Fine complete <NA> 2007-07-17 00:00:00 <NA> NA
## 10 A10000 Insert Fine No… complete <NA> 2007-08-02 00:00:00 <NA> NA
## # ℹ 34,714 more rows
## # ℹ 11 more variables: dismissal <chr>, expense <chr>, lastsent <chr>,
## # matricola <dbl>, notificationtype <chr>, paymentamount <dbl>, points <dbl>,
## # totalpaymentamount <chr>, vehicleclass <chr>, activity_instance_id <chr>,
## # .order <int>
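The save, change, and restore cycle can also be checked programmatically. The sketch below runs it on a toy log and confirms, through the case_id() and activity_id() accessors, that re_map() brings back the original identifiers. The toy data and its column names (including channel) are hypothetical.

```r
library(bupaR)

# Hypothetical toy data with alternative case and activity columns.
toy <- data.frame(
  order    = c("o1", "o1", "o2", "o2"),
  desk     = c("d1", "d1", "d2", "d2"),
  task     = c("receive", "ship", "receive", "ship"),
  channel  = c("web", "web", "phone", "phone"),
  instance = c("i1", "i2", "i3", "i4"),
  status   = "complete",
  time     = as.POSIXct("2024-01-01 09:00:00", tz = "UTC") + 3600 * (0:3),
  worker   = c("ann", "bob", "ann", "bob"),
  stringsAsFactors = FALSE
)

log <- eventlog(
  toy,
  case_id              = "order",
  activity_id          = "task",
  activity_instance_id = "instance",
  lifecycle_id         = "status",
  timestamp            = "time",
  resource_id          = "worker"
)

# 1. Save the mapping before touching anything.
original_mapping <- mapping(log)

# 2. Adjust the mapping incrementally.
changed <- set_case_id(log, "desk")
changed <- set_activity_id(changed, "channel")

# 3. Restore the saved mapping.
restored <- re_map(changed, original_mapping)

c(case_id(changed), activity_id(changed))    # identifiers after the changes
c(case_id(restored), activity_id(restored))  # identifiers after re_map()
```

Saving the mapping before the first change matters: mapping() reports the current state of the log, so calling it after the set_-functions would capture the modified mapping instead.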
Copyright © 2023 bupaR - Hasselt University