Filters the log based on frequency of activities.
Usage
filter_activity_frequency(
log,
interval = NULL,
percentage = NULL,
reverse = FALSE,
eventlog = deprecated()
)
# S3 method for class 'log'
filter_activity_frequency(
log,
interval = NULL,
percentage = NULL,
reverse = FALSE,
eventlog = deprecated()
)
# S3 method for class 'grouped_log'
filter_activity_frequency(
log,
interval = NULL,
percentage = NULL,
reverse = FALSE,
eventlog = deprecated()
)
Arguments
- log
log: Object of class log or derivatives (grouped_log, eventlog, activitylog, etc.).
- percentage, interval
The target coverage of activity instances. Provide either percentage or interval.
percentage (numeric): A percentage of p will return the most common activity types of the log, which together account for at least p% of the activity instances.
interval (numeric vector of length 2): An activity frequency interval. A half-open interval can be created using NA.
For more information, see 'Details' below.
- reverse
logical (default FALSE): Indicates whether the selection should be reversed.
- eventlog
Deprecated; use log instead.
Value
When given an object of type log, it will return a filtered log.
When given an object of type grouped_log, the filter will be applied in a stratified way (i.e., separately for each group).
The returned log will be grouped on the same variables as the original log.
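The two return behaviours can be illustrated with a short usage sketch. It assumes the bupaR, edeaR and eventdataR packages are installed, uses the patients example log from eventdataR, and relies on group_by_resource() from bupaR to build the grouped log; the interval bound of 300 is an arbitrary value chosen for illustration.

```r
library(bupaR)
library(edeaR)
library(eventdataR)

# Ungrouped log: keep activities that occur at least 300 times overall.
frequent <- patients |>
  filter_activity_frequency(interval = c(300, NA))

# Reversed selection: keep the activities that fall outside the interval.
infrequent <- patients |>
  filter_activity_frequency(interval = c(300, NA), reverse = TRUE)

# Grouped log: frequencies are evaluated separately within each resource,
# and the result stays grouped on the same variable.
per_resource <- patients |>
  group_by_resource() |>
  filter_activity_frequency(interval = c(300, NA))
```

Comparing activities(frequent) with activities(infrequent) is a quick way to check which labels ended up on each side of the selection.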
Details
Filtering the log based on activity frequency can be done in two ways: using an interval of allowed frequencies, or by specifying a coverage percentage:
- percentage: When filtering using a percentage p%, the filter will return p% of the activity instances, starting from the activity labels with the highest frequency. The filter will retain additional activity labels as long as the number of activity instances does not exceed the percentage threshold.
- interval: When filtering using an interval, activity labels will be retained when their absolute frequency falls in this interval. The interval is specified using a numeric vector of length 2. Half-open intervals can be created by using NA, e.g., c(10, NA) will select activity labels which occur 10 times or more.
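The two selection rules can be sketched in base R on a toy frequency table. This is an illustration only, not the package's implementation: the data and the helpers select_by_interval() and select_by_percentage() are hypothetical, the upper interval bound is assumed to be inclusive, the percentage is assumed to be given as a fraction between 0 and 1, and the percentage rule follows the "at least p%" reading from the Arguments section.

```r
# Toy data: one entry per activity instance (hypothetical labels and counts).
instances <- c(rep("register", 50), rep("triage", 30),
               rep("x-ray", 15), rep("mri", 5))

# Absolute frequency per activity label, most frequent first.
freq <- sort(table(instances), decreasing = TRUE)

# Interval rule: keep labels whose absolute frequency lies within the bounds.
# An NA bound leaves that side of the interval open.
select_by_interval <- function(freq, interval) {
  counts <- as.vector(freq)
  lower <- if (is.na(interval[1])) -Inf else interval[1]
  upper <- if (is.na(interval[2])) Inf else interval[2]
  names(freq)[counts >= lower & counts <= upper]
}

# Percentage rule: walk down the labels by decreasing frequency and keep
# adding labels until the cumulative share of instances reaches p.
select_by_percentage <- function(freq, p) {
  share <- cumsum(as.vector(freq)) / sum(freq)
  covered_before <- c(0, head(share, -1))
  names(freq)[covered_before < p]
}

by_interval   <- select_by_interval(freq, c(10, NA))
by_percentage <- select_by_percentage(freq, 0.75)

# Reversing a selection keeps the labels that were not selected.
reversed <- setdiff(names(freq), by_interval)
```

With these counts, the interval c(10, NA) keeps "register", "triage" and "x-ray", the 0.75 coverage keeps "register" and "triage" (together 80% of the instances), and the reversed interval selection keeps only "mri".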
Methods (by class)
- filter_activity_frequency(log): Filters activities for a log.
- filter_activity_frequency(grouped_log): Filters activities for a grouped_log.
References
Swennen, M. (2018). Using Event Log Knowledge to Support Operational Excellence Techniques (Doctoral dissertation). Hasselt University.
See also
Other filters: filter_activity(), filter_activity_instance(), filter_activity_presence(), filter_case(), filter_case_condition(), filter_endpoints(), filter_endpoints_condition(), filter_flow_time(), filter_idle_time(), filter_infrequent_flows(), filter_lifecycle(), filter_lifecycle_presence(), filter_precedence(), filter_precedence_condition(), filter_precedence_resource(), filter_processing_time(), filter_resource(), filter_resource_frequency(), filter_throughput_time(), filter_time_period(), filter_trace(), filter_trace_frequency(), filter_trace_length(), filter_trim(), filter_trim_lifecycle()