Detects missing values in an activity log at one of three levels of aggregation:
- overview: presents the absolute and relative number of missing values for each column
- column: presents the absolute and relative number of missing values for a particular column
- activity: presents the absolute and relative number of missing values for each column, aggregated by activity
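The three levels can be illustrated with a minimal base R sketch. This is not the package's implementation; `toy_log` and every variable name below are hypothetical, and the sketch only shows the kind of counts each level reports.

```r
# Hypothetical toy activity log (not the package's hospital_actlog data).
toy_log <- data.frame(
  activity   = c("Triage", "Triage", "Treatment", "Treatment"),
  originator = c("Nurse 1", NA, "Doctor 2", NA),
  triagecode = c(3, NA, 2, 4),
  stringsAsFactors = FALSE
)

# "overview": absolute and relative (percentage) missing values per column.
abs_missing <- colSums(is.na(toy_log))
rel_missing <- 100 * abs_missing / nrow(toy_log)

# "column": the same counts restricted to a single column.
col_abs <- sum(is.na(toy_log$triagecode))
col_rel <- 100 * col_abs / nrow(toy_log)

# "activity": missing values per column, aggregated by activity.
value_cols <- setdiff(names(toy_log), "activity")
by_activity <- aggregate(
  is.na(toy_log[value_cols]),
  by = list(activity = toy_log$activity),
  FUN = sum
)

print(abs_missing)
print(rel_missing)
print(by_activity)
```

Using `FUN = mean` instead of `FUN = sum` in the last step would give the share of missing values per activity rather than the count.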
detect_missing_values(activitylog, level_of_aggregation, column, details, filter_condition)
Argument | Description |
---|---|
activitylog | The activity log |
level_of_aggregation | Level of aggregation at which missing values are identified (either "overview", "column" or "activity") |
column | Name of the column to analyze when the level of aggregation is "column" |
details | Boolean indicating whether details of the results need to be shown |
filter_condition | Condition used to extract a subset of the activity log before the function is applied |
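The effect of filtering before counting can be sketched in base R. The format that `filter_condition` expects is not shown on this page, so the snippet below does not call the package at all; `toy_log` and the condition `activity == "Triage"` are hypothetical stand-ins.

```r
# Hypothetical toy activity log (not the package's data).
toy_log <- data.frame(
  activity   = c("Triage", "Triage", "Treatment", "Treatment"),
  originator = c("Nurse 1", NA, "Doctor 2", NA),
  triagecode = c(3, NA, 2, 4),
  stringsAsFactors = FALSE
)

# Step 1: keep only the rows that satisfy the (hypothetical) condition.
triage_only <- subset(toy_log, activity == "Triage")

# Step 2: count missing values on the subset only.
missing_in_subset <- colSums(is.na(triage_only))
print(missing_in_subset)
```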
An activity log containing the rows of the original activity log that contain a missing value.
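A rough base R equivalent of this return value, assuming a plain data frame rather than the package's activity log object, selects every row with at least one `NA`. The data and names below are hypothetical.

```r
# Hypothetical toy activity log (not the package's data).
toy_log <- data.frame(
  activity   = c("Triage", "Triage", "Treatment", "Treatment"),
  originator = c("Nurse 1", NA, "Doctor 2", NA),
  triagecode = c(3, NA, 2, 4),
  stringsAsFactors = FALSE
)

# complete.cases() is TRUE for rows without any NA, so negate it
# to keep the rows that contain at least one missing value.
rows_with_missing <- toy_log[!complete.cases(toy_log), ]
print(rows_with_missing)
```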
#> patient_visit_nr 0
#> activity         0
#> originator       2
#> start            1
#> complete         0
#> triagecode       1
#> specialization   0
#>
#> patient_visit_nr 0.000000
#> activity         0.000000
#> originator       3.773585
#> start            1.886792
#> complete         0.000000
#> triagecode       1.886792
#> specialization   0.000000
#>
#> # A tibble: 4 x 7
#>   patient_visit_nr activity originator start               complete
#>              <dbl> <chr>    <chr>      <dttm>              <dttm>
#> 1              510 Clinica~ Doctor 7   2017-11-20 11:35:01 2017-11-20 11:36:09
#> 2              533 0        NA         2017-11-22 18:35:00 2017-11-22 18:37:00
#> 3              534 Registr~ NA         2017-11-22 18:35:00 2017-11-22 18:37:00
#> 4              512 Clinica~ Doctor 7   NA                  2017-11-20 11:33:57
#> # ... with 2 more variables: triagecode <dbl>, specialization <chr>

detect_missing_values(activitylog = hospital_actlog, level_of_aggregation = "activity")
#> # A tibble: 9 x 7
#>   activity  patient_visit_nr originator start complete triagecode specialization
#>   <chr>                <int>      <int> <int>    <int>      <int>          <int>
#> 1 0                        0          1     0        0          0              0
#> 2 Clinical~                0          0     1        0          1              0
#> 3 registra~                0          0     0        0          0              0
#> 4 Registra~                0          1     0        0          0              0
#> 5 Trage                    0          0     0        0          0              0
#> 6 Treatment                0          0     0        0          0              0
#> 7 Treatmen~                0          0     0        0          0              0
#> 8 Triaga                   0          0     0        0          0              0
#> 9 Triage                   0          0     0        0          0              0
#>
#> # A tibble: 9 x 7
#>   activity  patient_visit_nr originator start complete triagecode specialization
#>   <chr>                <dbl>      <dbl> <dbl>    <dbl>      <dbl>          <dbl>
#> 1 0                        0     1      0            0      0                  0
#> 2 Clinical~                0     0      0.111        0      0.111              0
#> 3 registra~                0     0      0            0      0                  0
#> 4 Registra~                0     0.0714 0            0      0                  0
#> 5 Trage                    0     0      0            0      0                  0
#> 6 Treatment                0     0      0            0      0                  0
#> 7 Treatmen~                0     0      0            0      0                  0
#> 8 Triaga                   0     0      0            0      0                  0
#> 9 Triage                   0     0      0            0      0                  0
#>
#> # A tibble: 4 x 7
#>   patient_visit_nr activity originator start               complete
#>              <dbl> <chr>    <chr>      <dttm>              <dttm>
#> 1              510 Clinica~ Doctor 7   2017-11-20 11:35:01 2017-11-20 11:36:09
#> 2              533 0        NA         2017-11-22 18:35:00 2017-11-22 18:37:00
#> 3              534 Registr~ NA         2017-11-22 18:35:00 2017-11-22 18:37:00
#> 4              512 Clinica~ Doctor 7   NA                  2017-11-20 11:33:57
#> # ... with 2 more variables: triagecode <dbl>, specialization <chr>

detect_missing_values(activitylog = hospital_actlog, level_of_aggregation = "column", column = "triagecode")
#> # A tibble: 1 x 7
#>   patient_visit_nr activity originator start               complete
#>              <dbl> <chr>    <chr>      <dttm>              <dttm>
#> 1              510 Clinica~ Doctor 7   2017-11-20 11:35:01 2017-11-20 11:36:09
#> # ... with 2 more variables: triagecode <dbl>, specialization <chr>