Takes a dataframe with a phecode column and replaces it with the
hierarchical levels of the phecodes as columns code_l1, code_l2, and code_l3.
add_phecode_levels(phecode_data, phecode_column = phecode, remove_phecode_column = TRUE)
Argument | Description |
---|---|
phecode_data | Dataframe with a column encoding phecodes (make sure codes are normalized first) |
phecode_column | Unquoted name of the column containing phecodes |
remove_phecode_column | Should the original phecode column be removed from the data? Defaults to TRUE. |
Dataframe with phecode levels added as three integer columns: code_l1, code_l2, and code_l3.
patient_data <- dplyr::tribble(
  ~patient, ~code, ~counts,
  1, "250.23", 7,
  1, "250.25", 4,
  1, "696.42", 1,
  1, "555.21", 4,
  2, "401.22", 6,
  2, "204.21", 5,
  2, "735.22", 4,
  2, "751.11", 2,
)

# Default removes original phecode column
add_phecode_levels(patient_data, code)
#> # A tibble: 8 x 5
#>   patient counts code_l1 code_l2 code_l3
#>     <dbl>  <dbl>   <int>   <int>   <int>
#> 1       1      7     250       2       3
#> 2       1      4     250       2       5
#> 3       1      1     696       4       2
#> 4       1      4     555       2       1
#> 5       2      6     401       2       2
#> 6       2      5     204       2       1
#> 7       2      4     735       2       2
#> 8       2      2     751       1       1

# Can keep original column as well
add_phecode_levels(patient_data, code, remove_phecode_column = FALSE)
#> # A tibble: 8 x 6
#>   patient code   counts code_l1 code_l2 code_l3
#>     <dbl> <chr>   <dbl>   <int>   <int>   <int>
#> 1       1 250.23      7     250       2       3
#> 2       1 250.25      4     250       2       5
#> 3       1 696.42      1     696       4       2
#> 4       1 555.21      4     555       2       1
#> 5       2 401.22      6     401       2       2
#> 6       2 204.21      5     204       2       1
#> 7       2 735.22      4     735       2       2
#> 8       2 751.11      2     751       1       1
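
To make the level columns concrete, here is a rough base R sketch of how a normalized phecode such as "250.23" could be split into the three integer levels. The helper `split_phecode_levels()` is hypothetical and is not the package's implementation. In particular, treating a missing decimal digit as 0 is an assumption of this sketch, and `add_phecode_levels()` may handle such codes differently.

```r
# Hypothetical helper (not part of the package): split normalized phecodes
# like "250.23" into three integer levels.
split_phecode_levels <- function(codes) {
  codes <- as.character(codes)

  # Level 1: everything before the decimal point
  l1 <- as.integer(sub("\\..*$", "", codes))

  # Digits after the decimal point; empty string when there is no decimal part
  has_decimal <- grepl(".", codes, fixed = TRUE)
  decimals <- ifelse(has_decimal, sub("^[^.]*\\.", "", codes), "")

  # Extract the i-th decimal digit; a missing digit becomes 0 (assumption)
  digit_at <- function(x, i) {
    d <- substr(x, i, i)
    ifelse(d == "", 0L, as.integer(d))
  }

  data.frame(
    code_l1 = l1,
    code_l2 = digit_at(decimals, 1), # first decimal digit
    code_l3 = digit_at(decimals, 2)  # second decimal digit
  )
}

split_phecode_levels(c("250.23", "696.42", "751.11"))
```

The sketch works on a plain character vector rather than a dataframe column, so it only illustrates the splitting step, not the column selection or removal that `add_phecode_levels()` performs.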