Make a data.frame showing the outcome (k-score) of every possible de-identification choice, or of only those choices that meet a given k-score threshold.

deidentify_choices_table(
  data,
  date_cols = NULL,
  group_rare_values_cols,
  k_score_columns = NULL,
  preferred_k_score = NULL
)

Arguments

data

A data.frame with the data you want to de-identify.

date_cols

A vector of strings with the names of the date columns that you want to aggregate. If NULL, uses all date columns in the data.

group_rare_values_cols

A string or vector of strings naming the columns in which you want rare values (those below k%, where k is 1-99) turned into NA.

k_score_columns

A string or vector of strings naming the columns to generate the k-score from. If NULL (default), uses the columns supplied in date_cols and group_rare_values_cols. Note that if you pass columns to those parameters but leave them out of k_score_columns, de-identifying those columns won't affect the k-score.

preferred_k_score

A number or vector of numbers that sets the minimum (and, if a vector, the maximum) k-score you want from the possible choices.
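
The arguments above can be illustrated with a rough base R sketch. It is not the package's implementation. It assumes that a "rare" value is one whose share of the column is strictly below k percent, and that a k-score is the number of rows sharing one combination of values in k_score_columns. The helper names rare_to_na() and k_score_summary() are hypothetical.

```r
# Hypothetical helper: set values whose share of the column is below k percent to NA.
# Assumption: "below" is a strict comparison.
rare_to_na <- function(x, k) {
  shares <- prop.table(table(x)) * 100   # percent of non-missing rows per value
  rare <- names(shares)[shares < k]
  x[as.character(x) %in% rare] <- NA
  x
}

# Hypothetical helper: summarise the sizes of the groups of rows that share
# the same values in `cols` (NA counts as its own value here).
k_score_summary <- function(data, cols) {
  key <- do.call(paste, c(data[cols], sep = "|"))
  sizes <- as.vector(table(key))
  c(min = min(sizes), mean = mean(sizes), median = median(sizes), max = max(sizes))
}

cols <- c("mpg", "vs")

# De-identify a copy of mtcars with a rare-values limit of 7 percent.
deidentified <- mtcars
deidentified[cols] <- lapply(deidentified[cols], rare_to_na, k = 7)

raw_summary <- k_score_summary(mtcars, cols)
limit_7_summary <- k_score_summary(deidentified, cols)
print(raw_summary)
print(limit_7_summary)
```

Under these assumptions the two summaries agree with the rows for rare_values_limit 1 and 7 in the first example's output: no mpg value reaches 7% of the 32 rows, so every mpg value becomes NA and the remaining groups are the two values of vs.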

Value

Returns a data.frame in which each row is one possible set of choices for the deidentify_data() function, along with summary statistics (minimum, mean, median, and maximum) for the k-score that those choices produce. If preferred_k_score is set, returns only the choices that meet it. If no choices meet it, returns an empty data.frame.
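
As a rough illustration of how preferred_k_score narrows the returned table, the sketch below filters a hand-built data.frame. It assumes the filter is applied to min_k_score, which is consistent with the second example, where preferred_k_score = 5:15 keeps rows whose min_k_score is 14 even though their max_k_score is 18. The choices table and the filter_choices() helper are hypothetical; the values are copied from a few rows of the first example's output.

```r
# A few rows copied from the first example's output (limit and min_k_score only).
choices <- data.frame(
  rare_values_limit = c(3, 6, 7, 56, 57),
  min_k_score       = c(1, 1, 14, 14, 32)
)

# Hypothetical helper: keep choices whose min_k_score is at least the smallest
# preferred value and, when a vector is given, at most the largest one.
filter_choices <- function(choices, preferred_k_score) {
  keep <- choices$min_k_score >= min(preferred_k_score)
  if (length(preferred_k_score) > 1) {
    keep <- keep & choices$min_k_score <= max(preferred_k_score)
  }
  choices[keep, , drop = FALSE]
}

in_range <- filter_choices(choices, 5:15)  # keeps limits 7 and 56
at_least <- filter_choices(choices, 5)     # keeps limits 7, 56 and 57
none     <- filter_choices(choices, 100)   # zero-row data.frame
```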

Examples

deidentify_choices_table(mtcars, group_rare_values_cols = c("mpg", "vs"), k_score_columns = c("mpg", "vs"))
#>    date_aggregation date_columns rare_values_limit rare_values_columns
#> 1              <NA>         <NA>                 1             mpg, vs
#> 2              <NA>         <NA>                 2             mpg, vs
#> 3              <NA>         <NA>                 3             mpg, vs
#> 4              <NA>         <NA>                 4             mpg, vs
#> 5              <NA>         <NA>                 5             mpg, vs
#> 6              <NA>         <NA>                 6             mpg, vs
#> 7              <NA>         <NA>                 7             mpg, vs
#> 8              <NA>         <NA>                 8             mpg, vs
#> 9              <NA>         <NA>                 9             mpg, vs
#> 10             <NA>         <NA>                10             mpg, vs
#> 11             <NA>         <NA>                11             mpg, vs
#> 12             <NA>         <NA>                12             mpg, vs
#> 13             <NA>         <NA>                13             mpg, vs
#> 14             <NA>         <NA>                14             mpg, vs
#> 15             <NA>         <NA>                15             mpg, vs
#> 16             <NA>         <NA>                16             mpg, vs
#> 17             <NA>         <NA>                17             mpg, vs
#> 18             <NA>         <NA>                18             mpg, vs
#> 19             <NA>         <NA>                19             mpg, vs
#> 20             <NA>         <NA>                20             mpg, vs
#> 21             <NA>         <NA>                21             mpg, vs
#> 22             <NA>         <NA>                22             mpg, vs
#> 23             <NA>         <NA>                23             mpg, vs
#> 24             <NA>         <NA>                24             mpg, vs
#> 25             <NA>         <NA>                25             mpg, vs
#> 26             <NA>         <NA>                26             mpg, vs
#> 27             <NA>         <NA>                27             mpg, vs
#> 28             <NA>         <NA>                28             mpg, vs
#> 29             <NA>         <NA>                29             mpg, vs
#> 30             <NA>         <NA>                30             mpg, vs
#> 31             <NA>         <NA>                31             mpg, vs
#> 32             <NA>         <NA>                32             mpg, vs
#> 33             <NA>         <NA>                33             mpg, vs
#> 34             <NA>         <NA>                34             mpg, vs
#> 35             <NA>         <NA>                35             mpg, vs
#> 36             <NA>         <NA>                36             mpg, vs
#> 37             <NA>         <NA>                37             mpg, vs
#> 38             <NA>         <NA>                38             mpg, vs
#> 39             <NA>         <NA>                39             mpg, vs
#> 40             <NA>         <NA>                40             mpg, vs
#> 41             <NA>         <NA>                41             mpg, vs
#> 42             <NA>         <NA>                42             mpg, vs
#> 43             <NA>         <NA>                43             mpg, vs
#> 44             <NA>         <NA>                44             mpg, vs
#> 45             <NA>         <NA>                45             mpg, vs
#> 46             <NA>         <NA>                46             mpg, vs
#> 47             <NA>         <NA>                47             mpg, vs
#> 48             <NA>         <NA>                48             mpg, vs
#> 49             <NA>         <NA>                49             mpg, vs
#> 50             <NA>         <NA>                50             mpg, vs
#> 51             <NA>         <NA>                51             mpg, vs
#> 52             <NA>         <NA>                52             mpg, vs
#> 53             <NA>         <NA>                53             mpg, vs
#> 54             <NA>         <NA>                54             mpg, vs
#> 55             <NA>         <NA>                55             mpg, vs
#> 56             <NA>         <NA>                56             mpg, vs
#> 57             <NA>         <NA>                57             mpg, vs
#> 58             <NA>         <NA>                58             mpg, vs
#> 59             <NA>         <NA>                59             mpg, vs
#> 60             <NA>         <NA>                60             mpg, vs
#> 61             <NA>         <NA>                61             mpg, vs
#> 62             <NA>         <NA>                62             mpg, vs
#> 63             <NA>         <NA>                63             mpg, vs
#> 64             <NA>         <NA>                64             mpg, vs
#> 65             <NA>         <NA>                65             mpg, vs
#> 66             <NA>         <NA>                66             mpg, vs
#> 67             <NA>         <NA>                67             mpg, vs
#> 68             <NA>         <NA>                68             mpg, vs
#> 69             <NA>         <NA>                69             mpg, vs
#> 70             <NA>         <NA>                70             mpg, vs
#> 71             <NA>         <NA>                71             mpg, vs
#> 72             <NA>         <NA>                72             mpg, vs
#> 73             <NA>         <NA>                73             mpg, vs
#> 74             <NA>         <NA>                74             mpg, vs
#> 75             <NA>         <NA>                75             mpg, vs
#> 76             <NA>         <NA>                76             mpg, vs
#> 77             <NA>         <NA>                77             mpg, vs
#> 78             <NA>         <NA>                78             mpg, vs
#> 79             <NA>         <NA>                79             mpg, vs
#> 80             <NA>         <NA>                80             mpg, vs
#> 81             <NA>         <NA>                81             mpg, vs
#> 82             <NA>         <NA>                82             mpg, vs
#> 83             <NA>         <NA>                83             mpg, vs
#> 84             <NA>         <NA>                84             mpg, vs
#> 85             <NA>         <NA>                85             mpg, vs
#> 86             <NA>         <NA>                86             mpg, vs
#> 87             <NA>         <NA>                87             mpg, vs
#> 88             <NA>         <NA>                88             mpg, vs
#> 89             <NA>         <NA>                89             mpg, vs
#> 90             <NA>         <NA>                90             mpg, vs
#> 91             <NA>         <NA>                91             mpg, vs
#> 92             <NA>         <NA>                92             mpg, vs
#> 93             <NA>         <NA>                93             mpg, vs
#> 94             <NA>         <NA>                94             mpg, vs
#> 95             <NA>         <NA>                95             mpg, vs
#> 96             <NA>         <NA>                96             mpg, vs
#> 97             <NA>         <NA>                97             mpg, vs
#> 98             <NA>         <NA>                98             mpg, vs
#> 99             <NA>         <NA>                99             mpg, vs
#>    k_score_columns min_k_score mean_k_score median_k_score max_k_score
#> 1          mpg, vs           1     1.230769              1           2
#> 2          mpg, vs           1     1.230769              1           2
#> 3          mpg, vs           1     1.230769              1           2
#> 4          mpg, vs           1     3.200000              2          11
#> 5          mpg, vs           1     3.200000              2          11
#> 6          mpg, vs           1     3.200000              2          11
#> 7          mpg, vs          14    16.000000             16          18
#> 8          mpg, vs          14    16.000000             16          18
#> 9          mpg, vs          14    16.000000             16          18
#> 10         mpg, vs          14    16.000000             16          18
#> 11         mpg, vs          14    16.000000             16          18
#> 12         mpg, vs          14    16.000000             16          18
#> 13         mpg, vs          14    16.000000             16          18
#> 14         mpg, vs          14    16.000000             16          18
#> 15         mpg, vs          14    16.000000             16          18
#> 16         mpg, vs          14    16.000000             16          18
#> 17         mpg, vs          14    16.000000             16          18
#> 18         mpg, vs          14    16.000000             16          18
#> 19         mpg, vs          14    16.000000             16          18
#> 20         mpg, vs          14    16.000000             16          18
#> 21         mpg, vs          14    16.000000             16          18
#> 22         mpg, vs          14    16.000000             16          18
#> 23         mpg, vs          14    16.000000             16          18
#> 24         mpg, vs          14    16.000000             16          18
#> 25         mpg, vs          14    16.000000             16          18
#> 26         mpg, vs          14    16.000000             16          18
#> 27         mpg, vs          14    16.000000             16          18
#> 28         mpg, vs          14    16.000000             16          18
#> 29         mpg, vs          14    16.000000             16          18
#> 30         mpg, vs          14    16.000000             16          18
#> 31         mpg, vs          14    16.000000             16          18
#> 32         mpg, vs          14    16.000000             16          18
#> 33         mpg, vs          14    16.000000             16          18
#> 34         mpg, vs          14    16.000000             16          18
#> 35         mpg, vs          14    16.000000             16          18
#> 36         mpg, vs          14    16.000000             16          18
#> 37         mpg, vs          14    16.000000             16          18
#> 38         mpg, vs          14    16.000000             16          18
#> 39         mpg, vs          14    16.000000             16          18
#> 40         mpg, vs          14    16.000000             16          18
#> 41         mpg, vs          14    16.000000             16          18
#> 42         mpg, vs          14    16.000000             16          18
#> 43         mpg, vs          14    16.000000             16          18
#> 44         mpg, vs          14    16.000000             16          18
#> 45         mpg, vs          14    16.000000             16          18
#> 46         mpg, vs          14    16.000000             16          18
#> 47         mpg, vs          14    16.000000             16          18
#> 48         mpg, vs          14    16.000000             16          18
#> 49         mpg, vs          14    16.000000             16          18
#> 50         mpg, vs          14    16.000000             16          18
#> 51         mpg, vs          14    16.000000             16          18
#> 52         mpg, vs          14    16.000000             16          18
#> 53         mpg, vs          14    16.000000             16          18
#> 54         mpg, vs          14    16.000000             16          18
#> 55         mpg, vs          14    16.000000             16          18
#> 56         mpg, vs          14    16.000000             16          18
#> 57         mpg, vs          32    32.000000             32          32
#> 58         mpg, vs          32    32.000000             32          32
#> 59         mpg, vs          32    32.000000             32          32
#> 60         mpg, vs          32    32.000000             32          32
#> 61         mpg, vs          32    32.000000             32          32
#> 62         mpg, vs          32    32.000000             32          32
#> 63         mpg, vs          32    32.000000             32          32
#> 64         mpg, vs          32    32.000000             32          32
#> 65         mpg, vs          32    32.000000             32          32
#> 66         mpg, vs          32    32.000000             32          32
#> 67         mpg, vs          32    32.000000             32          32
#> 68         mpg, vs          32    32.000000             32          32
#> 69         mpg, vs          32    32.000000             32          32
#> 70         mpg, vs          32    32.000000             32          32
#> 71         mpg, vs          32    32.000000             32          32
#> 72         mpg, vs          32    32.000000             32          32
#> 73         mpg, vs          32    32.000000             32          32
#> 74         mpg, vs          32    32.000000             32          32
#> 75         mpg, vs          32    32.000000             32          32
#> 76         mpg, vs          32    32.000000             32          32
#> 77         mpg, vs          32    32.000000             32          32
#> 78         mpg, vs          32    32.000000             32          32
#> 79         mpg, vs          32    32.000000             32          32
#> 80         mpg, vs          32    32.000000             32          32
#> 81         mpg, vs          32    32.000000             32          32
#> 82         mpg, vs          32    32.000000             32          32
#> 83         mpg, vs          32    32.000000             32          32
#> 84         mpg, vs          32    32.000000             32          32
#> 85         mpg, vs          32    32.000000             32          32
#> 86         mpg, vs          32    32.000000             32          32
#> 87         mpg, vs          32    32.000000             32          32
#> 88         mpg, vs          32    32.000000             32          32
#> 89         mpg, vs          32    32.000000             32          32
#> 90         mpg, vs          32    32.000000             32          32
#> 91         mpg, vs          32    32.000000             32          32
#> 92         mpg, vs          32    32.000000             32          32
#> 93         mpg, vs          32    32.000000             32          32
#> 94         mpg, vs          32    32.000000             32          32
#> 95         mpg, vs          32    32.000000             32          32
#> 96         mpg, vs          32    32.000000             32          32
#> 97         mpg, vs          32    32.000000             32          32
#> 98         mpg, vs          32    32.000000             32          32
#> 99         mpg, vs          32    32.000000             32          32
deidentify_choices_table(mtcars, group_rare_values_cols = c("mpg", "vs"), preferred_k_score = 5:15)
#>    date_aggregation date_columns rare_values_limit rare_values_columns
#> 1              <NA>         <NA>                 7             mpg, vs
#> 2              <NA>         <NA>                 8             mpg, vs
#> 3              <NA>         <NA>                 9             mpg, vs
#> 4              <NA>         <NA>                10             mpg, vs
#> 5              <NA>         <NA>                11             mpg, vs
#> 6              <NA>         <NA>                12             mpg, vs
#> 7              <NA>         <NA>                13             mpg, vs
#> 8              <NA>         <NA>                14             mpg, vs
#> 9              <NA>         <NA>                15             mpg, vs
#> 10             <NA>         <NA>                16             mpg, vs
#> 11             <NA>         <NA>                17             mpg, vs
#> 12             <NA>         <NA>                18             mpg, vs
#> 13             <NA>         <NA>                19             mpg, vs
#> 14             <NA>         <NA>                20             mpg, vs
#> 15             <NA>         <NA>                21             mpg, vs
#> 16             <NA>         <NA>                22             mpg, vs
#> 17             <NA>         <NA>                23             mpg, vs
#> 18             <NA>         <NA>                24             mpg, vs
#> 19             <NA>         <NA>                25             mpg, vs
#> 20             <NA>         <NA>                26             mpg, vs
#> 21             <NA>         <NA>                27             mpg, vs
#> 22             <NA>         <NA>                28             mpg, vs
#> 23             <NA>         <NA>                29             mpg, vs
#> 24             <NA>         <NA>                30             mpg, vs
#> 25             <NA>         <NA>                31             mpg, vs
#> 26             <NA>         <NA>                32             mpg, vs
#> 27             <NA>         <NA>                33             mpg, vs
#> 28             <NA>         <NA>                34             mpg, vs
#> 29             <NA>         <NA>                35             mpg, vs
#> 30             <NA>         <NA>                36             mpg, vs
#> 31             <NA>         <NA>                37             mpg, vs
#> 32             <NA>         <NA>                38             mpg, vs
#> 33             <NA>         <NA>                39             mpg, vs
#> 34             <NA>         <NA>                40             mpg, vs
#> 35             <NA>         <NA>                41             mpg, vs
#> 36             <NA>         <NA>                42             mpg, vs
#> 37             <NA>         <NA>                43             mpg, vs
#> 38             <NA>         <NA>                44             mpg, vs
#> 39             <NA>         <NA>                45             mpg, vs
#> 40             <NA>         <NA>                46             mpg, vs
#> 41             <NA>         <NA>                47             mpg, vs
#> 42             <NA>         <NA>                48             mpg, vs
#> 43             <NA>         <NA>                49             mpg, vs
#> 44             <NA>         <NA>                50             mpg, vs
#> 45             <NA>         <NA>                51             mpg, vs
#> 46             <NA>         <NA>                52             mpg, vs
#> 47             <NA>         <NA>                53             mpg, vs
#> 48             <NA>         <NA>                54             mpg, vs
#> 49             <NA>         <NA>                55             mpg, vs
#> 50             <NA>         <NA>                56             mpg, vs
#>    k_score_columns min_k_score mean_k_score median_k_score max_k_score
#> 1          mpg, vs          14           16             16          18
#> 2          mpg, vs          14           16             16          18
#> 3          mpg, vs          14           16             16          18
#> 4          mpg, vs          14           16             16          18
#> 5          mpg, vs          14           16             16          18
#> 6          mpg, vs          14           16             16          18
#> 7          mpg, vs          14           16             16          18
#> 8          mpg, vs          14           16             16          18
#> 9          mpg, vs          14           16             16          18
#> 10         mpg, vs          14           16             16          18
#> 11         mpg, vs          14           16             16          18
#> 12         mpg, vs          14           16             16          18
#> 13         mpg, vs          14           16             16          18
#> 14         mpg, vs          14           16             16          18
#> 15         mpg, vs          14           16             16          18
#> 16         mpg, vs          14           16             16          18
#> 17         mpg, vs          14           16             16          18
#> 18         mpg, vs          14           16             16          18
#> 19         mpg, vs          14           16             16          18
#> 20         mpg, vs          14           16             16          18
#> 21         mpg, vs          14           16             16          18
#> 22         mpg, vs          14           16             16          18
#> 23         mpg, vs          14           16             16          18
#> 24         mpg, vs          14           16             16          18
#> 25         mpg, vs          14           16             16          18
#> 26         mpg, vs          14           16             16          18
#> 27         mpg, vs          14           16             16          18
#> 28         mpg, vs          14           16             16          18
#> 29         mpg, vs          14           16             16          18
#> 30         mpg, vs          14           16             16          18
#> 31         mpg, vs          14           16             16          18
#> 32         mpg, vs          14           16             16          18
#> 33         mpg, vs          14           16             16          18
#> 34         mpg, vs          14           16             16          18
#> 35         mpg, vs          14           16             16          18
#> 36         mpg, vs          14           16             16          18
#> 37         mpg, vs          14           16             16          18
#> 38         mpg, vs          14           16             16          18
#> 39         mpg, vs          14           16             16          18
#> 40         mpg, vs          14           16             16          18
#> 41         mpg, vs          14           16             16          18
#> 42         mpg, vs          14           16             16          18
#> 43         mpg, vs          14           16             16          18
#> 44         mpg, vs          14           16             16          18
#> 45         mpg, vs          14           16             16          18
#> 46         mpg, vs          14           16             16          18
#> 47         mpg, vs          14           16             16          18
#> 48         mpg, vs          14           16             16          18
#> 49         mpg, vs          14           16             16          18
#> 50         mpg, vs          14           16             16          18
if (FALSE) {
deidentify_choices_table(deidentify::initiations,
  date_cols = c("arrest_date", "felony_review_date"),
  group_rare_values_cols = c("race", "primary_charge_flag"),
  k_score_columns = c("primary_charge_flag", "gender", "race",
                      "arrest_date", "felony_review_date"))
}