Calculates a summary statistic (mean, median, vvconverter::mode, min, max, etc.) per group and uses it to fill missing values. Note: this function takes and returns a tibble rather than a vector.

fill_df_with_agg_by_group(
  df,
  group,
  columns,
  overwrite_col = FALSE,
  statistic = mean,
  fill_empty_group = FALSE
)

Arguments

df

tibble to use

group

string or vector of strings: columns to group by

columns

string or vector of strings: columns to impute

overwrite_col

boolean: whether to overwrite the original column. If FALSE, a new column with the suffix _imputed is created instead

statistic

function: summary statistic to use (mean, median, min, etc.). Currently this must be a function that accepts an na.rm argument

fill_empty_group

boolean: if TRUE, groups that contain only NA values are filled with the summary statistic of the entire column
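
A custom statistic that lacks an na.rm argument can be wrapped so that it meets the requirement on statistic. The following is a minimal sketch, assuming the statistic is invoked as statistic(x, na.rm = TRUE); the source only states that an na.rm argument is required. The names with_na_rm and midrange are hypothetical and not part of the package.

```r
# Hypothetical wrapper (not part of the package): returns a version of `f`
# that accepts na.rm and drops missing values itself before calling `f`.
with_na_rm <- function(f) {
  function(x, na.rm = FALSE, ...) {
    if (na.rm) x <- x[!is.na(x)]
    f(x, ...)
  }
}

# Hypothetical statistic without an na.rm argument: midpoint of the range
midrange <- function(x) (min(x) + max(x)) / 2

midrange_na_rm <- with_na_rm(midrange)

midrange(c(2, NA, 6))                      # NA, because the NA propagates
midrange_na_rm(c(2, NA, 6), na.rm = TRUE)  # 4
```

The wrapped function could then presumably be passed as statistic = midrange_na_rm.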

Value

A tibble with the filled column(s)
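
Examples

The behaviour described by the arguments can be illustrated with a small dplyr sketch. This is not the package's implementation: sketch_fill_by_group is a hypothetical stand-in that mimics the documented semantics, and the scores tibble is invented for illustration. It assumes the statistic is called with na.rm = TRUE and that the group and column names do not clash with the helper's local variable names.

```r
library(dplyr)

# Hypothetical helper, NOT the package's implementation: it only mimics the
# behaviour described in the Arguments section.
sketch_fill_by_group <- function(df, group, columns,
                                 overwrite_col = FALSE,
                                 statistic = mean,
                                 fill_empty_group = FALSE) {
  for (col_name in columns) {
    # Write into the original column or into a new <column>_imputed column
    target <- if (overwrite_col) col_name else paste0(col_name, "_imputed")
    # Statistic over the entire column, used only for groups that are all NA
    overall <- statistic(df[[col_name]], na.rm = TRUE)
    df <- df %>%
      group_by(across(all_of(group))) %>%
      mutate(
        !!target := {
          x <- .data[[col_name]]
          if (all(is.na(x))) {
            if (fill_empty_group) rep(overall, length(x)) else x
          } else {
            replace(x, is.na(x), statistic(x, na.rm = TRUE))
          }
        }
      ) %>%
      ungroup()
  }
  df
}

# Invented example data: class "c" contains only missing scores
scores <- tibble(
  class = c("a", "a", "a", "b", "b", "c"),
  score = c(1, NA, 3, 8, NA, NA)
)

result <- sketch_fill_by_group(
  scores,
  group = "class",
  columns = "score",
  fill_empty_group = TRUE
)
result$score_imputed
# 1 2 3 8 8 4
```

Class "a" is filled with its group mean (2), class "b" with its group mean (8), and class "c", which has no observed values, with the mean of the entire column (4). With fill_empty_group = FALSE the last value stays NA. Going by the usage above, the corresponding call to the documented function would be fill_df_with_agg_by_group(scores, group = "class", columns = "score", fill_empty_group = TRUE).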