You are making a table for publication, and you want to add background colors in a column based on the cell values.

Step 1 - Start with a data frame that contains the values that you want to display in the arrangement that you want to display them.

Step 2 - Use gt::gt() to turn your data frame into an editable, professional-looking table.

Step 3 - Use gt::data_color() to add background colors to the cells of a column. data_color() will assign colors based on the values of the cells, which creates a visual mapping between color and value.

Step 4 - Specify the columns to color. Set the columns argument of data_color() to a column name or a vector of column names. The names do not need quotation marks.

Step 5 - Set the colors argument to one of the scales functions listed below. Each maps data to colors in a different way. In recent gt releases this argument is named fn, and colors is deprecated.

Step 6 - Choose a color palette for your scales function from the RColorBrewer or viridis packages. You can also supply a vector of color names from the built-in colors() function.

data %>% 
  gt() %>% 
  data_color(
    columns = c(col_A, col_B),
    colors = scales::col_numeric(
      palette = "YlGn",
      domain = NULL
    )
  )

Color mapping functions

To match values of your data with colors, use the color mapping functions from the scales package.

Function                 Purpose
scales::col_numeric()    Map continuous data to a continuous gradient of colors
scales::col_bin()        Split continuous data into a specified number of bins, with one color per bin
scales::col_quantile()   Split continuous data into quantiles, with one color per quantile
scales::col_factor()     Map each level of a factor to a color
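
To see what these functions do, it can help to call them outside of a table. The sketch below assumes the scales package is installed and uses made-up values (x and f are hypothetical, not part of the recipe). Each col_*() call returns a function, and applying that function to data returns one hex color code per value.

```r
library(scales)

# Hypothetical values, used only for illustration
x <- c(1, 5, 10, 50, 100)

# Continuous gradient: domain = NULL means the range is taken from x itself
num_colors <- col_numeric(palette = "Blues", domain = NULL)(x)

# Three bins with explicit break points: [0, 10), [10, 50), [50, 100]
bin_colors <- col_bin(palette = "Blues", domain = NULL, bins = c(0, 10, 50, 100))(x)

# Two quantile groups, split at the median of x
quant_colors <- col_quantile(palette = "Blues", domain = x, n = 2)(x)

# One color per factor level
f <- factor(c("low", "high", "low"))
fct_colors <- col_factor(palette = "Set1", domain = NULL)(f)
```

Values that fall in the same bin or share a factor level receive the same color, so bin_colors[1] and bin_colors[2] match, as do fct_colors[1] and fct_colors[3].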

Color palettes

Run RColorBrewer::display.brewer.all() or ?viridis::viridis_pal() to see the names of the color palettes provided by these packages.
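
As a rough sketch, you can also inspect the colors behind a palette name directly, or build a gradient from names returned by colors(). This assumes the RColorBrewer, viridis, and scales packages are installed. The object names (blues5, viridis5, custom) and the "white" to "steelblue" gradient are illustrative choices, not part of the recipe.

```r
library(scales)

# The five-class version of the "Blues" palette from RColorBrewer
blues5 <- RColorBrewer::brewer.pal(n = 5, name = "Blues")

# Five colors sampled from the default viridis palette
viridis5 <- viridis::viridis(n = 5)

# A custom gradient built from two names found in colors()
custom <- col_numeric(palette = c("white", "steelblue"), domain = c(0, 100))
custom_colors <- custom(c(0, 50, 100))
```

Any of these can be passed as the palette argument of a scales function: a palette name such as "Blues", or a character vector of colors such as blues5.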

Example

northeast contains the number of COVID-19 cases reported by several states in the northeastern US in March of 2021.

northeast
## # A tibble: 9 × 2
##   state total_cases
##   <fct>       <dbl>
## 1 VT            885
## 2 ME           1160
## 3 NH           1437
## 4 RI           2345
## 5 CT           5384
## 6 MA          10208
## 7 PA          17001
## 8 NJ          23253
## 9 NY          50724
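
If you want to follow along, one way to rebuild this data frame from the printout is sketched below. It assumes the tibble package is installed. The helper vector states is hypothetical, and the ordering of the factor levels is an assumption, because the printout does not show it.

```r
library(tibble)

# State abbreviations in the order they appear in the printout
states <- c("VT", "ME", "NH", "RI", "CT", "MA", "PA", "NJ", "NY")

# Rebuild the example data; factor levels follow row order (an assumption)
northeast <- tibble(
  state = factor(states, levels = states),
  total_cases = c(885, 1160, 1437, 2345, 5384, 10208, 17001, 23253, 50724)
)
```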

Before we publish the table, we’d like to color the total_cases column based on the number of total cases, where darker shades of blue indicate a higher number of COVID-19 cases. This will make the variation between the states easy to spot at a glance.

To do this, we load the gt package and convert our data frame to a gt table. Then we use data_color() to color the column with col_numeric(), which matches the data values to a continuous gradient of blues from RColorBrewer.

library(gt)
northeast %>% 
  gt() %>% 
  data_color(
    columns = total_cases,
    colors = scales::col_numeric(
      palette = "Blues",
      domain = NULL
    ) 
  )
state total_cases
VT 885
ME 1160
NH 1437
RI 2345
CT 5384
MA 10208
PA 17001
NJ 23253
NY 50724

Add cell colors based on cell values in SAS

data_color() is similar to setting SAS’s STYLE = {BACKGROUNDCOLOR = } option to a custom format, which assigns a color to each value of a column based on specified cutoffs.

In gt, to color cells according to cutoff values of a numeric variable, call gt::tab_style() once for each range of values in the column.

In SAS:

PROC FORMAT;
  VALUE color_format LOW -< 0 = 'RED'
                     0 -< 100 = 'YELLOW'
                     100 - HIGH = 'GREEN';
RUN;
PROC PRINT DATA = data;
  VAR col_A col_B;
  VAR col_C / STYLE = {BACKGROUNDCOLOR = color_format.};
RUN;

In R:

library(gt)

data %>% 
  gt() %>% 
  # columns picks the column to fill; rows picks the cells that meet the cutoff
  tab_style(
    style = cell_fill(color = "red"),
    locations = cells_body(columns = col_C, rows = col_C < 0)
  ) %>% 
  tab_style(
    style = cell_fill(color = "yellow"),
    locations = cells_body(columns = col_C, rows = col_C >= 0 & col_C < 100)
  ) %>%
  tab_style(
    style = cell_fill(color = "green"),
    locations = cells_body(columns = col_C, rows = col_C >= 100)
  )
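
To check that these cutoffs match the SAS format at the boundary values, the same ranges can be reproduced with base R's cut(). This is only a sanity check on the interval logic and is not part of the gt workflow. The values in col_C and the name fill are made up for illustration.

```r
# Hypothetical values that include both boundary points (0 and 100)
col_C <- c(-5, 0, 50, 100, 250)

# right = FALSE closes each interval on the left:
# [-Inf, 0), [0, 100), [100, Inf)
fill <- cut(
  col_C,
  breaks = c(-Inf, 0, 100, Inf),
  labels = c("red", "yellow", "green"),
  right = FALSE
)

as.character(fill)
# "red" "yellow" "yellow" "green" "green"
```

As in the SAS ranges LOW -< 0 and 0 -< 100, a value of exactly 0 is yellow and a value of exactly 100 is green.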