You are making a table for publication, and you want to add background colors in a column based on the cell values.
Step 1 - Start with a data frame that contains the values that you want to display in the arrangement that you want to display them.
Step 2 - Use gt::gt() to turn your data
frame into an editable, professional-looking table.
Step 3 - Use gt::data_color() to add background
colors to the cells of a column. data_color() will
assign colors based on the values of the cells, which creates a visual
mapping between color and value.
Step 4 - Specify the columns to color.
Set the columns argument of data_color() to a
column name, or a vector of column names. Do not use quotation
marks.
Step 5 - Set colors to a scales
function listed below. Each maps data to colors in a different
way.
Step 6 - Choose a color palette for your scales
function from the RColorBrewer or viridis packages. You can also provide
a vector of built-in colors() to use.
data %>%
  gt() %>%
  data_color(
    columns = c(col_A, col_B),
    colors = scales::col_numeric(
      palette = "YlGn",
      domain = NULL
    )
  )
Color mapping functions
To match values of your data with colors, use the color mapping functions from the scales package.
| Function | Purpose |
|---|---|
| scales::col_numeric() | match continuous data to a continuous gradient of colors |
| scales::col_bin() | split continuous data into a specified number of bins with one color per bin |
| scales::col_quantile() | split continuous data into quantiles with one color per quantile |
| scales::col_factor() | match each level of factored data to a color |
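To get a feel for how these functions differ, it can help to call them directly, outside of gt. Each one returns a function that takes a vector of values and returns hex color codes. The sketch below reuses the case counts from the northeast example in this recipe; the variable names, the "Blues" and "Set2" palettes, and the choice of three bins are illustrative assumptions, not requirements.

```r
library(scales)

# Case counts reused from the northeast example in this recipe
cases <- c(885, 1160, 1437, 2345, 5384, 10208, 17001, 23253, 50724)

# col_numeric(): values are interpolated along a continuous gradient
numeric_pal <- col_numeric(palette = "Blues", domain = range(cases))
numeric_colors <- numeric_pal(cases)

# col_bin(): values are cut into roughly three equal-width bins
# (breaks are "pretty" numbers by default, so the bin count is approximate)
bin_pal <- col_bin(palette = "Blues", domain = range(cases), bins = 3)
bin_colors <- bin_pal(cases)

# col_quantile(): three bins that each hold about a third of the values
quantile_pal <- col_quantile(palette = "Blues", domain = cases, n = 3)
quantile_colors <- quantile_pal(cases)

# col_factor(): one color per level of a categorical variable
states <- factor(c("VT", "ME", "NH"))
factor_pal <- col_factor(palette = "Set2", domain = levels(states))
factor_colors <- factor_pal(states)

# Compare the numeric mappings side by side
data.frame(cases, numeric_colors, bin_colors, quantile_colors)
```

Because a single large value (NY) stretches the range, col_bin() places most states in the lightest bin, while col_quantile() spreads the states evenly across the colors.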
Color palettes
Run RColorBrewer::display.brewer.all() or
?viridis::viridis_pal() to see the names of the color
palettes provided by these packages.
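A palette can be supplied as a ColorBrewer name, a viridis name, or a vector of built-in colors. The following is a minimal sketch of all three, assuming the scales package is installed; the domain of 0 to 100, the variable names, and the specific colors are illustrative choices.

```r
library(scales)

# A ColorBrewer palette, referenced by name
ylgn_map <- col_numeric(palette = "YlGn", domain = c(0, 100))

# A viridis palette, referenced by name
viridis_map <- col_numeric(palette = "viridis", domain = c(0, 100))

# A custom gradient between two built-in R colors
custom_colors <- c("white", "steelblue")
stopifnot(all(custom_colors %in% colors()))  # confirm the names are built in
custom_map <- col_numeric(palette = custom_colors, domain = c(0, 100))

# Each function maps values in the domain to hex color codes
ylgn_map(c(0, 50, 100))
viridis_map(c(0, 50, 100))
custom_map(c(0, 50, 100))
```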
northeast contains the number of COVID-19 cases reported
by several states in the northeastern US in March of 2021.
northeast
## # A tibble: 9 × 2
## state total_cases
## <fct> <dbl>
## 1 VT 885
## 2 ME 1160
## 3 NH 1437
## 4 RI 2345
## 5 CT 5384
## 6 MA 10208
## 7 PA 17001
## 8 NJ 23253
## 9 NY 50724
Before we publish the table, we’d like to color the
total_cases column based on the number of total cases,
where darker shades of blue indicate a higher number of COVID-19 cases.
This will make it easy to spot variation between the states at a
glance.
To do this, we load the gt package and convert our data frame to a gt
table. Then we use data_color() to color the column with
col_numeric(), which matches the data values to a continuous
gradient of blues from RColorBrewer.
library(gt)
northeast %>%
  gt() %>%
  data_color(
    columns = total_cases,
    colors = scales::col_numeric(
      palette = "Blues",
      domain = NULL
    )
  )
| state | total_cases |
|---|---|
| VT | 885 |
| ME | 1160 |
| NH | 1437 |
| RI | 2345 |
| CT | 5384 |
| MA | 10208 |
| PA | 17001 |
| NJ | 23253 |
| NY | 50724 |
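In newer releases of gt (version 0.9.0 and later, as far as we are aware), the colors argument is deprecated and data_color() accepts method and palette arguments directly. Here is a sketch of the same table under that assumption, with northeast rebuilt by hand from the values printed above:

```r
library(gt)

# Rebuild the northeast data from the values printed above
state_names <- c("VT", "ME", "NH", "RI", "CT", "MA", "PA", "NJ", "NY")
northeast <- data.frame(
  state = factor(state_names, levels = state_names),
  total_cases = c(885, 1160, 1437, 2345, 5384, 10208, 17001, 23253, 50724)
)

northeast_tbl <- northeast %>%
  gt() %>%
  data_color(
    columns = total_cases,
    method = "numeric",  # assumes gt >= 0.9.0; plays the role of col_numeric()
    palette = "Blues"
  )

northeast_tbl
```

If your installed version of gt does not recognize method or palette, the colors = scales::col_numeric(...) form shown earlier is the one to use.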
Add cell colors based on cell values in SAS
data_color() is similar to setting SAS’s
STYLE = {BACKGROUND = } option to a custom format, which assigns a
background color to each value of the column based on specified
cutoffs.
In gt, if you wish to create color categories based on cutoff values
of a numeric variable, instead use a call to
gt::tab_style() to set the color for each set of values for
the column.
In SAS:
PROC FORMAT;
  VALUE color_format LOW -< 0   = 'RED'
                     0 -< 100   = 'YELLOW'
                     100 - HIGH = 'GREEN';
RUN;
PROC PRINT DATA = data;
  VAR col_A col_B;
  VAR col_C / STYLE = {BACKGROUNDCOLOR = color_format.};
RUN;
In R:
data %>%
  gt() %>%
  # columns picks the column to style; rows picks the cells that meet the cutoff
  tab_style(
    style = cell_fill(color = "red"),
    locations = cells_body(columns = col_C, rows = col_C < 0)
  ) %>%
  tab_style(
    style = cell_fill(color = "yellow"),
    locations = cells_body(columns = col_C, rows = col_C >= 0 & col_C < 100)
  ) %>%
  tab_style(
    style = cell_fill(color = "green"),
    locations = cells_body(columns = col_C, rows = col_C >= 100)
  )
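To try the tab_style() pattern end to end, here is a small self-contained sketch. The scores data frame and its values are hypothetical, made up for illustration; the cutoffs match the SAS format above. It assumes cells_body() receives the column to style in columns and the logical condition in rows.

```r
library(gt)

# Hypothetical data, invented for illustration
scores <- data.frame(
  col_A = c("a", "b", "c", "d"),
  col_C = c(-20, 0, 55, 140)
)

scores_tbl <- scores %>%
  gt() %>%
  # col_C below 0: red
  tab_style(
    style = cell_fill(color = "red"),
    locations = cells_body(columns = col_C, rows = col_C < 0)
  ) %>%
  # col_C from 0 up to (but not including) 100: yellow
  tab_style(
    style = cell_fill(color = "yellow"),
    locations = cells_body(columns = col_C, rows = col_C >= 0 & col_C < 100)
  ) %>%
  # col_C of 100 or more: green
  tab_style(
    style = cell_fill(color = "green"),
    locations = cells_body(columns = col_C, rows = col_C >= 100)
  )

scores_tbl
```

With these values, the first row should be filled red, the second and third yellow, and the fourth green.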