class: center, middle, inverse, title-slide

# Lec 10 - ggplot2 ecosystem & designing visualizations
## Statistical Programming
### Sem 1, 2020
### Dr. Colin Rundel

---
exclude: true

---
class: middle

# The ggplot2 ecosystem

---
class: middle

# ggthemes

---

## ggplot2 themes

```r
g = ggplot(palmerpenguins::penguins, aes(x=species, y=body_mass_g, fill=species)) +
  geom_boxplot()
```

.pull-left[
```r
g
```

<img src="Lec10_files/figure-html/theme2-1.png" width="50%" style="display: block; margin: auto;" />

```r
g + theme_dark()
```

<img src="Lec10_files/figure-html/theme3-1.png" width="50%" style="display: block; margin: auto;" />
]

.pull-right[
```r
g + theme_minimal()
```

<img src="Lec10_files/figure-html/theme4-1.png" width="50%" style="display: block; margin: auto;" />

```r
g + theme_void()
```

<img src="Lec10_files/figure-html/theme5-1.png" width="50%" style="display: block; margin: auto;" />
]

---

## ggthemes

.pull-left[
```r
g + ggthemes::theme_economist() +
  ggthemes::scale_fill_economist()
```

<img src="Lec10_files/figure-html/theme6-1.png" width="50%" style="display: block; margin: auto;" />

```r
g + ggthemes::theme_fivethirtyeight() +
  ggthemes::scale_fill_fivethirtyeight()
```

<img src="Lec10_files/figure-html/theme7-1.png" width="50%" style="display: block; margin: auto;" />
]

.pull-right[
```r
g + ggthemes::theme_gdocs() +
  ggthemes::scale_fill_gdocs()
```

<img src="Lec10_files/figure-html/theme8-1.png" width="50%" style="display: block; margin: auto;" />

```r
g + ggthemes::theme_wsj() +
  ggthemes::scale_fill_wsj()
```

<img src="Lec10_files/figure-html/theme9-1.png" width="50%" style="display: block; margin: auto;" />
]

---

## And for those who miss Excel

.pull-left[
```r
g + ggthemes::theme_excel() +
  ggthemes::scale_fill_excel()
```

<img src="Lec10_files/figure-html/theme10-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
```r
g + ggthemes::theme_excel_new() +
  ggthemes::scale_fill_excel_new()
```

<img src="Lec10_files/figure-html/theme11-1.png" width="100%" style="display: block; margin: auto;" />
]

---
class: middle

# GGally

---

```r
GGally::ggpairs(palmerpenguins::penguins)
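
# Hedged variation, not part of the original slide (the figure below shows
# only the call above): a sketch that limits the matrix to the four numeric
# measurement columns and colors by species. `columns` and `mapping` are
# ggpairs() arguments; the object name `pm` is made up for illustration.
pm = GGally::ggpairs(
  palmerpenguins::penguins,
  columns = c("bill_length_mm", "bill_depth_mm",
              "flipper_length_mm", "body_mass_g"),
  mapping = ggplot2::aes(color = species)
)
pm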
```

<img src="Lec10_files/figure-html/ggally-1.png" width="50%" style="display: block; margin: auto;" />

---
class: middle

<img src="imgs/hex-ggrepel.png" width="45%" style="display: block; margin: auto;" />

---

```r
d = tibble(
  car = rownames(mtcars),
  weight = mtcars$wt,
  mpg = mtcars$mpg
) %>%
  filter(weight > 2.75, weight < 3.45)
```

--

.pull-left[
```r
ggplot(d, aes(x=weight, y=mpg)) +
  geom_point(color="red") +
*  geom_text(
*    aes(label = car)
*  )
```

<img src="Lec10_files/figure-html/ggrepel2-1.png" width="65%" style="display: block; margin: auto;" />
]

.pull-right[
```r
ggplot(d, aes(x=weight, y=mpg)) +
  geom_point(color="red") +
*  ggrepel::geom_text_repel(
*    aes(label = car)
*  )
```

<img src="Lec10_files/figure-html/ggrepel3-1.png" width="65%" style="display: block; margin: auto;" />
]

---

```r
ggplot(d, aes(x=weight, y=mpg)) +
  geom_point(color="red") +
  ggrepel::geom_text_repel(
    aes(label = car),
*    nudge_x = .1, box.padding = 1, point.padding = 0.6,
*    arrow = arrow(length = unit(0.02, "npc")), segment.alpha = 0.25
  )
```

<img src="Lec10_files/figure-html/ggrepel5-1.png" width="65%" style="display: block; margin: auto;" />

---
class: middle

<img src="imgs/patchwork_logo.png" width="45%" style="display: block; margin: auto;" />

---

## Plot objects

```r
library(patchwork)

p1 = ggplot(palmerpenguins::penguins) +
  geom_boxplot(aes(x = island, y = body_mass_g))

p2 = ggplot(palmerpenguins::penguins) +
  geom_boxplot(aes(x = species, y = body_mass_g))

p3 = ggplot(palmerpenguins::penguins) +
  geom_point(aes(x = flipper_length_mm, y = body_mass_g, color = sex))

p4 = ggplot(palmerpenguins::penguins) +
  geom_point(aes(x = bill_length_mm, y = body_mass_g, color = sex))
```

---

```r
p1 + p2 + p3 + p4
```

<img src="Lec10_files/figure-html/pw_plot1-1.png" width="70%" style="display: block; margin: auto;" />

---

```r
p1 + p2 + p3 + p4 + plot_layout(nrow=1)
```

<img src="Lec10_files/figure-html/pw_plot2-1.png" width="70%" style="display: block; margin: auto;" />

---

```r
p1 / (p2 + p3
+ p4)
```

<img src="Lec10_files/figure-html/pw_plot3-1.png" width="70%" style="display: block; margin: auto;" />

---

```r
p1 + {
  p2 + {
    p3 + p4 + plot_layout(ncol = 1)
  }
} + plot_layout(ncol = 1)
```

<img src="Lec10_files/figure-html/pw_plot4-1.png" width="60%" style="display: block; margin: auto;" />

---

```r
p1 + p2 + p3 + p4 +
  plot_annotation(title = "Palmer Penguins", tag_levels = c("A","1"))
```

<img src="Lec10_files/figure-html/pw_plot5-1.png" width="70%" style="display: block; margin: auto;" />

---
class: middle

<img src="imgs/hex-gganimate.png" width="45%" style="display: block; margin: auto;" />

---

.pull-left[
```r
airq = airquality
airq$Month = month.name[airq$Month]

ggplot(
  airq,
  aes(Day, Temp, group = Month)
) +
  geom_line() +
  geom_segment(
    aes(xend = 31, yend = Temp),
    linetype = 2,
    colour = 'grey'
  ) +
  geom_point(size = 2) +
  geom_text(
    aes(x = 31.1, label = Month),
    hjust = 0
  ) +
*  gganimate::transition_reveal(Day) +
  coord_cartesian(clip = 'off') +
  labs(
    title = 'Temperature in New York',
    y = 'Temperature (°F)'
  ) +
  theme_minimal() +
  theme(plot.margin = margin(5.5, 40, 5.5, 5.5))
```
]

.pull-right[
<img src="imgs/gganim_weather.gif" style="display: block; margin: auto;" />
]

.footnote[https://github.com/thomasp85/gganimate]

---
class: center, middle

# Why do we visualize?
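---

One hedged way to motivate the question, offered as a minimal sketch rather than as part of the original slides: fit a separate least-squares line to each of the four x/y pairs in the built-in `datasets::anscombe` data and compare the coefficients. The helper name `fit_pair` is hypothetical, and only base R is assumed.

```r
# Hypothetical helper: fit y_i ~ x_i for the i-th pair of anscombe columns
fit_pair = function(i) {
  d = data.frame(
    x = datasets::anscombe[[paste0("x", i)]],
    y = datasets::anscombe[[paste0("y", i)]]
  )
  coef(lm(y ~ x, data = d))
}

# One column of coefficients per pair
coefs = sapply(1:4, fit_pair)
rownames(coefs) = c("intercept", "slope")
colnames(coefs) = paste0("group_", 1:4)

round(coefs, 2)
```

All four fits should give an intercept near 3 and a slope near 0.5, so the fitted line alone cannot tell the four datasets apart.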
---

## Anscombe's Quartet

```r
datasets::anscombe %>% as_tibble()
```

```
## # A tibble: 11 x 8
##       x1    x2    x3    x4    y1    y2    y3    y4
##    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
##  1    10    10    10     8  8.04  9.14  7.46  6.58
##  2     8     8     8     8  6.95  8.14  6.77  5.76
##  3    13    13    13     8  7.58  8.74 12.7   7.71
##  4     9     9     9     8  8.81  8.77  7.11  8.84
##  5    11    11    11     8  8.33  9.26  7.81  8.47
##  6    14    14    14     8  9.96  8.1   8.84  7.04
##  7     6     6     6     8  7.24  6.13  6.08  5.25
##  8     4     4     4    19  4.26  3.1   5.39 12.5 
##  9    12    12    12     8 10.8   9.13  8.15  5.56
## 10     7     7     7     8  4.82  7.26  6.42  7.91
## 11     5     5     5     8  5.68  4.74  5.73  6.89
```

---

## Tidy anscombe

.midi[
```r
(tidy_anscombe = datasets::anscombe %>%
  pivot_longer(everything(), names_sep = 1, names_to = c("var", "group")) %>%
  pivot_wider(id_cols = group, names_from = var,
              values_from = value, values_fn = list(value = list)) %>%
  unnest(cols = c(x,y)))
```

```
## # A tibble: 44 x 3
##    group     x     y
##    <chr> <dbl> <dbl>
##  1 1        10  8.04
##  2 1         8  6.95
##  3 1        13  7.58
##  4 1         9  8.81
##  5 1        11  8.33
##  6 1        14  9.96
##  7 1         6  7.24
##  8 1         4  4.26
##  9 1        12 10.8 
## 10 1         7  4.82
## # … with 34 more rows
```
]

---

.midi[
```r
tidy_anscombe %>%
  group_by(group) %>%
  summarize(
    mean_x = mean(x), mean_y = mean(y),
    sd_x = sd(x), sd_y = sd(y),
    cor = cor(x,y), .groups = "drop"
  )
```

```
## # A tibble: 4 x 6
##   group mean_x mean_y  sd_x  sd_y   cor
##   <chr>  <dbl>  <dbl> <dbl> <dbl> <dbl>
## 1 1          9   7.50  3.32  2.03 0.816
## 2 2          9   7.50  3.32  2.03 0.816
## 3 3          9   7.5   3.32  2.03 0.816
## 4 4          9   7.50  3.32  2.03 0.817
```
]

---

```r
ggplot(tidy_anscombe, aes(x = x, y = y, color = as.factor(group))) +
  geom_point(size=2) +
  facet_wrap(~group) +
  geom_smooth(method="lm", se=FALSE, fullrange=TRUE, formula = y~x) +
  guides(color="none")
```

<img src="Lec10_files/figure-html/unnamed-chunk-9-1.png" width="45%" style="display: block; margin: auto;" />

---

## DatasauRus

.pull-left-narrow[
```r
library(datasauRus)
```

```
## 
## Attaching package: 'datasauRus'
```

```
## The following object is masked _by_ '.GlobalEnv':
## 
##     datasaurus_dozen
```

```r
ggplot(
  datasaurus_dozen,
  aes(
    x = x, y = y,
    color = dataset
  )
) +
  geom_point() +
  facet_wrap(~dataset) +
  guides(color="none")
```
]

.pull-right-wide[
<img src="Lec10_files/figure-html/unnamed-chunk-11-1.png" width="75%" style="display: block; margin: auto;" />
]

---
class: middle

<img src="Lec10_files/figure-html/unnamed-chunk-12-1.png" width="1800" style="display: block; margin: auto;" />

---

.pull-left-narrow[
```r
datasauRus::datasaurus_dozen
```

```
## # A tibble: 1,846 x 3
##    dataset     x     y
##    <chr>   <dbl> <dbl>
##  1 dino     55.4  97.2
##  2 dino     51.5  96.0
##  3 dino     46.2  94.5
##  4 dino     42.8  91.4
##  5 dino     40.8  88.3
##  6 dino     38.7  84.9
##  7 dino     35.6  79.9
##  8 dino     33.1  77.6
##  9 dino     29.0  74.5
## 10 dino     26.2  71.4
## # … with 1,836 more rows
```
]

--

.pull-right-wide[
```r
datasaurus_dozen %>%
  group_by(dataset) %>%
  summarize(mean_x = mean(x), mean_y = mean(y),
            sd_x = sd(x), sd_y = sd(y),
            cor = cor(x,y), .groups = "drop")
```

```
## # A tibble: 12 x 6
##    dataset    mean_x mean_y  sd_x  sd_y     cor
##    <chr>       <dbl>  <dbl> <dbl> <dbl>   <dbl>
##  1 away         54.3   47.8  16.8  26.9 -0.0641
##  2 bullseye     54.3   47.8  16.8  26.9 -0.0686
##  3 circle       54.3   47.8  16.8  26.9 -0.0683
##  4 dino         54.3   47.8  16.8  26.9 -0.0645
##  5 dots         54.3   47.8  16.8  26.9 -0.0603
##  6 h_lines      54.3   47.8  16.8  26.9 -0.0617
##  7 high_lines   54.3   47.8  16.8  26.9 -0.0685
##  8 slant_down   54.3   47.8  16.8  26.9 -0.0690
##  9 slant_up     54.3   47.8  16.8  26.9 -0.0686
## 10 star         54.3   47.8  16.8  26.9 -0.0630
## 11 v_lines      54.3   47.8  16.8  26.9 -0.0694
## 12 wide_lines   54.3   47.8  16.8  26.9 -0.0666
```
]

---

```r
ggplot(datasauRus::datasaurus_dozen, aes(x = x, y = y, color = dataset)) +
  geom_point() +
  facet_wrap(~dataset) +
  guides(color="none")
```

<img src="Lec10_files/figure-html/unnamed-chunk-15-1.png" width="45%" style="display: block; margin: auto;" />

---

## Simpson's Paradox

.pull-left[
<img src="Lec10_files/figure-html/unnamed-chunk-17-1.png" width="1050" style="display: block; margin: auto;" />
]

--

.pull-right[
<img
  src="Lec10_files/figure-html/unnamed-chunk-18-1.png" width="1050" style="display: block; margin: auto;" />
]

---
class: center, middle

# Designing effective visualizations

---

## Keep it simple

.pull-left[
<img src="imgs/pie-3d.jpg" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
<img src="Lec10_files/figure-html/pie-to-bar-1.png" width="100%" style="display: block; margin: auto;" />
]

---

## Use color to draw attention

.pull-left[
<img src="Lec10_files/figure-html/unnamed-chunk-19-1.png" width="100%" style="display: block; margin: auto;" />
]

.pull-right[
<img src="Lec10_files/figure-html/unnamed-chunk-20-1.png" width="100%" style="display: block; margin: auto;" />
]

---

## Tell a story

<br/>

.pull-left[
<img src="imgs/time-series-story-1.png" width="98%" style="display: block; margin: auto;" />
]

--

.pull-right[
<img src="imgs/time-series-story-2.png" width="100%" style="display: block; margin: auto;" />
]

.footnote[
*Credit*: Angela Zoss and Eric Monson, Duke DVS
]

---

## Leave out non-story details

<br/>
<br/>

.pull-left[
<img src="imgs/vis_inj1.png" width="100%" style="display: block; margin: auto;" />
]

--

.pull-right[
<img src="imgs/vis_inj2.png" width="96%" style="display: block; margin: auto;" />
]

.footnote[
*Credit*: Angela Zoss and Eric Monson, Duke DVS
]

---

## Ordering matters

.pull-left[
<img src="imgs/vis_order1.png" width="80%" style="display: block; margin: auto;" />
]

.pull-right[
<img src="imgs/vis_order2.png" width="80%" style="display: block; margin: auto;" />
]

.footnote[
*Credit*: Angela Zoss and Eric Monson, Duke DVS
]

---

## Clearly indicate missing data

<br/>

<img src="imgs/vis_missing.png" width="100%" style="display: block; margin: auto;" />

.footnote[
http://ivi.sagepub.com/content/10/4/271, Angela Zoss and Eric Monson, Duke DVS
]

---

## Reduce cognitive load

<img src="imgs/vis_text.png" width="100%" style="display: block; margin: auto;" />

.footnote[
http://www.storytellingwithdata.com/2012/09/some-finer-points-of-data-visualization.html
]

---

## Use descriptive titles

.pull-left[
<img src="imgs/vis-title-1.png" width="80%" style="display: block; margin: auto;" />
]

.pull-right[
<img src="imgs/vis-title-2.png" width="80%" style="display: block; margin: auto;" />
]

.footnote[
*Credit*: Angela Zoss and Eric Monson, Duke DVS
]

---

## Annotate figures directly

<img src="imgs/vis_annotate.png" width="80%" style="display: block; margin: auto;" />

.footnote[
https://bl.ocks.org/susielu/23dc3082669ee026c552b85081d90976
]

---

## All of the data doesn't tell a story

<img src="imgs/vis_nyt1.png" width="60%" style="display: block; margin: auto;" />

.footnote[
http://www.nytimes.com/interactive/2014/06/05/upshot/how-the-recession-reshaped-the-economy-in-255-charts.html
]

---

## All of the data doesn't tell a story

<img src="imgs/vis_nyt2.png" width="60%" style="display: block; margin: auto;" />

.footnote[
http://www.nytimes.com/interactive/2014/06/05/upshot/how-the-recession-reshaped-the-economy-in-255-charts.html
]

---

## All of the data doesn't tell a story

<img src="imgs/vis_nyt3.png" width="60%" style="display: block; margin: auto;" />

.footnote[
http://www.nytimes.com/interactive/2014/06/05/upshot/how-the-recession-reshaped-the-economy-in-255-charts.html
]

---
class: middle, center

# Chart Remakes / Makeovers

---

## The Why Axis - BLS

<img src="imgs/vis_bls.gif" width="60%" style="display: block; margin: auto;" />

.footnote[
http://thewhyaxis.info/defaults/
]

---

## The Why Axis - Gender Gap

<img src="imgs/vis_gap.jpg" width="45%" style="display: block; margin: auto;" />

.footnote[
http://thewhyaxis.info/gap-remake/
]

---

## Acknowledgments

Above materials are derived in part from the following sources:

* Hadley Wickham - [R for Data Science](http://r4ds.had.co.nz/) & [Elegant Graphics for Data Analysis](https://ggplot2-book.org)
* [ggplot2 website](https://ggplot2.tidyverse.org/)
* Visualization training materials
developed by Angela Zoss and Eric Monson, [Duke DVS](http://libcms.oit.duke.edu/data/)