The Jigsaw Puzzle Pieces - Creating the Graphs with ggplot2

This case is adapted from the process of creating an R Shiny app for a library survey.

This post is the second in a three-part series on building an interactive data visualization web app:

Part 1 - The Dirty Work
Part 2 - The Jigsaw Puzzle Pieces - Creating Graphs with ggplot2
Part 3 - Assembling the Pieces - Creating R Shiny App

Previously, as the first step, we did a lot of data cleaning, especially reshaping for later visualization. With the cleaned data in hand, let’s now build the jigsaw puzzle pieces: the single plots that will later be assembled into the Shiny app.

Along the way, we will encounter several chart types: stacked bars (count/percentage), grouped bars, lollipop charts, dot plots, heatmaps and word clouds. Because the goal of the survey is to find out how service usage is distributed across groups, most of the plots we rely on show the distribution of categories.

So nothing unusual or fancy. This post is really not about fancy plots; instead, we will see how to use the data we reshaped in the last post to tell stories.

First Steps

We will use the ggplot2 package for visualization throughout unless otherwise stated.

library(dplyr)
library(stringr)
library(data.table)
library(ggplot2)
library(wordcloud2)
library(gridExtra)
library(grid)

Sample data

Let’s load the sample survey data.

## setwd()
load("~/Desktop/r sample files/survey")

head(survey)

## # A tibble: 6 x 17
##   status  country major  Q1.1   Q1.2  Q1.3  top_reason     place_options  
##   <fct>   <fct>   <fct>  <chr>  <chr> <chr> <chr>          <chr>          
## 1 Freshm… U.S.    Undef… Never  Never Never Find a quiet … quiet (occasio…
## 2 Freshm… China   Undef… Occas… Often Often Find a quiet … crowded,focused
## 3 Freshm… U.S.    Undef… Never  Occa… Often Meet up with … (close to) sil…
## 4 Freshm… China   Undef… Occas… Occa… Occa… Meet up with … focused,(close…
## 5 Freshm… U.S.    Undef… Never  Occa… Occa… Find a quiet … (close to) sil…
## 6 Freshm… China   Undef… Occas… Often Often Find a quiet … (close to) sil…
## # ... with 9 more variables: space_lib <chr>, rank_crowded <dbl>,
## #   rank_modpop <dbl>, rank_noisy <dbl>, rank_quiet <dbl>,
## #   rank_silent <dbl>, rank_relaxed <dbl>, rank_focused <dbl>,
## #   workshops <chr>

Color palette

Let’s store our color palette in a list. This will make working with colors easier later when plotting.

palette <- list(purple = c("#351F39", "#351C4D", "#6c1f55", "#765285", "#8a6899"),
                turquoise = c("#709FB0", "#849974", "#A0C1B8"),
                golden = c("#D1A827", "#f3da4c"))
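As a small sketch of how this list is used later, the snippet below shows that positional access (as in the plots that follow) and named access return the same color. The named vector fill_colors is a hypothetical helper, not part of the original app; scale_fill_manual() matches a named values vector to the data levels by name.

```r
# Same palette as above, repeated so the snippet runs on its own
palette <- list(purple = c("#351F39", "#351C4D", "#6c1f55", "#765285", "#8a6899"),
                turquoise = c("#709FB0", "#849974", "#A0C1B8"),
                golden = c("#D1A827", "#f3da4c"))

# Positional and named access return the same color
palette[[1]][4]
palette$purple[4]

# Hypothetical helper: one color per response level, matched by name
fill_colors <- c(Never = palette$purple[4],
                 Occasionally = palette$turquoise[1],
                 Often = palette$golden[1])
fill_colors
```

Passing such a named vector as values = fill_colors keeps each level tied to its color even if a level is missing from a subset of the data.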

A Lot of Bars

We did quite a bit of cleaning in the last post to get the data structure ready for plotting. We don’t need to do that all the time, of course, if the current data structure already supports the graphing needs.

Here, if we want to show how often respondents of different majors use the library website to search for articles, we can do that right away. The website usage data is in Q1.3 (“I use our library website to (how often) - Search for articles”). The major data is in major.

Q1.3 and major both contain discrete values. The data structure is ready for visualizing the distribution of how students of each major use the library website to search for articles. Bar plots and their variations can easily achieve this goal.

tail(survey[c("Q1.3","major")])

## # A tibble: 6 x 2
##   Q1.3         major                                    
##   <chr>        <fct>                                    
## 1 Often        Data Science & Interactive Media Business
## 2 Occasionally Business, Finance & Economics            
## 3 Often        Interactive Media Arts                   
## 4 Often        Business, Finance & Economics            
## 5 Occasionally Humanities & Social Sciences             
## 6 Occasionally CS & Engineering

table(survey$Q1.3)

## 
##        Never Occasionally        Often 
##           77          126          117

table(survey$major)

## 
##             Business, Finance & Economics 
##                                       118 
##              Humanities & Social Sciences 
##                                        32 
## Data Science & Interactive Media Business 
##                                        24 
##                    Interactive Media Arts 
##                                        20 
##                                   Science 
##                                        21 
##                          CS & Engineering 
##                                        27 
##                               Mathematics 
##                                        21 
##                                 Undefined 
##                                        57

stacked bars (count)

ggplot(survey, aes(major)) + 
  geom_bar(aes(fill = Q1.3), position = position_stack(reverse = TRUE), width = 0.4, alpha = 0.75) +
## geom_bar() adds a layer of stacked bars to the plot
## aes(fill = Q1.3) fills the stacked bars with the counts of each value of Q1.3 (never/occasionally/often)
## position = position_stack(reverse = TRUE) reverses the order of the stacked bars
  scale_fill_manual(values = c(palette[[1]][4], palette[[2]][1], palette[[3]][1])) +
## manually fills the bars with the preset color scheme
  scale_x_discrete(limits = rev(levels(survey$major))) +
## reverse the order of levels of x axis  
  coord_flip() +
## flip the coordinates
  theme(axis.text.x = element_text(size = 12),
        axis.text.y = element_text(size = 12, margin = margin(0,3,0,0)),
        axis.title.y = element_blank(),
        axis.title.x = element_blank(),
        axis.ticks.x = element_line(size = 0),
        legend.title = element_blank(),
        legend.text = element_text(size = 10),
        plot.margin = unit(c(1,1,1,1), "cm"))
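One detail worth a sketch: Q1.3 is a character column, so ggplot2 orders its values alphabetically, which happens to match Never/Occasionally/Often. If the labels were different, we could fix the order explicitly with a factor. The toy data below is made up for illustration and stands in for the survey.

```r
library(ggplot2)

# Toy data standing in for the survey (values are made up)
toy <- data.frame(
  major = c("Science", "Science", "Mathematics", "Mathematics", "Mathematics"),
  Q1.3  = c("Often", "Never", "Occasionally", "Often", "Never")
)

# Fix the order of the response levels explicitly instead of relying on
# alphabetical order
toy$Q1.3 <- factor(toy$Q1.3, levels = c("Never", "Occasionally", "Often"))

p <- ggplot(toy, aes(major)) +
  geom_bar(aes(fill = Q1.3), position = position_stack(reverse = TRUE), width = 0.4) +
  coord_flip()
p
```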

stacked bars (percentage)

Alternatively, we can fill the stacked bars with the percentage of each value of Q1.3.

ggplot(survey, aes(major)) + 
  geom_bar(aes(fill = Q1.3), position = "fill", width = 0.4, alpha = 0.75) +
## position = "fill" sets the plot as stacked bars with filled percentages instead of counts
  scale_fill_manual(values = c(palette[[1]][4], palette[[2]][1], palette[[3]][1])) +
  scale_x_discrete(limits = rev(levels(survey$major))) +
  coord_flip() +
  theme(axis.text.x = element_text(size = 12),
        axis.text.y = element_text(size = 12, margin = margin(0,3,0,0)),
        axis.title.y = element_blank(),
        axis.title.x = element_blank(),
        axis.ticks.x = element_line(size = 0),
        legend.title = element_blank(),
        legend.text = element_text(size = 10),
        plot.margin = unit(c(1,1,1,1), "cm"))
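To see which numbers position = "fill" puts on the axis, we can compute the within-group shares by hand. This is only a sketch on made-up toy data; the names toy and props are hypothetical.

```r
library(dplyr)

# Toy data (made up): four Science respondents, two Mathematics respondents
toy <- data.frame(
  major = c("Science", "Science", "Science", "Science", "Mathematics", "Mathematics"),
  Q1.3  = c("Often", "Often", "Never", "Occasionally", "Never", "Often")
)

# Count each major/answer pair, then divide by the total within each major
props <- toy %>%
  count(major, Q1.3) %>%
  group_by(major) %>%
  mutate(prop = n / sum(n)) %>%
  ungroup()
props
```

Within each major the shares add up to 1, which is why every bar in the percentage plot has the same length.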

grouped bars

Or we can group the bars rather than stack them.

ggplot(survey, aes(major)) + 
  geom_bar(aes(fill = Q1.3), position = "dodge", width = 0.4, alpha = 0.75) +
## position = "dodge" sets the plot as grouped bars
  scale_fill_manual(values = c(palette[[1]][4], palette[[2]][1], palette[[3]][1])) +
  scale_x_discrete(limits = rev(levels(survey$major))) +
  coord_flip() +
  theme(axis.text.x = element_text(size = 12),
        axis.text.y = element_text(size = 12, margin = margin(0,3,0,0)),
        axis.title.y = element_blank(),
        axis.title.x = element_blank(),
        axis.ticks.x = element_line(size = 0),
        legend.title = element_blank(),
        legend.text = element_text(size = 10),
        plot.margin = unit(c(1,1,1,1), "cm"))
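The three bar charts share the same theme() call. One way to avoid repeating it, sketched here on made-up toy data, is to store the theme in an object and add it to each plot. The name bar_theme is hypothetical, and the axis tick setting is left out for brevity.

```r
library(ggplot2)

# Hypothetical name: the theme settings shared by the bar charts
bar_theme <- theme(axis.text.x = element_text(size = 12),
                   axis.text.y = element_text(size = 12, margin = margin(0, 3, 0, 0)),
                   axis.title.y = element_blank(),
                   axis.title.x = element_blank(),
                   legend.title = element_blank(),
                   legend.text = element_text(size = 10),
                   plot.margin = unit(c(1, 1, 1, 1), "cm"))

# Toy data (made up)
toy <- data.frame(major = c("Science", "Science", "Mathematics"),
                  Q1.3  = c("Often", "Never", "Often"))

# A stored theme is added to a plot like any other component
p <- ggplot(toy, aes(major)) +
  geom_bar(aes(fill = Q1.3), position = "dodge", width = 0.4) +
  coord_flip() +
  bar_theme
p
```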

Using Summary Stats

We made many frequency tables in the previous post. Time to put them to use.

lollipop chart

If we want to plot the distribution of major, that could be easily done with a bar chart. But we could also be easily bored by a bar chart. A lollipop chart, which is a variation of bar chart, can do the same thing for us with some visual diversity.

To make a lollipop chart, we need a frequency table that summarizes the counts of each major. We already did this in the last post; the code is repeated below.

tb <- survey %>% count(major) %>% data.frame() # generates a frequency table
tb <- tb %>% arrange(-n) %>% filter(major != "Undefined") # reorder by frequency; remove undefined majors
tb

##                                       major   n
## 1             Business, Finance & Economics 118
## 2              Humanities & Social Sciences  32
## 3                          CS & Engineering  27
## 4 Data Science & Interactive Media Business  24
## 5                                   Science  21
## 6                               Mathematics  21
## 7                    Interactive Media Arts  20
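For reference, count() can also sort the table itself through its sort argument. A minimal sketch on made-up toy data (tb_toy is a hypothetical name):

```r
library(dplyr)

# Toy data (made up)
toy <- data.frame(
  major = c("Science", "Mathematics", "Science", "Undefined", "Science", "Mathematics")
)

# Drop undefined majors, then count and sort by frequency in one step
tb_toy <- toy %>%
  filter(major != "Undefined") %>%
  count(major, sort = TRUE)
tb_toy
```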

Now let’s plot the lollipop chart from the frequency table.

ggplot(tb, aes(n, reorder(major, -n), label = n)) +
  geom_segment(aes(x = 0, y = reorder(major, -n), xend = n, yend = reorder(major, -n)), 
               size = 0.5, color = "grey50") +
  ## reorder(major, -n) orders major by its frequency in descending order
  geom_point(size = 8) +
  ## geom_point() draws a point at each (x, y) position, as in a scatterplot.
  ## Layers are drawn in the order they are added: if geom_point() came before geom_segment(), we would see the sticks stabbing the points.
  ## So the sequence actually matters here. 
  geom_text(color = "white", size = 3) +
  coord_flip() +
  theme(axis.text.x = element_text(size = 12, angle = 90, hjust = 1),
  ## The label of x axis would be too wide to lay out horizontally. hjust = 1 means right-justified.
        axis.text.y = element_text(size = 12),
        axis.title.y = element_blank(),
        axis.title.x = element_blank(),
        axis.ticks.x = element_line(size = 0))
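Stripped of the styling, a lollipop chart is just three layers. The sketch below uses a made-up toy table and keeps the segments as the first layer so that the points and labels are drawn on top of them.

```r
library(ggplot2)

# Toy frequency table (made up)
toy <- data.frame(major = c("A", "B", "C"), n = c(5, 3, 8))

p <- ggplot(toy, aes(x = n, y = reorder(major, n))) +
  # layer 1: the sticks, running from 0 to n
  geom_segment(aes(x = 0, xend = n, yend = reorder(major, n)), color = "grey50") +
  # layer 2: the candy, drawn over the end of each stick
  geom_point(size = 8) +
  # layer 3: the count, printed on top of the point
  geom_text(aes(label = n), color = "white", size = 3)
p
```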

word cloud

Another graph type that needs a frequency table is the word cloud, at least with the wordcloud2 package that we will be using. The good thing about graphs created by the wordcloud2 package is that when we hover over a word, a tooltip shows its frequency.

We have data from student submissions on “My favorite place to study in Library”.

head(survey$space_lib, n = 25)

##  [1] "desk"             "windows"          "428"             
##  [4] "428"              "talking"          "silent"          
##  [7] "windows"          "400"              "silent"          
## [10] "428"              "NULL"             "silent"          
## [13] "silent"           "windows"          "group study room"
## [16] "400"              "group study room" "group study room"
## [19] "400"              "407"              "407, silent"     
## [22] "other"            "other"            "group study room"
## [25] "NULL"

We first extract the strings and put them into a frequency table before plotting, as elaborated in the previous post.

lib <- unlist(strsplit(survey$space_lib, ","))
lib <- str_trim(lib, side = "both")
lib <- lib[!lib %in% c("NULL")]
lib <- data.frame(lib) %>% count(lib) %>% data.frame() %>% arrange(-n)
lib

##                 lib  n
## 1            silent 77
## 2               407 58
## 3           windows 47
## 4             other 31
## 5  group study room 23
## 6               400 22
## 7               428 17
## 8              desk 14
## 9              sofa 11
## 10       large desk  7
## 11           corner  6
## 12       small desk  4
## 13               4F  2
## 14            table  2
## 15           outlet  1
## 16          privacy  1
## 17          reserve  1
## 18          talking  1
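The same steps on a handful of made-up answers make it easier to see what each line does. This is a sketch only; answers, places and freq are hypothetical names, and trimws() from base R stands in for str_trim().

```r
library(dplyr)

# Made-up answers; one respondent gave two places separated by a comma
answers <- c("407, silent", "silent", "NULL", "windows", "407")

places <- unlist(strsplit(answers, ","))  # one element per place
places <- trimws(places)                  # drop the space left after the comma
places <- places[places != "NULL"]        # remove the placeholder for no answer

freq <- data.frame(place = places) %>% count(place, sort = TRUE)
freq
```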

Now we can plot the word cloud with two lines of code.

color1 <- rep(c(palette[[1]][4], palette[[2]][1], palette[[3]][1], palette[[2]][2], palette[[2]][3], palette[[3]][2]),
               length.out=nrow(lib))
## sets the color scheme to be applied in the wordcloud2() function

wordcloud2(lib, size=1.2, shape = "diamond", color = color1, ellipticity = 0.9)
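The rep() call above recycles a short color vector until there is one color per word. A minimal base R sketch with three of the palette colors and a made-up word count:

```r
# Three of the palette colors used above
base_colors <- c("#765285", "#709FB0", "#D1A827")

# Made-up number of words in the frequency table
n_words <- 7

# Repeat the colors in order until the vector has one entry per word
word_colors <- rep(base_colors, length.out = n_words)
word_colors
```

Words 1, 4 and 7 get the first color, words 2 and 5 the second, and so on.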

Using Info from Several Columns

heatmap

In our survey data, several variables (rank_crowded, rank_modpop, rank_noisy, rank_quiet, and rank_silent) ask whether library users would like to study in an environment that is “crowded”, “moderately populated”, “noisy”, “quiet (occasional whispers)”, or “(close to) silent”.

head(survey[c("rank_crowded","rank_modpop", "rank_noisy", "rank_quiet", "rank_silent")])

## # A tibble: 6 x 5
##   rank_crowded rank_modpop rank_noisy rank_quiet rank_silent
##          <dbl>       <dbl>      <dbl>      <dbl>       <dbl>
## 1           NA          NA         NA          1          NA
## 2            1          NA         NA         NA          NA
## 3           NA           3         NA         NA           1
## 4           NA          NA         NA         NA           2
## 5           NA          NA         NA          3           1
## 6           NA          NA          2          3           1

When plotting, we’d like to present the results all together in a matrix. We did some preprocessing to compute the average rank by group, bind the resulting columns, and reshape them into the long format that the heatmap requires.

rank_status <- 
  data.frame(survey %>% group_by(status) %>% summarise(rank_crowded = round(mean(rank_crowded, na.rm =TRUE),2))) %>% 
  left_join(data.frame(survey %>% group_by(status) %>% summarise(rank_modpop = round(mean(rank_modpop, na.rm =TRUE),2)))) %>%
  left_join(data.frame(survey %>% group_by(status) %>% summarise(rank_noisy = round(mean(rank_noisy, na.rm =TRUE),2)))) %>%
  left_join(data.frame(survey %>% group_by(status) %>% summarise(rank_quiet = round(mean(rank_quiet, na.rm =TRUE),2)))) %>%
  left_join(data.frame(survey %>% group_by(status) %>% summarise(rank_silent = round(mean(rank_silent, na.rm =TRUE),2)))) %>%
  rename(`crowded` = rank_crowded, `moderately populated` = rank_modpop, `noisy` = rank_noisy, `quiet (occasional whispers)` = rank_quiet, `(close to) silent` = rank_silent) 

rank_status <- rank_status %>% melt()

rank_status

##            status                    variable value
## 1        Freshman                     crowded  1.52
## 2       Sophomore                     crowded  1.60
## 3          Junior                     crowded  1.25
## 4          Senior                     crowded  3.42
## 5  Other Programs                     crowded  4.00
## 6        Freshman        moderately populated  2.43
## 7       Sophomore        moderately populated  2.36
## 8          Junior        moderately populated  1.67
## 9          Senior        moderately populated  3.21
## 10 Other Programs        moderately populated  3.50
## 11       Freshman                       noisy  3.08
## 12      Sophomore                       noisy  3.40
## 13         Junior                       noisy  2.00
## 14         Senior                       noisy  4.93
## 15 Other Programs                       noisy  7.00
## 16       Freshman quiet (occasional whispers)  1.90
## 17      Sophomore quiet (occasional whispers)  2.13
## 18         Junior quiet (occasional whispers)  1.45
## 19         Senior quiet (occasional whispers)  1.97
## 20 Other Programs quiet (occasional whispers)  1.91
## 21       Freshman           (close to) silent  1.76
## 22      Sophomore           (close to) silent  1.71
## 23         Junior           (close to) silent  1.89
## 24         Senior           (close to) silent  1.58
## 25 Other Programs           (close to) silent  1.93
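As an aside, the group means can also be computed in a single summarise() call. The sketch below is an alternative to the chain of joins, not the code used in the app; it assumes dplyr 1.0 or later for across() and runs on made-up toy data.

```r
library(dplyr)

# Toy data (made up): two rank columns, with NA where an option was not ranked
toy <- data.frame(
  status      = c("Freshman", "Freshman", "Senior", "Senior"),
  rank_quiet  = c(1, 3, 2, NA),
  rank_silent = c(NA, 1, 2, 2)
)

# Average every rank_* column within each status group, ignoring NAs
rank_toy <- toy %>%
  group_by(status) %>%
  summarise(across(starts_with("rank_"), ~ round(mean(.x, na.rm = TRUE), 2)))
rank_toy
```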

Now let’s plot the heatmap.

ggplot(data = rank_status, aes(x = variable, y = status)) +
  geom_tile(aes(fill = value), alpha = 0.95)+
  scale_fill_gradient(high = "white", low = palette[[2]][1]) +
  geom_text(aes(label = value), size = 2.5, alpha = 0.9)+
  scale_y_discrete(limits = rev(levels(rank_status$status))) +
  theme(axis.text.x = element_text(size = 12, angle = 90, hjust = 1),
        axis.text.y = element_text(size = 12, margin = margin(0,3,0,0)),
        axis.title.y = element_blank(),
        axis.title.x = element_blank(),
        axis.ticks.x = element_line(size = 0),
        plot.subtitle = element_text(size = 10, margin = margin(0,0,15,0)),
        legend.title = element_blank(),
        legend.position = "none",
        plot.margin = unit(c(1,1,1,1), "cm")) +
  labs(subtitle = "  * Average scores of ranking (1 to 6). Note: limited cases in each group.")

Visually Grouping Data

Let’s say we want to visualize the distribution of the top reasons for visiting the library (top_reason) by group. There are multiple ways to do that. We will use this subsample as a demo to show ways of visually grouping data.

Previously, we created a data frame, dtset, summarizing the top reasons for visiting the library by country. That took quite a bit of reshaping.

Reason <- c("Work on a class assignment/paper", "Watch video or listen audio", "Use specialized databases \\(e.g. Bloomberg, Wind\\)", "Use a library computer", "Use a group study room", "Print, photocopy, scan", "Other", "Meet up with friends", "Hang out between classes", "Get readings from Course Reserve", "Get help from a librarian", "Find a quiet place to study", "Borrow books and materials", "Attend a library workshop")

r <- data.frame(survey$country)
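## one column per reason: the matched text if the respondent chose it, NA otherwise
for (m in seq_along(Reason)){
  r[,Reason[m]] <- str_extract(survey$top_reason, Reason[m])
}

dtset <- data.frame(Reason, stringsAsFactors = TRUE)
## stringsAsFactors = TRUE keeps Reason a factor, so levels() also works in R 4.0 and later
levels(dtset$Reason)[levels(dtset$Reason) == "Use specialized databases \\(e.g. Bloomberg, Wind\\)"] <- "Use specialized databases (e.g. Bloomberg, Wind)"

g <- c("China", "U.S.", "Other")
for (n in seq_along(g)){
  ## for respondents from country g[n], count the non-missing matches in each reason column
  dtset[,g[n]] <- apply(r[which(r[[1]] == g[n]), 2:15], 2, function(x) length(which(!is.na(x))))
}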

dtset$Total <- rowSums(dtset[,2:4], na.rm = TRUE, dims = 1)
dtset <- dtset[order(dtset$Total, decreasing = TRUE),]

dtset

##                                              Reason China U.S. Other Total
## 12                      Find a quiet place to study   123   47    50   220
## 6                            Print, photocopy, scan   104   45    52   201
## 1                  Work on a class assignment/paper    89   46    31   166
## 13                       Borrow books and materials    63   15    28   106
## 10                 Get readings from Course Reserve    30    6     8    44
## 4                            Use a library computer    11   20    10    41
## 5                            Use a group study room    32    4     5    41
## 8                              Meet up with friends    23    4     9    36
## 9                          Hang out between classes    16    6     5    27
## 11                        Get help from a librarian    13    2     7    22
## 3  Use specialized databases (e.g. Bloomberg, Wind)    12    3     2    17
## 14                        Attend a library workshop     9    5     3    17
## 2                       Watch video or listen audio     7    2     2    11
## 7                                             Other     2    2     1     5
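The counting logic can be sketched more compactly with str_detect(). The toy answers below are made up, and wrapping each pattern in fixed() is a swapped-in technique: it makes stringr treat the reason as literal text, so the parentheses need no escaping.

```r
library(stringr)

# Toy data (made up): each respondent may list several reasons in one string
toy <- data.frame(
  country = c("China", "China", "U.S.", "Other"),
  top_reason = c("Find a quiet place to study,Print, photocopy, scan",
                 "Find a quiet place to study",
                 "Print, photocopy, scan",
                 "Use specialized databases (e.g. Bloomberg, Wind)"),
  stringsAsFactors = FALSE
)

reasons <- c("Find a quiet place to study",
             "Print, photocopy, scan",
             "Use specialized databases (e.g. Bloomberg, Wind)")

# Rows are reasons, columns are countries; each cell counts the respondents
# from that country whose answer contains the reason
counts <- sapply(c("China", "U.S.", "Other"), function(g) {
  sapply(reasons, function(p) {
    sum(str_detect(toy$top_reason[toy$country == g], fixed(p)))
  })
})
counts
```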

We will use this readily available data for all the demos below.

Coloring

Let’s say we want to plot the distribution of the top reasons for visiting and make the contrast among groups more visible. One way to achieve this is to color the levels in groups. Specifically, we decide to group the frequencies as “high”, “medium” and “low”.

We first code the levels.

Reason2 <- c("Work on a class assignment/paper", "Watch video or listen audio", "Use specialized databases (e.g. Bloomberg, Wind)", "Use a library computer",  "Use a group study room", "Print, photocopy, scan",  "Other", "Meet up with friends", "Hang out between classes", "Get readings from Course Reserve", "Get help from a librarian", "Find a quiet place to study", "Borrow books and materials", "Attend a library workshop")

l <- data.frame(Reason = Reason2, Level = c("High","Low","Low","Medium","Medium","High","Low","Medium","Low","Medium","Low","High","High","Low"))
## label the frequency level of each reason
## the cutoffs between levels are somewhat arbitrary

dtset <- dtset %>% left_join(l, by = "Reason")
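The levels could also be derived from the totals rather than typed in by hand. In the sketch below, the totals are copied from the dtset table above, and the cutoffs 30 and 100 are assumptions chosen so that cut() reproduces the hand-coded labels.

```r
# Totals copied from the dtset table above, in descending order
totals <- c(220, 201, 166, 106, 44, 41, 41, 36, 27, 22, 17, 17, 11, 5)

# Assumed cutoffs: up to 30 is "Low", above 30 up to 100 is "Medium",
# above 100 is "High"
level <- cut(totals,
             breaks = c(-Inf, 30, 100, Inf),
             labels = c("Low", "Medium", "High"))

table(level)
```

With these cutoffs, four reasons come out as High, four as Medium and six as Low, matching the manual coding.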

The rest is standard plotting.

ggplot(dtset, aes(x = reorder(Reason, Total), y = Total, fill = factor(Level, levels = c("High","Medium","Low")))) + 
  geom_bar(stat = "identity", alpha = 0.75) + 
  scale_fill_manual(values = c(palette[[1]][4], palette[[2]][1], palette[[3]][1]), name="Level of\nFrequency") +
##  \n in "Level of\nFrequency" breaks the line
  coord_flip() +
  theme(axis.text.x = element_text(size = 12),
        axis.text.y = element_text(size = 12, margin = margin(0,3,0,0)),
        axis.title.y = element_blank(),
        axis.title.x = element_text(size = 12, margin = margin(15,0,0,0)),
        axis.ticks.x = element_line(size = 0),
        legend.title = element_text(size = 12),
        legend.text = element_text(size = 12),
        plot.margin = unit(c(0,0,1,0), "cm"))

Facets

Now let’s say we want to plot the distribution of the top reasons for visiting by country. This can be done with “facets”: we will get one panel for each subgroup within a single plot.

To support the graphing needs, we need to reshape the dtset data frame into long format.

dtset2 <- dtset %>% melt()
dtset2

##                                              Reason  Level variable value
## 1                       Find a quiet place to study   High    China   123
## 2                            Print, photocopy, scan   High    China   104
## 3                  Work on a class assignment/paper   High    China    89
## 4                        Borrow books and materials   High    China    63
## 5                  Get readings from Course Reserve Medium    China    30
## 6                            Use a library computer Medium    China    11
## 7                            Use a group study room Medium    China    32
## 8                              Meet up with friends Medium    China    23
## 9                          Hang out between classes    Low    China    16
## 10                        Get help from a librarian    Low    China    13
## 11 Use specialized databases (e.g. Bloomberg, Wind)    Low    China    12
## 12                        Attend a library workshop    Low    China     9
## 13                      Watch video or listen audio    Low    China     7
## 14                                            Other    Low    China     2
## 15                      Find a quiet place to study   High     U.S.    47
## 16                           Print, photocopy, scan   High     U.S.    45
## 17                 Work on a class assignment/paper   High     U.S.    46
## 18                       Borrow books and materials   High     U.S.    15
## 19                 Get readings from Course Reserve Medium     U.S.     6
## 20                           Use a library computer Medium     U.S.    20
## 21                           Use a group study room Medium     U.S.     4
## 22                             Meet up with friends Medium     U.S.     4
## 23                         Hang out between classes    Low     U.S.     6
## 24                        Get help from a librarian    Low     U.S.     2
## 25 Use specialized databases (e.g. Bloomberg, Wind)    Low     U.S.     3
## 26                        Attend a library workshop    Low     U.S.     5
## 27                      Watch video or listen audio    Low     U.S.     2
## 28                                            Other    Low     U.S.     2
## 29                      Find a quiet place to study   High    Other    50
## 30                           Print, photocopy, scan   High    Other    52
## 31                 Work on a class assignment/paper   High    Other    31
## 32                       Borrow books and materials   High    Other    28
## 33                 Get readings from Course Reserve Medium    Other     8
## 34                           Use a library computer Medium    Other    10
## 35                           Use a group study room Medium    Other     5
## 36                             Meet up with friends Medium    Other     9
## 37                         Hang out between classes    Low    Other     5
## 38                        Get help from a librarian    Low    Other     7
## 39 Use specialized databases (e.g. Bloomberg, Wind)    Low    Other     2
## 40                        Attend a library workshop    Low    Other     3
## 41                      Watch video or listen audio    Low    Other     2
## 42                                            Other    Low    Other     1
## 43                      Find a quiet place to study   High    Total   220
## 44                           Print, photocopy, scan   High    Total   201
## 45                 Work on a class assignment/paper   High    Total   166
## 46                       Borrow books and materials   High    Total   106
## 47                 Get readings from Course Reserve Medium    Total    44
## 48                           Use a library computer Medium    Total    41
## 49                           Use a group study room Medium    Total    41
## 50                             Meet up with friends Medium    Total    36
## 51                         Hang out between classes    Low    Total    27
## 52                        Get help from a librarian    Low    Total    22
## 53 Use specialized databases (e.g. Bloomberg, Wind)    Low    Total    17
## 54                        Attend a library workshop    Low    Total    17
## 55                      Watch video or listen audio    Low    Total    11
## 56                                            Other    Low    Total     5
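For readers who prefer tidyr, pivot_longer() performs the same wide-to-long reshaping. This is a sketch of an alternative, not the code used in the app; tidyr is an assumption here, and the two rows are copied from the dtset table.

```r
library(tidyr)

# Two rows copied from dtset, in wide format
wide <- data.frame(
  Reason = c("Find a quiet place to study", "Print, photocopy, scan"),
  China  = c(123, 104),
  U.S.   = c(47, 45),
  Other  = c(50, 52),
  check.names = FALSE
)

# Every column except Reason becomes a variable/value pair
long <- pivot_longer(wide, cols = -Reason,
                     names_to = "variable", values_to = "value")
long
```

Unlike melt(), which stacks the output column by column, pivot_longer() keeps the rows for each Reason together; the content is the same.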

The facet_wrap() function allows us to plot by groups.

ggplot(dtset2, aes(value, reorder(Reason, value))) +
  geom_segment(aes(x = 0, y = reorder(Reason, value), xend = value, yend = reorder(Reason, value)), size = 0.3, color = "grey50") +
  geom_point(color = palette[[1]][4], size = 2) +
  facet_wrap(~variable, nrow = 2) +
## produces the facets in two rows  
  theme(axis.text.x = element_text(size = 10),
        axis.text.y = element_text(size = 10, margin = margin(0,5,0,0)),
        axis.title.y = element_blank(),
        axis.title.x = element_blank(),
        axis.ticks.x = element_line(size = 0),
        strip.text = element_text(size=12))
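By default all panels share the same axis range, so groups with small counts look compressed next to larger ones. If comparing shapes within each panel matters more than comparing magnitudes across panels, the scales argument of facet_wrap() can free the axis. A sketch on made-up toy data:

```r
library(ggplot2)

# Toy long-format data (reason names shortened, values for illustration)
toy <- data.frame(
  Reason   = rep(c("Study", "Print"), times = 2),
  variable = rep(c("China", "U.S."), each = 2),
  value    = c(123, 104, 47, 45)
)

# scales = "free_x" gives each panel its own x-axis range
p <- ggplot(toy, aes(value, reorder(Reason, value))) +
  geom_point() +
  facet_wrap(~variable, nrow = 1, scales = "free_x")
p
```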

Combining graphs

Another way of grouping plots is “combining graphs”. Facets split one plot into panels by subgroup, whereas combining places plots of different topics and types side by side in one figure.

For instance, we can combine “top reason of visiting library” and “distribution of survey participants”, but we would use faceting to plot “top reason of visiting library (U.S.)” and “top reason of visiting library (other international students)”.

Below we will create a combined plot that puts student submissions of “My favorite place to study in Library” next to user preference on study environment.

rank_status <- 
  data.frame(survey %>% group_by(status) %>% summarise(rank_crowded = round(mean(rank_crowded, na.rm =TRUE),2))) %>% 
  left_join(data.frame(survey %>% group_by(status) %>% summarise(rank_modpop = round(mean(rank_modpop, na.rm =TRUE),2)))) %>%
  left_join(data.frame(survey %>% group_by(status) %>% summarise(rank_noisy = round(mean(rank_noisy, na.rm =TRUE),2)))) %>%
  left_join(data.frame(survey %>% group_by(status) %>% summarise(rank_quiet = round(mean(rank_quiet, na.rm =TRUE),2)))) %>%
  left_join(data.frame(survey %>% group_by(status) %>% summarise(rank_silent = round(mean(rank_silent, na.rm =TRUE),2)))) %>%
  rename(`crowded` = rank_crowded, `moderately populated` = rank_modpop, `noisy` = rank_noisy, `quiet (occasional whispers)` = rank_quiet, `(close to) silent` = rank_silent) 

rank_status <- rank_status %>% melt()

c1 <-
      ggplot(rank_status, aes(x = variable, y = status)) +
        geom_tile(aes(fill = value), alpha = 0.95)+
        scale_fill_gradient(high = "white", low = palette[[2]][1]) +
        geom_text(aes(label = value), size = 2.5, alpha = 0.9)+
        scale_y_discrete(limits = rev(levels(rank_status$status))) +
        theme(axis.text.x = element_text(size = 12, angle = 90, hjust = 1),
              axis.text.y = element_text(size = 12, margin = margin(0,3,0,0)),
              axis.title.y = element_blank(),
              axis.title.x = element_blank(),
              axis.ticks.x = element_line(size = 0),
              plot.subtitle = element_text(size = 10, margin = margin(0,0,15,0)),
              legend.title = element_blank(),
              legend.position = "none",
              plot.margin = unit(c(1,1,1,1), "cm")) +
        labs(subtitle = "  * Average scores of ranking (1 to 6). Note: limited cases in each group.") 

c1

lib <- unlist(strsplit(survey$space_lib, ","))
lib <- str_trim(lib, side = "both")
lib <- lib[!lib %in% c("NULL")]
lib <- data.frame(lib) %>% count(lib) %>% data.frame() %>% arrange(-n)

c2 <-
  ggplot(lib, aes(n, reorder(lib, -n), label = n)) +
  geom_segment(aes(x = 0, y = reorder(lib, -n), xend = n, yend = reorder(lib, -n)), 
               size = 0.5, color = "grey50") +
  geom_point(size = 5) +
  geom_text(color = "white", size = 2) +
  coord_flip() +
  theme(axis.text.x = element_text(size = 12, angle = 90, hjust = 1),
        axis.text.y = element_text(size = 12),
        axis.title.y = element_blank(),
        axis.title.x = element_blank(),
        axis.ticks.x = element_line(size = 0))

c2

grid.arrange(c1, c2, nrow = 1, top = textGrob("Study Space Preference", gp = gpar(fontsize = 10)))
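If one of the plots needs more room, the layout can be adjusted with the widths argument. The sketch below uses arrangeGrob(), which builds the same layout as grid.arrange() but returns it as an object instead of drawing it right away. The toy plots are made up.

```r
library(ggplot2)
library(gridExtra)
library(grid)

# Two toy plots of different types (made-up data)
toy <- data.frame(x = c("a", "b"), y = c(1, 2))
p1 <- ggplot(toy, aes(x, y)) + geom_point()
p2 <- ggplot(toy, aes(x, y)) + geom_col()

# The left plot gets twice the width of the right plot
combined <- arrangeGrob(p1, p2, nrow = 1, widths = c(2, 1),
                        top = textGrob("Toy title", gp = gpar(fontsize = 10)))

# Draw the stored layout
grid.newpage()
grid.draw(combined)
```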

You may say that the combination is a bit arbitrary and the combined plot is not so beautiful. But I want to show the rationale for “combining graphs” rather than using “facets”.