Thursday, March 7, 2013

A Title for Andy

How to Plot a Novelty Image Using ggplot2: ggoldy ggopher

ggplot2 produces lovely, high-quality data visualizations, often with minimal effort on the part of the R user, for which we are all grateful. In addition to its serious applications, however, the package has more frivolous, entertaining uses to explore, such as the creation of novelty plots.

This is a tutorial highlighting a more-fun-than-functional use for ggplot2: graphing a Big Ten mascot known for his “adorable chubby cheeks and toothy grin.”

You will need the ggplot2 package installed and loaded for this tutorial.

install.packages("ggplot2")  # only needed the first time
library(ggplot2)             # load the package for this session


Step One: Choosing an Image to Plot

As long as you can create or obtain (x,y) coordinates for your image, you should be able to create a ggplot version. Using a grid in a program like Photoshop, Inkscape, or even Paint can be helpful in determining the coordinates needed to render the image.
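
As a rough illustration of the grid idea, here is a minimal sketch that turns a tiny hand-drawn character grid into (x,y) coordinates with colors. The grid, the letter codes, and the names sketch, key, cells, filled, and pixels are all made up for this example; they are not part of the ggoldy data.

```r
# Hypothetical 3-row sketch: "b" = burlywood2, "m" = maroon, "." = empty.
sketch <- c(".bb.",
            "bmmb",
            ".bb.")
key <- c(b = "burlywood2", m = "maroon")

# Split each row into single characters; each string becomes a matrix row.
cells <- do.call(rbind, strsplit(sketch, ""))

# Row and column index of every cell that is not empty.
filled <- which(cells != ".", arr.ind = TRUE)

pixels <- data.frame(
  x = filled[, "col"],
  # Flip the rows so the first string ends up at the top of the plot.
  y = nrow(cells) - filled[, "row"] + 1,
  color = unname(key[cells[filled]]),
  stringsAsFactors = FALSE
)

print(pixels)
```

Each filled cell becomes one row of pixels, with an x value, a y value, and a color name.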


Step Two: Dataset Creation

Building on Step One, to create an image in ggplot2 you will need a dataset containing x values, y values, and the color that belongs at each (x,y) position.

Most of ggoldy is the color burlywood2 (you can find a useful chart of R color names here: http://www.stat.columbia.edu/~tzheng/files/Rcolor.pdf). The first point plotted from my data is a burlywood2-colored tile at (6,4).

If you have time to create graphs in the image of anthropomorphized rodents, you possibly have the free time to enter all of your coordinates into R by hand and create a dataframe (shown below). A smarter option would be entering your coordinates and corresponding colors into a .csv file and then loading that file into R using the read.csv() function.

x = c(6, 7, 5, 6, 7, 8, 9, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 9, 10, 11, 12, 
    1, 2, 3, 10, 11, 12, 1, 2, 3, 4, 5, 8, 9, 10, 11, 12, 13, 14, 1, 2, 3, 4, 
    9, 10, 11, 12, 13, 14, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 2, 
    6, 7, 11, 12, 2, 6, 7, 11, 12, 2, 6, 7, 11, 12, 1, 2, 3, 4, 5, 6, 7, 8, 
    9, 10, 11, 12, 13, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 6, 
    7, 5, 6, 7, 8, 3, 5, 8, 10, 4, 5, 9, 10, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 
    12)

y = c(4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7, 8, 
    8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 
    10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 
    12, 12, 12, 12, 12, 13, 13, 13, 13, 13, 14, 14, 14, 14, 14, 15, 15, 15, 
    15, 15, 15, 15, 15, 15, 15, 15, 15, 15, 16, 16, 16, 16, 16, 16, 16, 16, 
    16, 16, 16, 16, 16, 16, 16, 9, 9, 10, 10, 10, 10, 12, 12, 12, 12, 13, 13, 
    13, 13, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17, 17)

color = c("burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", 
    "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", 
    "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", 
    "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", 
    "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", 
    "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", 
    "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", 
    "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", 
    "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", 
    "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", 
    "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", 
    "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", 
    "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", 
    "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", 
    "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", 
    "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", 
    "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", 
    "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", 
    "burlywood2", "Maroon", "Maroon", "Maroon", "Maroon", "Maroon", "Maroon", 
    "Maroon", "Maroon", "Maroon", "Maroon", "Maroon", "Maroon", "Maroon", "Maroon", 
    "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2", 
    "burlywood2", "burlywood2", "burlywood2", "burlywood2", "burlywood2")

Step Three: Put Your Dataset Into a Dataframe

Once you have entered and stored the coordinates and colors you wish to plot, the data should be put into a dataframe.

Note: If you read your data in from a .csv file, you do not need to do this step.
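
When the vectors are typed by hand, a quick check before building the data frame can catch a dropped value or a misspelled color name. The following is a minimal sketch using three short stand-in vectors (x_demo, y_demo, and color_demo are hypothetical names, holding only the first three ggoldy points):

```r
# Stand-ins for the full x, y, and color vectors from Step Two.
x_demo     <- c(6, 7, 5)
y_demo     <- c(4, 4, 5)
color_demo <- c("burlywood2", "burlywood2", "burlywood2")

# All three vectors need one entry per tile.
stopifnot(length(x_demo) == length(y_demo),
          length(y_demo) == length(color_demo))

# col2rgb() stops with an error on a color name R does not recognize,
# so calling it on the unique names doubles as a typo check.
rgb_values <- col2rgb(unique(color_demo))
print(rgb_values)
```

Swapping in the real x, y, and color vectors should work the same way.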

goldy = data.frame(Gopher = x, Goldy = y, color = color)

head(goldy)
##   Gopher Goldy      color
## 1      6     4 burlywood2
## 2      7     4 burlywood2
## 3      5     5 burlywood2
## 4      6     5 burlywood2
## 5      7     5 burlywood2
## 6      8     5 burlywood2
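
If you go the .csv route mentioned in Step Two instead, the data frame comes straight from read.csv(). Here is a minimal sketch, assuming a file whose header row uses the same column names as above. The temporary file and the name goldy_csv are made up for the example, and only three of the points are included:

```r
# Write a tiny stand-in file; in practice this would be a .csv you
# created in a spreadsheet or text editor.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("Gopher,Goldy,color",
             "6,4,burlywood2",
             "7,4,burlywood2",
             "6,9,maroon"),
           csv_path)

# stringsAsFactors = FALSE keeps the color names as plain text
# (this is the default from R 4.0 on, but it was not in older versions).
goldy_csv <- read.csv(csv_path, stringsAsFactors = FALSE)

head(goldy_csv)
```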

Step Four: Writing the Code

First, specify the basic plot elements by passing your dataframe to the ggplot() function through its data= argument. This builds the base of the graph, telling ggplot where to pull the data from.

ggplot(data = goldy)

Next, add geom_tile() to tell ggplot how you want your data to look when it is plotted. geom_tile() draws each (x,y) coordinate as a rectangle centered on that point, so neighboring coordinates form solid blocks of color. geom_point(), by contrast, plots single points surrounded by white space.

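For the creation of ggoldy, the aesthetic function aes() takes:

  • x= the column of x coordinates (Gopher)
  • y= the column of y coordinates (Goldy)
  • fill= the column of color names (color)
  • width= the width of each tile (1 unit)

+geom_tile(aes(x = Gopher, y = Goldy, fill = color, width = 1))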
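
To see the difference between tiles and points side by side, here is a small sketch using just two neighboring coordinates from the start of the ggoldy data. The names demo, p_tile, and p_point are made up for the example, and the fill color is set as a constant rather than read from a column:

```r
library(ggplot2)

# Two neighboring points borrowed from the ggoldy data.
demo <- data.frame(Gopher = c(6, 7), Goldy = c(4, 4))

# geom_tile(): each point becomes a 1 x 1 rectangle, so the two touch.
p_tile <- ggplot(data = demo) +
  geom_tile(aes(x = Gopher, y = Goldy),
            fill = "burlywood2", width = 1, height = 1)

# geom_point(): each point is a dot with empty space around it.
p_point <- ggplot(data = demo) +
  geom_point(aes(x = Gopher, y = Goldy),
             colour = "burlywood2", size = 4)

print(p_tile)
print(p_point)
```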

Add scale_fill_identity() so ggplot knows to use the colors you specified in your dataset instead of picking default colors for you.

+scale_fill_identity()
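
A quick way to see what scale_fill_identity() changes is to draw the same tiles with and without it. This is only a sketch: the two-row data frame and the names demo, tiles, p_default, and p_identity are made up for the comparison.

```r
library(ggplot2)

# Two tiles with contrasting colors, chosen only for the demonstration.
demo <- data.frame(
  Gopher = c(6, 7),
  Goldy  = c(4, 4),
  color  = c("burlywood2", "maroon"),
  stringsAsFactors = FALSE
)

tiles <- ggplot(data = demo) +
  geom_tile(aes(x = Gopher, y = Goldy, fill = color))

# Without the identity scale, ggplot treats the color column as
# categories, picks its own palette, and adds a legend.
p_default <- tiles

# With it, the strings in the color column are used directly as fills.
p_identity <- tiles + scale_fill_identity()

print(p_default)
print(p_identity)
```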

The use of theme_bw() tells ggplot that you want a white background with light grey gridlines and a dark border for your graph, instead of the default grey background.

+theme_bw()

Once your code is complete, it should look like this:

ggplot(data = goldy) + geom_tile(aes(x = Gopher, y = Goldy, fill = color, width = 1)) + 
    scale_fill_identity() + theme_bw()

Final Step: Run Your Code

After all of that hard work, make sure ggplot2 is loaded, run your code and behold the image of our furry friend, ggoldy ggopher.

library(ggplot2)

ggplot(data = goldy) + geom_tile(aes(x = Gopher, y = Goldy, fill = color, width = 1)) + 
    scale_fill_identity() + theme_bw()

[Plot: ggoldy ggopher drawn with geom_tile()]
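
One optional tweak, which is not part of the code above: adding coord_fixed() asks ggplot to draw one x unit at the same length as one y unit, so the tiles stay square however the plot window is resized. Here is a minimal sketch on four of the ggoldy points (the names mini and p are made up for the example):

```r
library(ggplot2)

# Four points taken from the ggoldy data, to keep the sketch short.
mini <- data.frame(
  Gopher = c(6, 7, 6, 7),
  Goldy  = c(4, 4, 9, 9),
  color  = c("burlywood2", "burlywood2", "maroon", "maroon"),
  stringsAsFactors = FALSE
)

p <- ggplot(data = mini) +
  geom_tile(aes(x = Gopher, y = Goldy, fill = color, width = 1)) +
  scale_fill_identity() +
  coord_fixed() +   # keep a 1:1 aspect ratio so tiles are square
  theme_bw()

print(p)
```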

10 comments:

  1. Replies
    1. Sure, Romain! Thanks for asking.

    2. Here it is: http://gallery.r-enthusiasts.com/graph/ggoldy_ggopher_172

  2. This is great. You can save a lot of typing by using rep() in your code. For example, color <- c(rep("burlywood2",108), rep("Maroon", 14), rep("burlywood2",11))

    Replies
    1. Thanks, Jeffrey! That's a great suggestion; I'm all for tedium reduction in R!

  3. Just for reference, if you are only drawing the blocks, you can work a little more concisely by switching over to a more base set of functions like the 'grid' package:

    # Load library and start new plot.
    library(grid)
    grid.newpage()

    # Define plotting area.
    pushViewport(viewport(x=0, y=0, height=1, width=1, just=c("left","bottom"),xscale=c(min(x)-1,max(x)+1), yscale=c(min(y)-1,max(y)+1)))

    # Plot it!
    for (i in seq_along(x)) {
      grid.rect(x=x[i], y=y[i], width=1, height=1, default.units="native",
                gp=gpar(col=color[i], fill=color[i]))
    }

    Replies
    1. I suppose I should have mentioned that this works after Step 2.

    2. Thanks for the suggestion, dinre, your code looks intriguing. The love/hate relationship I have with R simultaneously makes me want to play with the code you provided and uninstall all things R related from my computer...I'm not sure which urge will win. Learning a different package is tempting, however.

  4. Fun!
    You could also save a lot of time in the dataset creation by using plot digitizer software like http://plotdigitizer.sourceforge.net/ to capture the dots' coordinates.

    regards
    PAscal

    Replies
    1. That's amazing; I have a special project in mind that this will be very helpful in setting up. Thanks, PAscal!
