This is an old version of the class, kept for posterity. If you're seeing this, you probably want to check the URL and go to the latest version.

Visualization challenge

The goal

So far, the exercises have done a lot of hand-holding in helping you to import and visualize networks. For this project, you will identify a network-based question that you think a visualization could help to answer, and create a visualization.

There are three parts of the assignment:

First, I want you to think about a question and draw on a piece of paper what you want the outcome to look like. For example, if you want to visualize whether people who have a certain attribute are more popular, you may decide to color nodes by in-degree and change their size or shape based on that attribute.
Second, you will do your best to recreate your idea using tidygraph and ggraph. I would like you to turn in both your drawing and your visualization.
Third, write a paragraph or two explaining what the visualization tells us about the question that you asked.
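As a rough illustration of the example in the first step, here is a minimal sketch that colors nodes by in-degree and sets their shape from an attribute. It runs on a tiny made-up network so that it works offline; the `toy_nodes` and `toy_edges` tables, the `drinks` column, and the `popularity` name are all hypothetical stand-ins, not columns from the course data.

```r
library(tidyverse)
library(igraph)
library(tidygraph)
library(ggraph)

# Toy directed friendship network; `drinks` is an invented node attribute
toy_nodes = data.frame(
  name = c("a", "b", "c", "d", "e"),
  drinks = c("often", "never", "often", "never", "often")
)
toy_edges = data.frame(
  from = c("a", "b", "c", "d", "e", "a"),
  to   = c("b", "c", "b", "b", "a", "c")
)

# Build the graph and store each node's in-degree as `popularity`
G_toy = graph_from_data_frame(d = toy_edges, vertices = toy_nodes) %>%
  as_tbl_graph() %>%
  mutate(popularity = centrality_degree(mode = "in"))

# Color shows in-degree; shape shows the attribute
p = ggraph(G_toy, layout = "fr") +
  geom_edge_link(arrow = arrow(length = unit(2, "mm")),
                 end_cap = circle(3, "mm")) +
  geom_node_point(aes(color = popularity, shape = drinks), size = 5) +
  theme_void()

p
```

Your own version will likely need a different layout, a legend that makes sense for your attribute, and some thought about which aesthetic (color, size, or shape) carries which variable.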

You are welcome to work with a partner if you would like. Just make it clear who you worked with. If you do pair up, I would encourage you to be a bit more ambitious in what you try to do.

Dutch School Data

In 2003 and 2004, researchers repeatedly surveyed a number of Dutch school students about their friendships and their behavior. They were particularly interested in the relationship between friendships and drinking behavior. They recorded information about alcohol use, gender, age, ethnicity (whether Dutch or not), and religion.

You may want to ask questions like:

Are people who drink more also more popular?
Are males or females more likely to have the same drinking behavior as their friends?
Are people of the dominant religion more likely to be popular? More likely to be friends with each other?
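Questions about whether friends resemble each other are questions about edges rather than nodes. One way to approach them, sketched here on a made-up network, is to flag each friendship by whether both ends share an attribute. The `drinks` column and the `same_drinking` name are hypothetical; check the real column names in the node file before adapting this.

```r
library(tidyverse)
library(igraph)
library(tidygraph)

# Toy directed friendship network; `drinks` is an invented node attribute
toy_nodes = data.frame(
  name = c("a", "b", "c", "d", "e"),
  drinks = c("often", "never", "often", "never", "often")
)
toy_edges = data.frame(
  from = c("a", "b", "c", "d", "e", "a"),
  to   = c("b", "c", "b", "b", "a", "c")
)

G_toy = graph_from_data_frame(d = toy_edges, vertices = toy_nodes) %>%
  as_tbl_graph()

# While the edges are active, .N() gives access to the node table,
# and `from` / `to` are row indices into it
edge_flags = G_toy %>%
  activate(edges) %>%
  mutate(same_drinking = .N()$drinks[from] == .N()$drinks[to]) %>%
  as_tibble()

# Share of friendships where both people have the same value
mean(edge_flags$same_drinking)
```

The same flag can be mapped to edge color in ggraph, which may be a more readable way to show the pattern than a single number.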

Resources

The Introduction to ggraph and tidygraph reading actually uses this dataset. You can look to that for examples to build on.

Data:

Descriptions of what each measure means are at this site, maintained by the people who collected the data.

To import the data, you can right-click the edge and node data files above, save them to your computer, and then import them into R.

Alternatively, the following code will download the files and create a graph object. You are welcome to reuse it.

library(tidyverse)
library(igraph)
library(tidygraph)
library(ggraph)

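# Read the node attributes and the edge list from the course site
nodes = read_csv('https://canf24.jeremydfoote.com/resources/school_graph_nodes.csv')
edges = read_csv('https://canf24.jeremydfoote.com/resources/school_graph_edges.csv')

# Build a directed igraph object, then convert it to a tidygraph tbl_graph
G = graph_from_data_frame(d = edges, vertices = nodes) %>% as_tbl_graph()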

Harry Potter Data

The authors of this paper coded all of the peer support relationships in the Harry Potter books. For each student, they also recorded their school year, gender, and house.

You may want to ask questions like:

Are students of the same gender more likely to support each other?
Are Slytherins less likely to provide support?
Are older students more likely to provide support to younger students?
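Questions about who provides support can start from a node-level summary. The sketch below, on a made-up network, counts outgoing edges per student and averages them by group. It assumes that each edge points from the person giving support to the person receiving it, which you should confirm against the data; the `house` values and the `support_given` name are invented for illustration.

```r
library(tidyverse)
library(igraph)
library(tidygraph)

# Toy support network; edges are assumed to run from supporter to supported
toy_nodes = data.frame(
  name = c("a", "b", "c", "d", "e"),
  house = c("Gryffindor", "Gryffindor", "Slytherin", "Slytherin", "Hufflepuff")
)
toy_edges = data.frame(
  from = c("a", "a", "b", "c", "e"),
  to   = c("b", "c", "a", "d", "a")
)

G_toy = graph_from_data_frame(d = toy_edges, vertices = toy_nodes) %>%
  as_tbl_graph()

# Out-degree = number of people each student supports, averaged by house
support_by_house = G_toy %>%
  mutate(support_given = centrality_degree(mode = "out")) %>%
  as_tibble() %>%
  group_by(house) %>%
  summarize(mean_support_given = mean(support_given))

support_by_house
```

A table like this is a useful check on what your visualization should show, but the assignment still asks for the picture itself.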

As with the Dutch school data, I’ve prepared some files and some code to get you started. These data files are for the fifth book. You may want to remove nodes that don’t have any edges (since those characters might not even appear in the book).

library(tidyverse)
library(igraph)
library(tidygraph)
library(ggraph)

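# Read the node attributes and the edge list from the course site
nodes = read_csv('https://canf24.jeremydfoote.com/resources/harry_potter/hp_node_atts.csv')
edges = read_csv('https://canf24.jeremydfoote.com/resources/harry_potter/hp_edgelist.csv')

# Build a directed igraph object, then convert it to a tidygraph tbl_graph
G = graph_from_data_frame(d = edges, vertices = nodes) %>% as_tbl_graph()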
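Removing nodes that have no edges can be done with a node filter. This is a minimal sketch on a made-up graph in which node "f" has no edges; the same two-line pipeline should work on `G`, though I have not tailored it to the Harry Potter files.

```r
library(tidyverse)
library(igraph)
library(tidygraph)

# Toy network in which "f" appears in the node list but has no edges
toy_nodes = data.frame(name = c("a", "b", "c", "d", "e", "f"))
toy_edges = data.frame(
  from = c("a", "b", "c", "d", "e"),
  to   = c("b", "c", "a", "a", "d")
)

G_toy = graph_from_data_frame(d = toy_edges, vertices = toy_nodes) %>%
  as_tbl_graph()

# Keep only nodes that have at least one edge
G_connected = G_toy %>%
  activate(nodes) %>%
  filter(!node_is_isolated())

G_connected
```

Filtering before you compute a layout usually gives a cleaner picture, since isolated nodes otherwise scatter around the edge of the plot.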