Communities in Networks

Networks are “groupy”

  • Network formation processes lead to “groupy” networks
    • Foci (Scott Feld)
    • Triadic closure
    • Homophily

We may want to identify groups

  • Understanding where information is likely to flow / be held in common
  • Social media marketers identifying “interest groups”
  • Researchers understanding a scientific field
  • Understanding relationships between groups

Early approaches to identify groups

  • Components
    • Everyone connected in any way
    • Usually not useful: in most networks (nearly) everyone ends up in one large component
  • Cliques
    • Everyone in the group is directly connected to everyone else in the group
    • Expanded to n-cliques (everyone in the group is within distance n of everyone else)
    • n-cliques tend to produce long, stringy groupings
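
A minimal sketch of the three definitions above, using only the Python standard library. The toy edge list and the function names (distances_from, connected_components, is_clique, is_n_clique) are illustrative assumptions, not part of the original slides. The n-clique check tests only the distance condition, not whether the group is maximal.

```python
from collections import deque
from itertools import combinations

def distances_from(adj, source):
    """Breadth-first search: shortest path length from source to every reachable node."""
    dist, queue = {source: 0}, deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def connected_components(adj):
    """Component: everyone connected in any way."""
    seen, components = set(), []
    for start in adj:
        if start not in seen:
            component = set(distances_from(adj, start))
            seen |= component
            components.append(component)
    return components

def is_clique(adj, group):
    """Clique: every pair in the group is directly connected."""
    return all(v in adj[u] for u, v in combinations(group, 2))

def is_n_clique(adj, group, n):
    """n-clique condition: every pair in the group is within distance n."""
    for u in group:
        dist = distances_from(adj, u)
        if any(dist.get(v, float("inf")) > n for v in group):
            return False
    return True

# Hypothetical toy network: triangle a-b-c, a tail c-d, and a separate pair e-f
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("e", "f")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

components = connected_components(adj)
print(components)
print(is_clique(adj, {"a", "b", "c"}))            # the triangle is a clique
print(is_clique(adj, {"a", "b", "c", "d"}))       # the tail breaks the clique
print(is_n_clique(adj, {"a", "b", "c", "d"}, 2))  # but everyone is within 2 steps
```

In this toy network, relaxing from cliques to 2-cliques pulls the tail node d into the group, which is a small version of the "stringy" problem.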

Clique Example

Early approaches (continued)

  • K-plexes
    • Each member is connected to at least (group size − k) of the group's members, so a 1-plex is a clique
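
A minimal sketch of a k-plex check, assuming the common convention that each member must have at least (group size − k) neighbours inside the group. The function name is_k_plex and the toy 4-cycle are illustrative assumptions.

```python
def is_k_plex(adj, group, k):
    """True if every member has at least len(group) - k neighbours inside the group."""
    group = set(group)
    return all(len(adj[u] & group) >= len(group) - k for u in group)

# Hypothetical toy network: a 4-cycle a-b-c-d-a (no diagonals)
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

group = {"a", "b", "c", "d"}
print(is_k_plex(adj, group, 1))  # a 1-plex is a clique; the 4-cycle is not one
print(is_k_plex(adj, group, 2))  # each member has 2 of the 4 - 2 = 2 required ties
```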

Modern approaches - community detection

  • Modularity
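    • Find groupings with more edges within communities, and fewer between them, than expected by chance
  • Agglomeration
    • Start with separate nodes and repeatedly merge nodes/groups that are closely connected
  • Reduction
    • Repeatedly remove edges with high betweenness until the network splits into groups
  • Random walk
    • “Walk” through the network many times; short walks tend to stay within the same community
  • Results from the different methods are often quite similar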
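
To make the modularity idea concrete, here is a minimal sketch for an undirected, unweighted network, assuming the usual formula Q = Σ_c [ L_c / m − (d_c / 2m)² ], where m is the number of edges, L_c the number of edges inside community c, and d_c the total degree of its members. The function name and the toy network (two triangles joined by one edge) are illustrative assumptions.

```python
def modularity(edges, community_of):
    """Modularity of a partition; community_of maps each node to a community label."""
    m = len(edges)
    internal = {}    # edges with both ends in the same community
    degree_sum = {}  # total degree of each community's members
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())

# Hypothetical toy network: triangles a-b-c and d-e-f joined by the edge c-d
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]

split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
lumped = {node: 0 for node in split}

q_split = modularity(edges, split)    # 5/14, about 0.357
q_lumped = modularity(edges, lumped)  # 0: no better than chance
print(q_split, q_lumped)
```

Splitting the network at the bridge scores higher than putting everyone in one group, which matches the intuition of many edges within and few edges between communities.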

Fast Greedy Example
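
A naive sketch of greedy agglomeration by modularity: start with every node in its own community and keep applying the merge that raises modularity the most. This is written for readability and is not the optimized fast greedy algorithm found in network libraries; the function names and the toy network are illustrative assumptions.

```python
from itertools import combinations

def modularity(edges, community_of):
    """Q = sum over communities of (internal edges / m) - (total degree / 2m)^2."""
    m = len(edges)
    internal, degree_sum = {}, {}
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())

def greedy_merge(edges):
    """Merge the pair of communities that most improves modularity until none does."""
    nodes = {n for edge in edges for n in edge}
    community_of = {n: n for n in nodes}  # each node starts as its own community
    best_q = modularity(edges, community_of)
    while True:
        labels = sorted(set(community_of.values()))
        best_merge = None
        for a, b in combinations(labels, 2):
            # Trial partition with community b folded into community a
            trial = {n: (a if c == b else c) for n, c in community_of.items()}
            q = modularity(edges, trial)
            if q > best_q + 1e-12:  # small tolerance against floating-point ties
                best_q, best_merge = q, trial
        if best_merge is None:
            return community_of, best_q
        community_of = best_merge

# Hypothetical toy network: triangles a-b-c and d-e-f joined by the edge c-d
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]

communities, q = greedy_merge(edges)
print(communities, q)
```

On this toy network the procedure recovers the two triangles as separate communities.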

Core and Periphery

Many groups have a core and a periphery

  • New people start on the periphery (Legitimate Peripheral Participation)
  • People leaving move to the periphery
  • Different levels of dedication / resources
  • Evolutionary perspective
    • In order for groups to survive, there has to be a core

Implications

  • Those in the core are more likely to have knowledge and experience, and to stick around
  • Identify the most dedicated group
  • Identify those in danger of leaving / needing help

Ways of measuring

  • Block modeling

Ways of measuring (continued)

  • K-cores
    • Every node gets a value: the largest k such that the node belongs to a subgraph in which every node has degree of at least k within that subgraph
  • Rich-club coefficient
    • How densely are the nodes with high degree connected to each other?
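
A minimal sketch of both measures, using only the Python standard library. Core numbers are found by repeatedly peeling off the lowest-degree node; the rich-club coefficient here is the raw (unnormalized) density among nodes with degree greater than k. The function names and the toy network are illustrative assumptions.

```python
def core_numbers(adj):
    """Peel off the lowest-degree node repeatedly; a node's core number is the
    highest minimum degree seen up to the moment it is removed."""
    degree = {n: len(nbrs) for n, nbrs in adj.items()}
    remaining, core, k = set(adj), {}, 0
    while remaining:
        node = min(remaining, key=lambda n: degree[n])
        k = max(k, degree[node])
        core[node] = k
        remaining.remove(node)
        for nbr in adj[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core

def rich_club(adj, k):
    """Share of possible ties that exist among nodes with degree greater than k."""
    rich = {n for n, nbrs in adj.items() if len(nbrs) > k}
    if len(rich) < 2:
        return None  # undefined with fewer than two "rich" nodes
    ties = sum(len(adj[n] & rich) for n in rich) / 2  # each tie is counted twice
    return 2 * ties / (len(rich) * (len(rich) - 1))

# Hypothetical toy network: a fully connected core a, b, c, d with a chain d-e-f
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
         ("d", "e"), ("e", "f")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

core = core_numbers(adj)
print(core)               # a, b, c, d get 3; e and f get 1
print(rich_club(adj, 2))  # nodes with degree > 2 are fully connected: 1.0
print(rich_club(adj, 1))  # adding e dilutes the club: 7 of 10 possible ties
```

Nodes with high core numbers are candidates for the core; low values mark the periphery. In practice the rich-club coefficient is often compared against randomized networks with the same degrees, which this sketch does not do.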