7. Random walks on graphs and Markov chains
In this chapter, we continue to analyze datasets in the form of networks. A powerful way to extract information about the structure of a network is to analyze the behavior of a random walk “diffusing” on it. We will frame such walks in the more general context of Markov chains, a class of stochastic processes with many applications in data science, including Markov chain Monte Carlo and PageRank. Here is a more detailed overview of the main sections of the chapter.
“Background: elements of finite Markov chains” This section introduces the concept of discrete-time Markov chains on a finite state space, focusing on the time-homogeneous case. It defines the transition matrix \(P\), which encodes the probabilities of transitioning from one state to another. The section also introduces the transition graph, a directed graph representation of the Markov chain. It emphasizes the role of the Markov Property, which states that, given the present, the past and the future are independent. In particular, the distribution of a sample path is derived using the Multiplication Rule and the Markov Property. It then derives the formula for the time marginals, showing that the marginal distribution of \(X_s\) is obtained from the initial distribution by multiplying by the \(s\)-th matrix power of \(P\). The section concludes with a numerical example demonstrating how to simulate a Markov chain.
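To make the simulation step and the time-marginal formula concrete, here is a minimal sketch in Python; the 3-state transition matrix, the seed, and the helper `simulate_chain` are illustrative choices, not the chapter's own code.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Illustrative 3-state transition matrix; each row sums to 1.
P = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.2, 0.2, 0.6],
])

def simulate_chain(P, x0, T, rng):
    """Sample a path X_0, ..., X_T starting from state x0."""
    path = [x0]
    for _ in range(T):
        # The next state is drawn from the row of P indexed by the current state.
        path.append(rng.choice(len(P), p=P[path[-1]]))
    return np.array(path)

print(simulate_chain(P, x0=0, T=10, rng=rng))

# Time marginal: the distribution of X_s is the initial distribution
# times the s-th matrix power of P.
mu0 = np.array([1.0, 0.0, 0.0])   # start deterministically in state 0
print(mu0 @ np.linalg.matrix_power(P, 10))
```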
“Limit behavior 1: stationary distributions” This section discusses the concept of a stationary distribution for finite-space, discrete-time, time-homogeneous Markov chains. It defines a stationary distribution as a probability distribution over the state space that remains unchanged after a transition step. The existence of a stationary distribution implies a form of equilibrium for the Markov chain. The section then introduces the notion of irreducibility, which means that every state can be reached from every other state, and proves that a finite irreducible Markov chain has a unique stationary distribution, all of whose entries are strictly positive. Finally, the section provides numerical methods for computing the stationary distribution.
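As a sketch of one such numerical method (reusing the illustrative matrix `P` above; the chapter's own approach may differ), the stationary distribution can be computed as a left eigenvector of \(P\) for eigenvalue 1, normalized to sum to 1.

```python
import numpy as np

P = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.2, 0.2, 0.6],
])

# Right eigenvectors of P^T are left eigenvectors of P.
eigvals, eigvecs = np.linalg.eig(P.T)
i = np.argmin(np.abs(eigvals - 1.0))   # eigenvalue closest to 1
pi = np.real(eigvecs[:, i])
pi = pi / pi.sum()                     # normalize to a probability vector

print(pi)              # stationary distribution
print(pi @ P - pi)     # residual; numerically close to zero
```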
“Limit behavior 2: convergence to equilibrium” This section discusses notions of convergence for finite-space, discrete-time Markov chains. After introducing the concept of aperiodicity, two key theorems are presented: the Convergence to Equilibrium Theorem, which establishes that, for irreducible and aperiodic chains, the distribution at time \(t\) converges to the stationary distribution; and the Ergodic Theorem, which applies to all irreducible chains and states that the long-run fraction of time spent in each state converges to that state’s probability under the stationary distribution. The section also provides numerical examples.
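Both theorems are easy to observe numerically. In the hedged sketch below (using the same illustrative chain as above, which is irreducible and aperiodic), every row of \(P^t\) approaches the stationary distribution, and the empirical visit frequencies along a single long path approach it as well.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
P = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.2, 0.2, 0.6],
])

# Convergence to equilibrium: all rows of P^t approach the stationary distribution.
print(np.linalg.matrix_power(P, 50))

# Ergodic theorem: fraction of time spent in each state along one long path.
T = 100_000
x = 0
visits = np.zeros(len(P))
for _ in range(T):
    visits[x] += 1
    x = rng.choice(len(P), p=P[x])
print(visits / T)
```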
“Application: random walks on graphs and PageRank” This section explores random walks on graphs, a powerful tool for understanding network structures. It begins by defining random walks on directed and undirected graphs, discussing their transition matrices, stationary distributions, and the concept of reversibility. The section then introduces PageRank as a way to identify central nodes in a directed graph, modeling the behavior of a random surfer clicking through web pages. The PageRank vector is computed as the stationary distribution of a modified random walk that incorporates a damping factor to ensure irreducibility. The section also discusses degree centrality for undirected graphs and introduces Personalized PageRank, which tailors the result to specific interests by modifying the teleportation distribution.
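Here is a hedged sketch of the PageRank computation via power iteration; the 4-page link matrix `A` and the damping factor value are made-up examples, and dangling pages (pages with no out-links) are avoided for simplicity.

```python
import numpy as np

A = np.array([          # A[i, j] = 1 if page i links to page j
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
])
n = A.shape[0]
alpha = 0.85            # damping factor

# Row-stochastic transition matrix of the plain random walk on the links.
Q = A / A.sum(axis=1, keepdims=True)

# Damped walk: with probability alpha follow a random out-link,
# otherwise teleport to a uniformly chosen page. This makes the chain
# irreducible, so a unique stationary distribution exists.
P = alpha * Q + (1 - alpha) * np.ones((n, n)) / n

# Power iteration for the stationary distribution (the PageRank vector).
pi = np.ones(n) / n
for _ in range(100):
    pi = pi @ P
print(pi)               # PageRank scores, summing to 1
```

Personalized PageRank would replace the uniform teleportation term with a distribution concentrated on the pages of interest.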
“Further applications: Gibbs sampling and generating images” This section discusses Markov Chain Monte Carlo (MCMC) methods, focusing on the Metropolis-Hastings algorithm and Gibbs sampling, which are used to generate samples from complex probability distributions. The Metropolis-Hastings algorithm constructs a Markov chain with the desired stationary distribution by using a proposal distribution and an acceptance probability that depends on the target distribution. Gibbs sampling is a special case of the Metropolis-Hastings algorithm that exploits the conditional independence structure of the target distribution, making it particularly useful for high-dimensional problems. The section illustrates the application of Gibbs sampling to generate natural-looking images of handwritten digits using a restricted Boltzmann machine (RBM) trained on the MNIST dataset.
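The following is a minimal Metropolis-Hastings sketch, not the chapter’s RBM example: random-walk proposals targeting an unnormalized one-dimensional density. The target, step size, and iteration count are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def p(x):
    # Unnormalized target: a mixture of two Gaussian bumps centered at +-2.
    return np.exp(-0.5 * (x - 2) ** 2) + np.exp(-0.5 * (x + 2) ** 2)

def metropolis_hastings(p, x0, steps, step_size, rng):
    samples = np.empty(steps)
    x = x0
    for t in range(steps):
        y = x + step_size * rng.standard_normal()  # symmetric proposal
        # Accept with probability min(1, p(y)/p(x)); the proposal
        # densities cancel because the proposal is symmetric.
        if rng.random() < p(y) / p(x):
            x = y
        samples[t] = x
    return samples

samples = metropolis_hastings(p, x0=0.0, steps=50_000, step_size=1.0, rng=rng)
print(samples.mean(), samples.std())  # roughly 0 and sqrt(5) for this target
```

Gibbs sampling replaces the accept/reject step with exact draws from each coordinate’s conditional distribution, which is what makes it convenient for models like RBMs whose conditionals factorize.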