## Peter Robinson

I'm interested in designing new distributed and parallel algorithms, the distributed processing of big data, achieving fault-tolerance in networks, and secure distributed computing in dynamic environments such as peer-to-peer networks and mobile ad-hoc networks.

## News

- General Chair of ACM PODC 2019
- Program committee member of BGP 2017, SPAA 2016, SIROCCO 2016
- Giving a talk at a workshop on Dynamic Graphs in Distributed Computing (co-located with DISC 2016)
- Co-chairing the program committee of ICDCN 2016
- Giving a talk at ADGA 2015 (4th Workshop on Advances in Distributed Graph Algorithms, co-located with DISC 2015)

## Keywords

Asynchrony, Big Data, Byzantine Failures, Churn, Communication Complexity, Distributed Agreement, Distributed Storage, Dynamic Network, Fault-Tolerance, Gossip Communication, Graph Algorithm, Haskell, Leader Election, Machine Learning, Mobile Ad-Hoc Network, Natural Language Processing, P2P, Secure Computation, Self-Healing, Symmetry Breaking

## Publications

2017

- Symmetry Breaking in the Congest Model: Message- and Time-Efficient Algorithms for Ruling Sets.

Shreyas Pai, Gopal Pandurangan, Sriram V. Pemmaraju, Talal Riaz, Peter Robinson. (under review)

Abstract: We study local symmetry breaking problems in the Congest model, focusing on ruling set problems, which generalize the fundamental Maximal Independent Set (MIS) problem. The time (round) complexity of MIS (and ruling sets) has attracted much attention in the Local model. Indeed, recent results (Barenboim et al., FOCS 2012; Ghaffari, SODA 2016) for the MIS problem have tried to break the long-standing $O(\log n)$-round "barrier" achieved by Luby's algorithm, but these yield $o(\log n)$-round complexity only when the maximum degree $\Delta$ is somewhat small relative to $n$. More importantly, these results apply only in the Local model. In fact, the best known time bound in the Congest model is still $O(\log n)$ (via Luby's algorithm) even for somewhat small $\Delta$. Furthermore, message complexity has been largely ignored in the context of local symmetry breaking. Luby's algorithm takes $O(m)$ messages on $m$-edge graphs and this is the best known bound with respect to messages. Our work is motivated by the following central question: can we break the $\Theta(m)$ message bound and the $\Theta(\log n)$ time bound in the Congest model for MIS or closely-related symmetry breaking problems? This paper presents progress towards this question for the distributed ruling set problem in the Congest model. A $\beta$-ruling set is an independent set such that every node in the graph is at most $\beta$ hops from a node in the independent set. We present the following results:

1. Time Complexity: We show that we can break the $O(\log n)$ "barrier" for 2- and 3-ruling sets. We compute 3-ruling sets in $O\left(\log n/\log \log n\right)$ rounds with high probability (whp). More generally we show that 2-ruling sets can be computed in $O\left(\log \Delta \cdot (\log n)^{1/2 + \varepsilon} + \log n/\log\log n\right)$ rounds for any $\varepsilon > 0$, which is $o(\log n)$ for a wide range of $\Delta$ values (e.g., $\Delta = 2^{(\log n)^{1/2-\varepsilon}}$). These are the first 2- and 3-ruling set algorithms to improve over the $O(\log n)$-round complexity of Luby's algorithm in the Congest model.
2. Message Complexity: We show an $\Omega(n^2)$ lower bound on the message complexity of computing an MIS (i.e., 1-ruling set) which holds also for randomized algorithms and present a contrast to this by showing a randomized algorithm for 2-ruling sets that, whp, uses only $O(n \log^2 n)$ messages and runs in $O(\Delta \log n)$ rounds. This is the first message-efficient algorithm known for ruling sets, which takes near-linear message complexity (which is optimal up to a polylogarithmic factor).

Our results are a step toward understanding the time and message complexity of symmetry breaking problems in the Congest model.
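
The definition of a $\beta$-ruling set in the abstract above can be made concrete with a small centralized checker. This is only an illustrative sketch, not one of the distributed algorithms from the paper; the function name `is_ruling_set` and the adjacency-dictionary graph representation are assumptions made for the example.

```python
from collections import deque


def is_ruling_set(adj, candidate, beta):
    """Check whether `candidate` is a beta-ruling set of an undirected graph.

    `adj` maps every node to an iterable of its neighbours (hypothetical
    representation chosen for this sketch).
    """
    members = set(candidate)

    # Independence: no edge may join two members of the set.
    for u in members:
        if any(v in members for v in adj[u]):
            return False

    # Multi-source BFS from all members, truncated at depth beta.
    dist = {u: 0 for u in members}
    queue = deque(members)
    while queue:
        u = queue.popleft()
        if dist[u] == beta:
            continue  # do not explore beyond beta hops
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)

    # Every node must have been reached within beta hops.
    return len(dist) == len(adj)


# Path graph 0 - 1 - 2 - 3 - 4
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
```

On this path, `{0, 4}` is a 2-ruling set but not a 1-ruling set (node 2 is two hops from both members), while `{0, 2, 4}` is a 1-ruling set, that is, an MIS.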

2016

- Efficient Computation of Sparse Structures

David G. Harris, Ehab Morsy, Gopal Pandurangan, Peter Robinson, Aravind Srinivasan. Random Structures & Algorithms (RSA).

Abstract: Basic graph structures such as maximal independent sets (MIS's) have spurred much theoretical research in randomized and distributed algorithms, and have several applications in networking and distributed computing as well. However, the extant (distributed) algorithms for these problems do not necessarily guarantee fault-tolerance or load-balance properties. We propose and study "low-average degree" or "sparse" versions of such structures. Interestingly, in sharp contrast to, say, MIS's, it can be shown that checking whether a structure is sparse will take substantial time. Nevertheless, we are able to develop good sequential/distributed (randomized) algorithms for such sparse versions. We also complement our algorithms with several lower bounds. Randomization plays a key role in our upper and lower bound results.

2015

- On the Complexity of Universal Leader Election

Shay Kutten, Gopal Pandurangan, David Peleg, Peter Robinson, Amitabh Trehan. Journal of the ACM, vol. 62(1), 7:1-7:27 (JACM).

Abstract: Electing a leader is a fundamental task in distributed computing. In its implicit version, only the leader must know who is the elected leader. This paper focuses on studying the message and time complexity of randomized implicit leader election in synchronous distributed networks. Surprisingly, the most "obvious" complexity bounds have not been proven for randomized algorithms. The "obvious" lower bounds of $\Omega(m)$ messages ($m$ is the number of edges in the network) and $\Omega(D)$ time ($D$ is the network diameter) are non-trivial to show for randomized (Monte Carlo) algorithms. (Recent results that show that even $\Omega(n)$ ($n$ is the number of nodes in the network) is not a lower bound on the messages in complete networks make the above bounds somewhat less obvious.) To the best of our knowledge, these basic lower bounds have not been established even for deterministic algorithms (except for the limited case of comparison algorithms, where it was also required that some nodes may not wake up spontaneously, and that $D$ and $n$ were not known). We establish these fundamental lower bounds in this paper for the general case, even for randomized Monte Carlo algorithms. Our lower bounds are universal in the sense that they hold for all universal algorithms (such algorithms should work for all graphs), apply to every $D$, $m$, and $n$, and hold even if $D$, $m$, and $n$ are known, all the nodes wake up simultaneously, and the algorithms can make any use of nodes' identities. To show that these bounds are tight, we present an $O(m)$ messages algorithm. An $O(D)$ time algorithm is known. An interesting fundamental problem is whether both upper bounds (messages and time) can be reached simultaneously in the randomized setting for all graphs. (The answer is known to be negative in the deterministic setting.) We answer this problem partially by presenting a randomized algorithm that matches both complexities in some cases. This already separates (for some cases) randomized algorithms from deterministic ones. As first steps towards the general case, we present several universal leader election algorithms with bounds that trade off messages versus time. We view our results as a step towards understanding the complexity of universal leader election in distributed networks.

2014

- Optimal Bounds for Randomized Leader Election

Shay Kutten, Gopal Pandurangan, David Peleg, Peter Robinson, Amitabh Trehan. Special Issue of Theoretical Computer Science, Elsevier. (TCS).

Abstract: This paper concerns randomized leader election in synchronous distributed networks. A distributed leader election algorithm is presented for complete $n$-node networks that runs in $O(1)$ rounds and (with high probability) uses only $O(\sqrt{n}\log^{3/2} n)$ messages to elect a unique leader (with high probability). When considering the "explicit" variant of leader election where eventually every node knows the identity of the leader, our algorithm yields the asymptotically optimal bounds of $O(1)$ rounds and $O(n)$ messages. This algorithm is then extended to one solving leader election on any connected non-bipartite $n$-node graph $G$ in $O(\tau(G))$ time and $O(\tau(G)\sqrt{n}\log^{3/2} n)$ messages, where $\tau(G)$ is the mixing time of a random walk on $G$. The above result implies highly efficient (sublinear running time and messages) leader election algorithms for networks with small mixing times, such as expanders and hypercubes. In contrast, previous leader election algorithms had at least linear message complexity even in complete graphs. Moreover, super-linear message lower bounds are known for time-efficient deterministic leader election algorithms. Finally, we present an almost matching lower bound for randomized leader election, showing that $\Omega(\sqrt{n})$ messages are needed for any leader election algorithm that succeeds with probability at least $1/e + \epsilon$, for any small constant $\epsilon > 0$. We view our results as a step towards understanding the randomized complexity of leader election in distributed networks.
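
For some intuition on how a constant-round election in a complete network might get by with roughly $\sqrt{n}$ (times polylogarithmic factors) messages, the following simulation sketches a birthday-paradox style scheme: a few random candidates each contact about $\sqrt{n \log n}$ random referees, and a candidate declares itself leader only if it holds the highest rank at every referee it contacted. The function name `elect_leader`, the constants, and the details of the scheme are illustrative assumptions, not the algorithm as specified in the paper.

```python
import math
import random


def elect_leader(n, rng):
    """Simulate one run on a complete n-node network.

    Returns (leaders, messages). All constants below are assumptions
    chosen for illustration only.
    """
    log_n = math.log(n)

    # Each node independently becomes a candidate with probability ~ 2 ln(n) / n,
    # so there are about 2 ln(n) candidates in expectation.
    candidates = [v for v in range(n) if rng.random() < 2 * log_n / n]

    # Random rank per candidate; the node id breaks ties.
    rank = {c: (rng.random(), c) for c in candidates}

    # Each candidate contacts ~ 2 sqrt(n ln n) referees, so any two candidates
    # share at least one referee with high probability (birthday paradox).
    num_referees = math.ceil(2 * math.sqrt(n * log_n))

    best_seen = {}   # referee -> highest rank it has received
    contacted = {}   # candidate -> referees it contacted
    messages = 0
    for c in candidates:
        referees = rng.sample(range(n), num_referees)
        contacted[c] = referees
        messages += len(referees)          # candidate -> referee messages
        for r in referees:
            if r not in best_seen or rank[c] > best_seen[r]:
                best_seen[r] = rank[c]

    leaders = []
    for c in candidates:
        messages += len(contacted[c])      # referee -> candidate replies
        # A candidate wins only if every referee it contacted ranks it highest.
        if all(best_seen[r] == rank[c] for r in contacted[c]):
            leaders.append(c)
    return leaders, messages
```

With these (assumed) constants, a run such as `elect_leader(1_000_000, random.Random(1))` is expected to use well under $n$ messages; for small $n$ the hidden constants dominate and the message count can exceed $n$.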

2013

- Sublinear Bounds for Randomized Leader Election

Shay Kutten, Gopal Pandurangan, David Peleg, Peter Robinson, Amitabh Trehan. 14th International Conference on Distributed Computing and Networking (ICDCN 2013). Best Paper Award.

Abstract: This paper concerns randomized leader election in synchronous distributed networks. A distributed leader election algorithm is presented for complete $n$-node networks that runs in $O(1)$ rounds and (with high probability) takes only $O(\sqrt{n}\log^{3/2}n)$ messages to elect a unique leader (with high probability). This algorithm is then extended to solve leader election on any connected non-bipartite $n$-node graph $G$ in $O(\tau(G))$ time and $O(\tau(G)\sqrt{n}\log^{3/2}n)$ messages, where $\tau(G)$ is the mixing time of a random walk on $G$. The above result implies highly efficient (sublinear running time and messages) leader election algorithms for networks with small mixing times, such as expanders and hypercubes. In contrast, previous leader election algorithms had at least linear message complexity even in complete graphs. Moreover, super-linear message lower bounds are known for time-efficient deterministic leader election algorithms. Finally, an almost-tight lower bound is presented for randomized leader election, showing that $\Omega(\sqrt{n})$ messages are needed for any $O(1)$ time leader election algorithm which succeeds with high probability. It is also shown that $\Omega(n^{1/3})$ messages are needed by any leader election algorithm that succeeds with high probability, regardless of the number of rounds. We view our results as a step towards understanding the randomized complexity of leader election in distributed networks.

- On the Complexity of Universal Leader Election

Shay Kutten, Gopal Pandurangan, David Peleg, Peter Robinson, Amitabh Trehan. 32nd ACM Symposium on Principles of Distributed Computing (PODC 2013).

Abstract: Electing a leader is a fundamental task in distributed computing. In its implicit version, only the leader must know who is the elected leader. This paper focuses on studying the message and time complexity of randomized implicit leader election in synchronous distributed networks. Surprisingly, the most "obvious" complexity bounds have not been proven for randomized algorithms. The "obvious" lower bounds of $\Omega(m)$ messages ($m$ is the number of edges in the network) and $\Omega(D)$ time ($D$ is the network diameter) are non-trivial to show for randomized (Monte Carlo) algorithms. (Recent results that show that even $\Omega(n)$ ($n$ is the number of nodes in the network) is not a lower bound on the messages in complete networks make the above bounds somewhat less obvious.) To the best of our knowledge, these basic lower bounds have not been established even for deterministic algorithms (except for the limited case of comparison algorithms, where it was also required that some nodes may not wake up spontaneously, and that $D$ and $n$ were not known). We establish these fundamental lower bounds in this paper for the general case, even for randomized Monte Carlo algorithms. Our lower bounds are universal in the sense that they hold for all universal algorithms (such algorithms should work for all graphs), apply to every $D$, $m$, and $n$, and hold even if $D$, $m$, and $n$ are known, all the nodes wake up simultaneously, and the algorithms can make any use of nodes' identities. To show that these bounds are tight, we present an $O(m)$ messages algorithm. An $O(D)$ time algorithm is known. An interesting fundamental problem is whether both upper bounds (messages and time) can be reached simultaneously in the randomized setting for all graphs. (The answer is known to be negative in the deterministic setting.) We answer this problem partially by presenting a randomized algorithm that matches both complexities in some cases. This already separates (for some cases) randomized algorithms from deterministic ones. As first steps towards the general case, we present several universal leader election algorithms with bounds that trade off messages versus time. We view our results as a step towards understanding the complexity of universal leader election in distributed networks.

## Code

I'm interested in parallel and distributed programming and related technologies such as software transactional memory. Below is a (non-comprehensive) list of software that I have written.

- I extended Haskell's Cabal to use a "world" file for keeping track of installed packages. (Now part of the main distribution.)
- data dispersal: an implementation of an (m,n)-threshold information dispersal scheme that is space-optimal.
- secret sharing: an implementation of a secret sharing scheme that provides information-theoretic security.
- tskiplist: a data structure with range-query support for software transactional memory.
- stm-io-hooks: An extension of Haskell's Software Transactional Memory (STM) monad with commit and retry IO hooks.
- Mathgenealogy: Visualize your (academic) genealogy! A program for extracting data from the Mathematics Genealogy project.
- In my master's thesis I developed a system for automatically constructing events out of log files produced by various system programs. One of the core components of my work was a part-of-speech (POS) tagger, which assigns word classes (e.g. noun, verb) to the previously parsed tokens of the log file. To cope with noisy input data, I modeled the POS tagger as a hidden Markov model. I developed (and proved the correctness of) a variant of the maximum likelihood estimation algorithm for training the Markov model and smoothing the state transition distributions.
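
As a rough illustration of what a secret sharing scheme with information-theoretic security looks like, here is a minimal sketch of Shamir's threshold scheme over a prime field. It is written in Python for brevity and is not the implementation from the package listed above; the names `split_secret` and `recover_secret` and the choice of field are assumptions made for this example.

```python
import secrets

# Field modulus: the Mersenne prime 2^127 - 1 (an arbitrary choice for this sketch).
PRIME = 2**127 - 1


def split_secret(secret, threshold, num_shares):
    """Split `secret` into `num_shares` shares; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must lie in [0, PRIME)")
    if not 1 <= threshold <= num_shares:
        raise ValueError("need 1 <= threshold <= num_shares")

    # Random polynomial of degree threshold - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def evaluate(x):
        # Horner's rule modulo PRIME.
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, num_shares + 1)]


def recover_secret(shares):
    """Lagrange-interpolate the polynomial at x = 0 from (x, y) shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # Modular inverse via Fermat's little theorem (PRIME is prime).
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret
```

For example, `shares = split_secret(123456789, 3, 5)` produces five shares, and `recover_secret` applied to any three of them returns the original value. Fewer than `threshold` shares are consistent with every possible secret in the field, which is the sense in which the scheme is information-theoretically secure.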

## Misc

- Program committee membership: BGP 2017, ICDCN 2016, SPAA 2016, SIROCCO 2016, ICDCN 2015, SIROCCO 2014, FOMC 2014.
- DBLP entry (Shows a subset of my publications.)
- Google scholar profile
- My profile on StackExchange