I'm interested in designing new distributed and parallel algorithms, distributed processing of large-scale data, and security in dynamic communication networks such as peer-to-peer networks and mobile ad-hoc networks.

News

Publications

2016 (7)
  • Fast Distributed Algorithms for Connectivity and MST in Large Graphs
    by Gopal Pandurangan, Peter Robinson, Michele Scquizzato.
    28th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2016). Asilomar State Beach, California.
    Abstract...
    Motivated by the increasing need to understand the algorithmic foundations of distributed large-scale graph computations, we study a number of fundamental graph problems in a message-passing model for distributed computing where $k \geq 2$ machines jointly perform computations on graphs with $n$ nodes (typically, $n \gg k$). The input graph is assumed to be initially randomly partitioned among the $k$ machines, a common implementation in many real-world systems. Communication is point-to-point, and the goal is to minimize the number of communication rounds of the computation. Our main result is an (almost) optimal distributed randomized algorithm for graph connectivity. Our algorithm runs in $\tilde{O}(n/k^2)$ rounds ($\tilde{O}$ notation hides a $\text{polylog}(n)$ factor and an additive $\text{polylog}(n)$ term). This improves over the best previously known bound of $\tilde{O}(n/k)$ [Klauck et al., SODA 2015], and is optimal (up to a polylogarithmic factor) in view of an existing lower bound of $\tilde{\Omega}(n/k^2)$. Our improved algorithm uses a bunch of techniques, including linear graph sketching, that prove useful in the design of efficient distributed graph algorithms. We then present fast randomized algorithms for computing minimum spanning trees, (approximate) min-cuts, and for many graph verification problems. All these algorithms take $\tilde{O}(n/k^2)$ rounds, and are optimal up to polylogarithmic factors. We also show an almost matching lower bound of $\tilde{\Omega}(n/k^2)$ for many graph verification problems using lower bounds in random-partition communication complexity.
  • Efficient Computation of Sparse Structures
    by David G. Harris, Ehab Morsy, Gopal Pandurangan, Peter Robinson, Aravind Srinivasan.
    Random Structures & Algorithms (RSA). Wiley.
    Abstract...
    Basic graph structures such as maximal independent sets (MIS's) have spurred much theoretical research in randomized and distributed algorithms, and have several applications in networking and distributed computing as well. However, the extant (distributed) algorithms for these problems do not necessarily guarantee fault-tolerance or load-balance properties. We propose and study ''low-average degree'' or ``sparse'' versions of such structures. Interestingly, in sharp contrast to, say, MIS's, it can be shown that checking whether a structure is sparse, will take substantial time. Nevertheless, we are able to develop good sequential/distributed (randomized) algorithms for such sparse versions. We also complement our algorithms with several lower bounds. Randomization plays a key role in our upper and lower bound results.
  • DEX: Self-Healing Expanders DOI
    by Gopal Pandurangan, Peter Robinson, Amitabh Trehan.
    Distributed Computing (DC). Springer.
    Abstract...
    We present a fully-distributed self-healing algorithm DEX, that maintains a constant degree expander network in a dynamic setting. To the best of our knowledge, our algorithm provides the first efficient distributed construction of expanders --- whose expansion properties hold deterministically --- that works even under an all-powerful adaptive adversary that controls the dynamic changes to the network (the adversary has unlimited computational power and knowledge of the entire network state, can decide which nodes join and leave and at what time, and knows the past random choices made by the algorithm). Previous distributed expander constructions typically provide only probabilistic guarantees on the network expansion which rapidly degrade in a dynamic setting; in particular, the expansion properties can degrade even more rapidly under adversarial insertions and deletions. Our algorithm provides efficient maintenance and incurs a low overhead per insertion/deletion by an adaptive adversary: only $O(\log n)$ rounds and $O(\log n)$ messages are needed with high probability ($n$ is the number of nodes currently in the network). The algorithm requires only a constant number of topology changes. Moreover, our algorithm allows for an efficient implementation and maintenance of a distributed hash table (DHT) on top of DEX, with only a constant additional overhead. Our results are a step towards implementing efficient self-healing networks that have guaranteed properties (constant bounded degree and expansion) despite dynamic changes.
  • Distributed Algorithmic Foundations of Dynamic Networks
    by John Augustine, Gopal Pandurangan, Peter Robinson.
    SIGACT News Distributed Computing Column 1/2016 (SIGACT).
  • Gossiping with Latencies
    by Seth Gilbert, Peter Robinson, Suman Sourav.
    (under review)
  • DConstructor: Network Construction with Polylogarithmic Overhead
    by Seth Gilbert, Gopal Pandurangan, Peter Robinson, Amitabh Trehan.
    (under review)
  • Tight Bounds for Distributed Graph Computations
    by Gopal Pandurangan, Peter Robinson, Michele Scquizzato.
    (under review)
2015 (5)
  • Enabling Efficient and Robust Distributed Computation in Highly Dynamic Networks DOI
    by John Augustine, Gopal Pandurangan, Peter Robinson, Scott Roche, Eli Upfal.
    56th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2015). Berkeley, California.
    Abstract...
    Motivated by the need for designing efficient and robust fully-distributed computation in highly dynamic networks such as Peer-to-Peer (P2P) networks, we study distributed protocols for constructing and maintaining dynamic network topologies with good expansion properties. Our goal is to maintain a sparse (bounded degree) expander topology despite heavy churn (i.e., nodes joining and leaving the network continuously over time). We assume that the churn is controlled by an adversary that has complete knowledge and control of what nodes join and leave and at what time and has unlimited computational power, but is oblivious to the random choices made by the algorithm. Our main contribution is a randomized distributed protocol that guarantees with high probability the maintenance of a constant degree graph with high expansion even under continuous high adversarial churn. Our protocol can tolerate a churn rate of up to $O(n/\text{polylog}(n))$ per round (where $n$ is the stable network size). Our protocol is efficient, lightweight, and scalable, and it incurs only $O(\text{polylog}(n))$ overhead for topology maintenance: only polylogarithmic (in $n$) bits needs to be processed and sent by each node per round and any node's computation cost per round is also polylogarithmic. The given protocol is a fundamental ingredient that is needed for the design of efficient fully-distributed algorithms for solving fundamental distributed computing problems such as agreement, leader election, search, and storage in highly dynamic P2P networks and enables fast and scalable algorithms for these problems that can tolerate a large amount of churn.
  • On the Complexity of Universal Leader Election PDF DOI
    by Shay Kutten, Gopal Pandurangan, David Peleg, Peter Robinson, Amitabh Trehan.
    Journal of the ACM, vol. 62(1), 7:1-7:27 (JACM).
    Abstract...
    Electing a leader is a fundamental task in distributed computing. In its implicit version, only the leader must know who is the elected leader. This paper focuses on studying the message and time complexity of randomized implicit leader election in synchronous distributed networks. Surprisingly, the most ''obvious'' complexity bounds have not been proven for randomized algorithms. The ``obvious'' lower bounds of $\Omega(m)$ messages ($m$ is the number of edges in the network) and $\Omega(D)$ time ($D$ is the network diameter) are non-trivial to show for randomized (Monte Carlo) algorithms. (Recent results that show that even $\Omega(n)$ ($n$ is the number of nodes in the network) is not a lower bound on the messages in complete networks, make the above bounds somewhat less obvious). To the best of our knowledge, these basic lower bounds have not been established even for deterministic algorithms (except for the limited case of comparison algorithms, where it was also required that some nodes may not wake up spontaneously, and that $D$ and $n$ were not known). We establish these fundamental lower bounds in this paper for the general case, even for randomized Monte Carlo algorithms. Our lower bounds are universal in the sense that they hold for all universal algorithms (such algorithms should work for all graphs), apply to every $D$, $m$, and $n$, and hold even if $D$, $m$, and $n$ are known, all the nodes wake up simultaneously, and the algorithms can make any use of node's identities. To show that these bounds are tight, we present an $O(m)$ messages algorithm. An $O(D)$ time algorithm is known. An interesting fundamental problem is whether both upper bounds (messages and time) can be reached simultaneously in the randomized setting for all graphs. (The answer is known to be negative in the deterministic setting). We answer this problem partially by presenting a randomized algorithm that matches both complexities in some cases. This already separates (for some cases) randomized algorithms from deterministic ones. As first steps towards the general case, we present several universal leader election algorithms with bounds that trade-off messages versus time. We view our results as a step towards understanding the complexity of universal leader election in distributed networks.
  • Distributed Computation of Large-scale Graph Problems PDF DOI
    by Hartmut Klauck, Danupon Nanongkai, Gopal Pandurangan, Peter Robinson.
    26th ACM-SIAM Symposium on Discrete Algorithms (SODA 2015). San Diego, California.
    Abstract...
    Motivated by the increasing need for fast distributed processing of large-scale graphs such as the Web graph and various social networks, we study a number of fundamental graph problems in the message-passing model, where we have $k$ machines that jointly perform computation on an arbitrary $n$-node (typically, $n \gg k$) input graph. The graph is assumed to be randomly partitioned among the $k \geq 2$ machines (a common implementation in many real world systems). The communication is point-to-point, and the goal is to minimize the time complexity, i.e., the number of communication rounds, of solving various fundamental graph problems. We present lower bounds that quantify the fundamental time limitations of distributively solving graph problems. We first show a lower bound of $\Omega(n/k)$ rounds for computing a spanning tree (ST) of the input graph. This result also implies the same bound for other fundamental problems such as computing a minimum spanning tree (MST), breadth-first tree (BFS), and shortest paths tree (SPT). We also show an $\Omega(n/k^2)$ lower bound for connectivity, ST verification and other related problems. Our lower bounds develop and use new bounds in random-partition communication complexity. To complement our lower bounds, we also give algorithms for various fundamental graph problems, e.g., PageRank, MST, connectivity, ST verification, shortest paths, cuts, spanners, covering problems, densest subgraph, subgraph isomorphism, finding triangles, etc. We show that problems such as PageRank, MST, connectivity, and graph covering can be solved in $\tilde{O}(n/k)$ time (the notation $\tilde O$ hides $\text{polylog}(n)$ factors and an additive $\text{polylog}(n)$ term); this shows that one can achieve almost linear (in $k$) speedup, whereas for shortest paths, we present algorithms that run in $\tilde{O}(n/\sqrt{k})$ time (for $(1+\epsilon)$-factor approximation) and in $\tilde{O}(n/k)$ time (for $O(\log n)$-factor approximation) respectively. Our results are a step towards understanding the complexity of distributively solving large-scale graph problems.
  • Fast Byzantine Leader Election in Dynamic Networks
    by John Augustine, Gopal Pandurangan, Peter Robinson.
    29th International Symposium on Distributed Computing (DISC 2015). Tokyo.
    Abstract...
    Motivated by robust, secure, and efficient distributed computation in Peer-to-Peer (P2P) networks, we study fundamental Byzantine problems in dynamic networks where the topology can change from round to round and nodes can also experience heavy churn (i.e., nodes can join and leave the network continuously over time). We assume the full information model where the Byzantine nodes have complete knowledge about the entire state of network at every round (including random choices made by all the nodes), have unbounded computational power and can deviate arbitrarily from the protocol. The churn is controlled by an adversary that has complete knowledge and control of what nodes join and leave and at what time and also may rewire the topology in every round and has unlimited computational power, but is oblivious to the random choices made by the algorithm. Byzantine protocols for fundamental distributed computing problems such as agreement and leader election have been studied extensively for the last three decades in static networks; however, these solutions do not work in dynamic networks which characterize many real-world networks such as P2P networks. Our main contribution is an $O(\log^3 n)$ round algorithm that achieves Byzantine leader election under the presence of up to $O({n}^{1/2 - \epsilon})$ Byzantine nodes (for a small constant $\epsilon > 0$) and a churn of up to $O(\sqrt{n}/\text{poly}\log(n))$ nodes per round (where $n$ is the stable network size). The algorithm elects a leader with probability at least $1-n^{-\Omega(1)}$ and guarantees that it is an honest node with probability at least $1-n^{-\Omega(1)}$; assuming the algorithm succeeds, the leader's identity will be known to a $1-o(1)$ fraction of the honest nodes. Our algorithm is fully-distributed, localized (does not require any global knowledge), lightweight, and is simple to implement. It is also scalable, as it runs in polylogarithmic time and requires nodes to send and receive messages of only polylogarithmic size per round. To the best of our knowledge, our algorithm is the first scalable solution for Byzantine leader election in a dynamic network with a high rate of churn; our protocol can also be used to solve Byzantine agreement in a straightforward way. We also show how to implement an (almost-everywhere) public coin with constant bias in a dynamic network with Byzantine nodes and provide a mechanism for enabling honest nodes to store information reliably in the network, which might be of independent interest. In decentralized and dynamic P2P systems where a substantial part of the network may be controlled by malicious nodes, the presented algorithm and techniques can serve as building blocks for designing robust and secure distributed protocols.
  • Gracefully Degrading Consensus and k-Set Agreement in Directed Dynamic Networks DOI
    by Martin Biely, Peter Robinson, Ulrich Schmid, Manfred Schwarz, Kyrill Winkler.
    2nd International Conference on Networked Systems (NETYS 2015).
    Abstract...
    We present the first consensus/k-set agreement algorithm for synchronous dynamic networks with unidirectional links, controlled by an omniscient message adversary, which automatically adapts to the actual network properties in a run: If the network is sufficiently well-connected, it solves consensus, while it degrades gracefully to general k-set agreement in less well-connected communication graphs. The actual number k of system-wide decision values is determined by the number of certain vertex-stable root components occurring in a run, which are strongly connected components without incoming links from outside. Related impossibility results reveal that our condition is reasonably close to the solvability border for k-set agreement.
2014 (5)
  • DEX: Self-Healing Expanders PDF DOI
    by Gopal Pandurangan, Peter Robinson, Amitabh Trehan.
    28th IEEE International Parallel Distributed Processing Symposium (IPDPS 2014). Phoenix, Arizona.
    Abstract...
    We present a fully-distributed self-healing algorithm DEX, that maintains a constant degree expander network in a dynamic setting. To the best of our knowledge, our algorithm provides the first efficient distributed construction of expanders --- whose expansion properties hold deterministically --- that works even under an all-powerful adaptive adversary that controls the dynamic changes to the network (the adversary has unlimited computational power and knowledge of the entire network state, can decide which nodes join and leave and at what time, and knows the past random choices made by the algorithm). Previous distributed expander constructions typically provide only probabilistic guarantees on the network expansion which rapidly degrade in a dynamic setting; in particular, the expansion properties can degrade even more rapidly under adversarial insertions and deletions. Our algorithm provides efficient maintenance and incurs a low overhead per insertion/deletion by an adaptive adversary: only $O(\log n)$ rounds and $O(\log n)$ messages are needed with high probability ($n$ is the number of nodes currently in the network). The algorithm requires only a constant number of topology changes. Moreover, our algorithm allows for an efficient implementation and maintenance of a distributed hash table (DHT) on top of DEX, with only a constant additional overhead. Our results are a step towards implementing efficient self-healing networks that have guaranteed properties (constant bounded degree and expansion) despite dynamic changes.
  • Distributed Symmetry Breaking in Hypergraphs PDF DOI
    by Shay Kutten, Danupon Nanongkai, Gopal Pandurangan, Peter Robinson.
    28th International Symposium on Distributed Computing (DISC 2014). Austin, Texas.
    Abstract...
    Fundamental local symmetry breaking problems such as Maximal Independent Set (MIS) and coloring have been recognized as important by the community, and studied extensively in (standard) graphs. In particular, fast (i.e., logarithmic run time) randomized algorithms are well-established for MIS and $\Delta +1$-coloring in both the LOCAL and CONGEST distributed computing models. On the other hand, comparatively much less is known on the complexity of distributed symmetry breaking in hypergraphs. In particular, a key question is whether a fast (randomized) algorithm for MIS exists for hypergraphs. In this paper, we study the distributed complexity of symmetry breaking in hypergraphs by presenting distributed randomized algorithms for a variety of fundamental problems under a natural distributed computing model for hypergraphs. We first show that MIS in hypergraphs (of arbitrary dimension) can be solved in $O(\log^2 n)$ rounds ($n$ is the number of nodes of the hypergraph) in the LOCAL model. We then present a key result of this paper --- an $O(\Delta^{\epsilon}\text{poly} \log n)$-round hypergraph MIS algorithm in the CONGEST model where $\Delta$ is the maximum node degree of the hypergraph and $\epsilon > 0$ is any arbitrarily small constant. To demonstrate the usefulness of hypergraph MIS, we present applications of our hypergraph algorithm to solving problems in (standard) graphs. In particular, the hypergraph MIS yields fast distributed algorithms for the balanced minimal dominating set problem (left open in Harris et al. [ICALP 2013]) and the minimal connected dominating set problem. We also present distributed algorithms for coloring, maximal matching, and maximal clique in hypergraphs. Our work shows that while some local symmetry breaking problems such as coloring can be solved in polylogarithmic rounds in both the LOCAL and CONGEST models, for many other hypergraph problems such as MIS, hitting set, and maximal clique, it remains challenging to obtain polylogarithmic time algorithms in the CONGEST model. This work is a step towards understanding this dichotomy in the complexity of hypergraph problems as well as using hypergraphs to design fast distributed algorithms for problems in (standard) graphs.
  • Optimal Bounds for Randomized Leader Election PDF DOI
    by Shay Kutten, Gopal Pandurangan, David Peleg, Peter Robinson, Amitabh Trehan.
    Special Issue of Theoretical Computer Science, Elsevier. (TCS).
    Abstract...
    This paper concerns randomized leader election in synchronous distributed networks. A distributed leader election algorithm is presented for complete $n$-node networks that runs in $O(1)$ rounds and (with high probability) uses only $O(\sqrt{n}\log^{3/2} n)$ messages to elect a unique leader (with high probability). When considering the ''explicit'' variant of leader election where eventually every node knows the identity of the leader, our algorithm yields the asymptotically optimal bounds of $O(1)$ rounds and $O(n)$ messages. This algorithm is then extended to one solving leader election on any connected non-bipartite $n$-node graph $G$ in $O(\tau(G))$ time and $O(\tau(G)\sqrt{n}\log^{3/2} n)$ messages, where $\tau(G)$ is the mixing time of a random walk on $G$. The above result implies highly efficient (sublinear running time and messages) leader election algorithms for networks with small mixing times, such as expanders and hypercubes. In contrast, previous leader election algorithms had at least linear message complexity even in complete graphs. Moreover, super-linear message lower bounds are known for time-efficient deterministic leader election algorithms. Finally, we present an almost matching lower bound for randomized leader election, showing that $\Omega(\sqrt{n})$ messages are needed for any leader election algorithm that succeeds with probability at least $1/e + \epsilon$, for any small constant $\epsilon > 0$. We view our results as a step towards understanding the randomized complexity of leader election in distributed networks.
  • Distributed Agreement in Dynamic Peer-to-Peer Networks PDF DOI
    by John Augustine, Gopal Pandurangan, Peter Robinson, Eli Upfal.
    Journal of Computer and System Sciences, Elsevier. (JCSS).
  • The Generalized Loneliness Detector and Weak System Models for k-Set Agreement PDF DOI
    by Martin Biely, Peter Robinson, Ulrich Schmid.
    IEEE Transactions on Parallel and Distributed Systems, vol. 25(4), 1078-1088 (IEEE TPDS).
    Abstract...
    This paper presents two weak partially synchronous system models MAnti[n-k] and MSink[n-k], which are just strong enough for solving $k$-set agreement: We introduce the generalized $(n-k)$-loneliness failure detector $\mathcal{L}(k)$, which we first prove to be sufficient for solving $k$-set agreement, and show that $\mathcal{L}(k)$ but not $\mathcal{L}(k-1)$ can be implemented in both models. MAnti[n-k] and MSink[n-k] are hence the first message passing models that lie between models where $\Omega$ (and therefore consensus) can be implemented and the purely asynchronous model. We also address $k$-set agreement in anonymous systems, that is, in systems where (unique) process identifiers are not available. Since our novel $k$-set agreement algorithm using $\mathcal{L}(k)$ also works in anonymous systems, it turns out that the loneliness failure detector $\mathcal{L}=\mathcal{L}(n-1)$ introduced by Delporte et al. is also the weakest failure detector for set agreement in anonymous systems. Finally, we analyze the relationship between $\mathcal{L}(k)$ and other failure detectors suitable for solving $k$-set agreement.
2013 (6)
  • Sublinear Bounds for Randomized Leader Election PDF DOI
    by Shay Kutten, Gopal Pandurangan, David Peleg, Peter Robinson, Amitabh Trehan.
    14th International Conference on Distributed Computing and Networking (ICDCN 2013). Mumbai. Best Paper Award.
    Abstract...
    This paper concerns randomized leader election in synchronous distributed networks. A distributed leader election algorithm is presented for complete n-node networks that runs in $O(1)$ rounds and (with high probability) takes only $O(\sqrt{n}\log^{3/2}n)$ messages to elect a unique leader (with high probability). This algorithm is then extended to solve leader election on any connected non-bipartite n-node graph $G$ in $O(\tau(G))$ time and $O(\tau(G)\sqrt{n}\log^{3/2}n)$ messages, where $\tau(G)$ is the mixing time of a random walk on $G$. The above result implies highly efficient (sublinear running time and messages) leader election algorithms for networks with small mixing times, such as expanders and hypercubes. In contrast, previous leader election algorithms had at least linear message complexity even in complete graphs. Moreover, super-linear message lower bounds are known for time-efficient deterministic leader election algorithms. Finally, an almost-tight lower bound is presented for randomized leader election, showing that $\Omega(\sqrt{n})$ messages are needed for any $O(1)$ time leader election algorithm which succeeds with high probability. It is also shown that $\Omega(n^{1/3})$ messages are needed by any leader election algorithm that succeeds with high probability, regardless of the number of the rounds. We view our results as a step towards understanding the randomized complexity of leader election in distributed networks.
  • Efficient Computation of Balanced Structures PDF DOI
    by David G. Harris, Ehab Morsy, Gopal Pandurangan, Peter Robinson, Aravind Srinivasan.
    40th International Colloquium on Automata, Languages and Programming (ICALP 2013). Riga.
    Abstract...
    Basic graph structures such as maximal independent sets (MIS’s) have spurred much theoretical research in distributed algorithms, and have several applications in networking and distributed computing as well. However, the extant (distributed) algorithms for these problems do not necessarily guarantee fault-tolerance or load-balance properties: For example, in a star-graph, the central vertex, as well as the set of leaves, are both MIS’s, with the latter being much more fault-tolerant and balanced - existing distributed algorithms do not handle this distinction. We propose and study "low-average degree" or "balanced" versions of such structures. Interestingly, in sharp contrast to, say, MIS’s, it can be shown that checking whether a structure is balanced, will take substantial time. Nevertheless, we are able to develop good sequential and distributed algorithms for such "balanced" versions. We also complement our algorithms with lower bounds.
  • On the Complexity of Universal Leader Election PDF DOI
    by Shay Kutten, Gopal Pandurangan, David Peleg, Peter Robinson, Amitabh Trehan.
    32nd ACM Symposium on Principles of Distributed Computing (PODC 2013). Montreal.
    Abstract...
    Electing a leader is a fundamental task in distributed computing. In its implicit version, only the leader must know who is the elected leader. This paper focuses on studying the message and time complexity of randomized implicit leader election in synchronous distributed networks. Surprisingly, the most ''obvious'' complexity bounds have not been proven for randomized algorithms. The ``obvious'' lower bounds of $\Omega(m)$ messages ($m$ is the number of edges in the network) and $\Omega(D)$ time ($D$ is the network diameter) are non-trivial to show for randomized (Monte Carlo) algorithms. (Recent results that show that even $\Omega(n)$ ($n$ is the number of nodes in the network) is not a lower bound on the messages in complete networks, make the above bounds somewhat less obvious). To the best of our knowledge, these basic lower bounds have not been established even for deterministic algorithms (except for the limited case of comparison algorithms, where it was also required that some nodes may not wake up spontaneously, and that $D$ and $n$ were not known). We establish these fundamental lower bounds in this paper for the general case, even for randomized Monte Carlo algorithms. Our lower bounds are universal in the sense that they hold for all universal algorithms (such algorithms should work for all graphs), apply to every $D$, $m$, and $n$, and hold even if $D$, $m$, and $n$ are known, all the nodes wake up simultaneously, and the algorithms can make any use of node's identities. To show that these bounds are tight, we present an $O(m)$ messages algorithm. An $O(D)$ time algorithm is known. An interesting fundamental problem is whether both upper bounds (messages and time) can be reached simultaneously in the randomized setting for all graphs. (The answer is known to be negative in the deterministic setting). We answer this problem partially by presenting a randomized algorithm that matches both complexities in some cases. This already separates (for some cases) randomized algorithms from deterministic ones. As first steps towards the general case, we present several universal leader election algorithms with bounds that trade-off messages versus time. We view our results as a step towards understanding the complexity of universal leader election in distributed networks.
  • Fast Byzantine Agreement in Dynamic Networks PDF DOI
    by John Augustine, Gopal Pandurangan, Peter Robinson
    32nd ACM Symposium on Principles of Distributed Computing (PODC 2013). Montreal.
    Abstract...
    We study Byzantine agreement in dynamic networks where topology can change from round to round and nodes can also experience heavy churn (i.e., nodes can join and leave the network continuously over time). Our main contributions are randomized distributed algorithms that guarantee almost-everywhere Byzantine agreement with high probability even under a large number of Byzantine nodes and continuous adversarial churn in a number of rounds that is polylogarithmic in $n$ (where $n$ is the stable network size). We show that our algorithms are essentially optimal (up to polylogarithmic factors) with respect to the amount of Byzantine nodes and churn rate that they can tolerate by showing lower bound. In particular, we present the following results: \begin{enumerate} \item An $O(\log^3 n)$ round randomized algorithm that achieves almost-everywhere Byzantine agreement with high probability under a presence of up to $O(\sqrt{n}/\text{polylog}(n))$ Byzantine nodes and up to a churn of $O(\sqrt{n}/\text{polylog}(n))$ nodes per round. We assume that the Byzantine nodes have knowledge about the entire state of network at every round (including random choices made by all the nodes) and can behave arbitrarily. We also assume that an adversary controls the churn --- it has complete knowledge and control of what nodes join and leave and at what time and has unlimited computational power (but is oblivious to the topology changes from round to round). Our algorithm requires only polylogarithmic in $n$ bits to be processed and sent (per round) by each node. \item We also present an $O(\log^3 n)$ round randomized algorithm that has same guarantees as the above algorithm, but works even when the churn and network topology is controlled by an adaptive adversary (that can choose the topology based on the current states of the nodes). However, this algorithm requires up to polynomial in $n$ bits to be processed and sent (per round) by each node. \item We show that the above bounds are essentially the best possible, if one wants fast (i.e., polylogarithmic run time) algorithms, by showing that any (randomized) algorithm to achieve agreement in a dynamic network controlled by an adversary that can churn up to $\Theta(\sqrt{ n \log n})$ nodes per round should take at least a polynomial number of rounds. \end{enumerate} Our algorithms are the first-known, fully-distributed, Byzantine agreement algorithms in highly dynamic networks. We view our results as a step towards understanding the possibilities and limitations of highly dynamic networks that are subject to malicious behavior by a large number of nodes.
  • Search and Storage in Dynamic Peer-to-Peer Networks PDF DOI
    by John Augustine, Anisur Molla, Ehab Morsy, Gopal Pandurangan, Peter Robinson, Eli Upfal.
    25th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA 2013). Montreal.
    Abstract...
    We study robust and efficient distributed algorithms for searching, storing, and maintaining data in dynamic Peer-to-Peer (P2P) networks. P2P networks are highly dynamic networks that experience heavy node churn (i.e., nodes join and leave the network continuously over time). Our goal is to guarantee, despite high node churn rate, that a large number of nodes in the network can store, retrieve, and maintain a large number of data items. Our main contributions are fast randomized distributed algorithms that guarantee the above with high probability even under high adversarial churn. In particular, we present the following main results: \begin{enumerate} \item A randomized distributed search algorithm that with high probability guarantees that searches from as many as $n - o(n)$ nodes ($n$ is the stable network size) succeed in ${O}(\log n )$-rounds despite ${O}(n/\log^{1+\delta} n)$ churn, for any small constant $\delta > 0$, per round. We assume that the churn is controlled by an oblivious adversary (that has complete knowledge and control of what nodes join and leave and at what time and has unlimited computational power, but is oblivious to the random choices made by the algorithm). \item A storage and maintenance algorithm that guarantees, with high probability, data items can be efficiently stored (with only $\Theta(\log{n})$ copies of each data item) and maintained in a dynamic P2P network with churn rate up to ${O}(n/\log^{1+\delta} n)$ per round. Our search algorithm together with our storage and maintenance algorithm guarantees that as many as $n - o(n)$ nodes can efficiently store, maintain, and search even under ${O}(n/\log^{1+\delta} n)$ churn per round. Our algorithms require only polylogarithmic in $n$ bits to be processed and sent (per round) by each node. \end{enumerate} To the best of our knowledge, our algorithms are the first-known, fully-distributed storage and search algorithms that provably work under highly dynamic settings (i.e., high churn rates per step). Furthermore, they are localized (i.e., do not require any global topological knowledge) and scalable. A technical contribution of this paper, which may be of independent interest, is showing how random walks can be provably used to derive scalable distributed algorithms in dynamic networks with adversarial node churn.
  • Robust Leader Election in a Fast-Changing World
    by John Augustine, Tejas Kulkarni, Paresh Nakhe, Peter Robinson.
    9th International Workshop on Foundations of Mobile Computing (FOMC 2013).
    Abstract...
    We consider the problem of electing a leader among nodes in a highly dynamic network where the adversary has unbounded capacity to insert and remove nodes (including the leader) from the network and change connectivity at will. We present a randomized algorithm that (re)elects a leader in $O(D\log n)$ rounds with high probability, where $D$ is a bound on the dynamic diameter of the network and $n$ is the maximum number of nodes in the network at any point in time. We assume a model of broadcast-based communication where a node can send only $1$ message of $O(\log n)$ bits per round and is not aware of the receivers in advance. Thus our results also apply to mobile wireless ad-hoc networks, improving over the optimal (for deterministic algorithms) $O(Dn)$ solution presented at FOMC 2011. We show that our algorithm is optimal by proving that any randomized algorithm takes at least $\Omega(D\log n)$ rounds to elect a leader with high probability, which shows that our algorithm yields the best possible (up to constants) termination time.
2012 (2)
  • Towards Robust and Efficient Computation in Dynamic Peer-to-Peer Networks PDF DOI
    by John Augustine, Gopal Pandurangan, Peter Robinson, Eli Upfal.
    23rd ACM-SIAM Symposium on Discrete Algorithms (SODA 2012). Kyoto.
    Abstract...
    Motivated by the need for robust and fast distributed computation in highly dynamic Peer-to-Peer (P2P) networks, we study algorithms for the fundamental distributed agreement problem. P2P networks are highly dynamic networks that experience heavy node churn (i.e., nodes join and leave the network continuously over time). Our goal is to design fast algorithms (running in a small number of rounds) that guarantee, despite high node churn rate, that almost all nodes reach a stable agreement. Our main contributions are randomized distributed algorithms that guarantee stable almost-everywhere agreement with high probability even under high adversarial churn in a polylogarithmic number of rounds. In particular, we present the following results: \begin{enumerate} \item An $O(\log^2 n)$-round ($n$ is the stable network size) randomized algorithm that achieves almost-everywhere agreement with high probability under up to linear churn per round (i.e., $\epsilon n$, for some small constant $\epsilon > 0$), assuming that the churn is controlled by an oblivious adversary (that has complete knowledge and control of what nodes join and leave and at what time and has unlimited computational power, but is oblivious to the random choices made by the algorithm). \item An $O(\log m\log^3 n)$-round randomized algorithm that achieves almost-everywhere agreement with high probability under up to $\epsilon \sqrt{n}$ churn per round (for some small $\epsilon > 0$), where $m$ is the size of the input value domain, that works even under an adaptive adversary (that also knows the past random choices made by the algorithm). \item We also show that no deterministic algorithm can guarantee almost-everywhere agreement (regardless of the number of rounds), even under constant churn rate. \end{enumerate} Our algorithms are the first-known, fully-distributed, agreement algorithms that work under highly dynamic settings (i.e., high churn rates per step). Furthermore, they are localized (i.e., do not require any global topological knowledge), simple, and easy to implement. These algorithms can serve as building blocks for implementing other non-trivial distributed computing tasks in dynamic P2P networks.
  • Agreement in Directed Dynamic Networks PDF DOI
    by Martin Biely, Peter Robinson, Ulrich Schmid.
    19th International Colloquium on Structural Information and Communication Complexity (SIROCCO 2012). Reykjavík.
    Abstract...
    We study distributed computation in synchronous dynamic networks where an omniscient adversary controls the unidirectional communication links. Its behavior is modeled as a sequence of directed graphs representing the active (i.e. timely) communication links per round. We prove that consensus is impossible under some natural weak connectivity assumptions and introduce vertex-stable root components as a means for circumventing this impossibility. Essentially, we assume that there is a short period of time during which an arbitrary part of the network remains strongly connected, while its interconnect topology may keep changing continuously. We present a consensus algorithm that works under this assumption and prove its correctness. Our algorithm maintains a local estimate of the communication graphs and applies techniques for detecting stable network properties and univalent system configurations. Our possibility results are complemented by several impossibility results and lower bounds for consensus and other distributed computing problems like leader election, revealing that our algorithm is asymptotically optimal.
2011 (5)
  • Weak System Models for Fault-Tolerant Distributed Agreement Problems
    by Peter Robinson.
    PhD Thesis in Computer Science.
    Abstract...
    This thesis investigates various aspects of weak system models for agreement problems in fault-tolerant distributed computing. In Part~I we provide an introduction to the context of this work, discuss related literature and describe the basic system assumptions. In Part~II of this thesis, we introduce the Asynchronous Bounded-Cycle (ABC) model, which is entirely time-free. In contrast to existing system models, the ABC model does not require explicit time-based synchrony bounds, but rather stipulates a graph-theoretic synchrony condition on the relative lengths of certain causal chains of messages in the space-time graph of a run. We compare the ABC model to other models in literature, in particular to the classic models by Dwork, Lynch, and Stockmeyer. Despite Byzantine failures, we show how to simulate lock-step rounds, and therefore make consensus solvable, and prove the correctness of a clock synchronization algorithm in the ABC model. We then present the technically most involved result of this thesis: We prove that any algorithm working correctly in the partially synchronous $\Theta$-Model by Le Lann and Schmid, also works correctly in the time-free ABC model. In the proof, we use a variant of Farkas' Theorem of Linear Inequalities and develop a non-standard cycle space on directed graphs in order to guarantee the existence of a certain message delay transformation for finite prefixes of runs. This shows that any time-free safety property satisfied by an algorithm in the $\Theta$-Model also holds in the ABC model. By employing methods from point-set topology, we can extend this result to liveness properties. In Part~III, we shift our attention to the borderland between models where consensus is solvable and the purely asynchronous model. To this end, we look at the $k$-set agreement problem where processes need to decide on at most $k$ distinct decision values. We introduce two very weak system models MAnti and MSink and prove that consensus is impossible in these models. Nevertheless, we show that $(n-1)$-set agreement is solvable in MAnti and MSink, by providing algorithms that implement the weakest failure detector $\mathcal{L}$. We also discuss how models MAnti and MSink relate to the $f$-source models by Aguilera et al. for solving consensus. In the subsequent chapter, we present a novel failure detector $\mathcal{L}(k)$ that generalizes $\mathcal{L}$, and analyze an algorithm for solving $k$-set agreement with $\mathcal{L}(k)$, which works even in systems without unique process identifiers. Moreover, We explore the relationship between $\mathcal{L}(k)$ and existing failure detectors for $k$-set agreement. Some aspects of $\mathcal{L}(k)$ relating to anonymous systems are also discussed. Next, we present a generic theorem that can be used to characterize the impossibility of achieving $k$-set agreement in various system models. This enables us to show that $(\Sigma_k,\Omega_k)$ is not sufficient for solving $k$-set agreement. Furthermore, we instantiate our theorem with a partially synchronous system model. Finally, we consider the $k$-set agreement problem in round-based systems. First, we introduce a novel abstraction that encapsulates the perpetual synchrony of a run, the so called stable skeleton graph, which allows us to express the solvability power of a system via graph-theoretic properties. We then present an approximation algorithm where processes output an estimate of their respective component of the stable skeleton graph. We define a class of communication predicates PSources(k) in this framework, and show that PSources(k) tightly captures the amount of synchrony necessary for $k$-set agreement, as $(k-1)$-set agreement is impossible with PSources(k). Based on the stable skeleton approximation, we present an algorithm that solves $k$-set agreement when PSources(k) holds.
  • Optimal Regional Consecutive Leader Election in Mobile Ad-Hoc Networks PDF DOI
    by Hyun Chul Chung, Peter Robinson, Jennifer L. Welch.
    7th ACM SIGACT/SIGMOBILE International Workshop on Foundations of Mobile Computing (part of FCRC 2011). San Jose.
    Abstract...
    The regional consecutive leader election (RCLE) problem requires mobile nodes to elect a leader within bounded time upon entering a specific region. We prove that any algorithm requires $\Omega(Dn)$ rounds for leader election, where D is the diameter of the network and $n$ is the total number of nodes. We then present a fault-tolerant distributed algorithm that solves the RCLE problem and works even in settings where nodes do not have access to synchronized clocks. Since nodes set their leader variable within $O(Dn)$ rounds, our algorithm is asymptotically optimal with respect to time complexity. Due to its low message bit complexity, we believe that our algorithm is of practical interest for mobile wireless ad-hoc networks. Finally, we present a novel and intuitive constraint on mobility that guarantees a bounded communication diameter among nodes within the region of interest.
  • Solving k-Set Agreement with Stable Skeleton Graphs PDF DOI
    by Martin Biely, Peter Robinson, Ulrich Schmid.
    16th IEEE International Symposium on Parallel and Distributed Processing Workshops and PhD Forum (IPDPS 2011). Anchorage.
  • Easy Impossibility Proofs for k-Set Agreement in Message Passing Systems PDF DOI
    by Martin Biely, Peter Robinson, Ulrich Schmid.
    15th International Conference On Principles Of Distributed Systems (OPODIS 2011). Toulouse.
    Abstract...
    Despite of being quite similar agreement problems, distributed consensus ($1$-set agreement) and general $k$-set agreement require surprisingly different techniques for proving their impossibility in asynchronous systems with crash failures: Rather, than the relatively simple bivalence arguments as in the impossibility proof for consensus in the presence of a single crash failure, known proofs for the impossibility of $k$-set agreement in shared memory systems with $f\geq k>1$ crash failures use algebraic topology or a variant of Sperner's Lemma. In this paper, we present a generic theorem for proving the impossibility of $k$-set agreement in various message passing settings, which is based on a reduction to the consensus impossibility in a certain subsystem resulting from a partitioning argument. We demonstrate the broad applicability of our result by exploring the possibility/impossibility border of $k$-set agreement in several message passing system models: (i) asynchronous systems with crash failures, (ii) partially synchronous processes with (initial) crash failures, and, most importantly, (iii) asynchronous systems augmented with failure detectors. Furthermore, by extending the algorithm for initial crashes of Fisher, Lynch and Patterson (1985) to general $k$-set agreement, we show that the impossibility border of (i) is tightly matched. The impossibility proofs in cases (i), (ii), and (iii) are instantiations of our main theorem. Regarding (iii), applying our technique reveals the exact border for the parameter $k$ where $k$-set agreement is solvable with the failure detector class $(\Sigma_k,\Omega_k)_{1\le k\le n-1}$ of Bonnet and Raynal. As $\Sigma_k$ was shown to be necessary for solving $k$-set agreement, this result yields new insights on the quest for the weakest failure detector
  • The Asynchronous Bounded-Cycle Model PDF DOI
    by Peter Robinson and Ulrich Schmid.
    Theoretical Computer Science 412 (2011) 5580–5601. (TCS).
    Abstract...
    This paper shows how synchrony conditions can be added to the purely asynchronous model in a way that avoids any reference to message delays and computing step times, as well as any global constraints on communication patterns and network topology. Our Asynchronous Bounded-Cycle (ABC) model just bounds the ratio of the number of forward- and backward-oriented messages in certain ''relevant'' cycles in the space-time diagram of an asynchronous execution. We show that clock synchronization and lock-step rounds can easily be implemented and proved correct in the ABC model, even in the presence of Byzantine failures. Furthermore, we prove that any algorithm working correctly in the partially synchronous $\Theta$-Model also works correctly in the ABC model. In our proof, we first apply a novel method for assigning certain message delays to asynchronous executions, which is based on a variant of Farkas' theorem of linear inequalities and a non-standard cycle-space of graphs. Using methods from point-set topology, we then prove that the existence of this delay assignment implies model indistinguishability for time-free safety and liveness properties. Finally, we introduce several weaker variants of the ABC model and relate our model to the existing partially synchronous system models, in particular, the classic models of Dwork, Lynch and Stockmayer. Furthermore, we discuss aspects of the ABC model's applicability in real systems, in particular, in the context of VLSI Systems-on-Chip.
2010 (1)
  • Regional Consecutive Leader Election in Mobile Ad-Hoc Networks
    by Hyun Chul Chung, Peter Robinson, Jennifer L. Welch.
    6th ACM SIGACT/SIGMOBILE Workshop on Foundations of Mobile Computing (DIALM-POMC 2010). Boston.
2009 (1)
  • Weak Synchrony Models and Failure Detectors for Message Passing k-Set Agreement DOI
    by Martin Biely, Peter Robinson, Ulrich Schmid.
    13th International Conference On Principles Of Distributed Systems (OPODIS 2009). Nimes.
2008 (1)
  • The Asynchronous Bounded-Cycle Model DOI
    by Peter Robinson and Ulrich Schmid.
    10th International Symposium on Stabilization, Safety, and Security of Distributed Systems (SSS 2008). Detroit. Best Paper Award.
    Abstract...
    This paper shows how synchrony conditions can be added to the purely asynchronous model in a way that avoids any reference to message delays and computing step times, as well as any global constraints on communication patterns and network topology. Our Asynchronous Bounded-Cycle (ABC) model just bounds the ratio of the number of forward- and backward-oriented messages in certain ''relevant'' cycles in the space-time diagram of an asynchronous execution. We show that clock synchronization and lock-step rounds can easily be implemented and proved correct in the ABC model, even in the presence of Byzantine failures. Furthermore, we prove that any algorithm working correctly in the partially synchronous $\Theta$-Model also works correctly in the ABC model. In our proof, we first apply a novel method for assigning certain message delays to asynchronous executions, which is based on a variant of Farkas' theorem of linear inequalities and a non-standard cycle-space of graphs. Using methods from point-set topology, we then prove that the existence of this delay assignment implies model indistinguishability for time-free safety and liveness properties. Finally, we introduce several weaker variants of the ABC model and relate our model to the existing partially synchronous system models, in particular, the classic models of Dwork, Lynch and Stockmayer. Furthermore, we discuss aspects of the ABC model's applicability in real systems, in particular, in the context of VLSI Systems-on-Chip.
2006 (1)
  • Log File Processing by Machine Learning and Information Extraction
    by Peter Robinson.
    Master Thesis. TU Vienna, Institute of Computer Languages, 2006. Nominated for Distinguished Young Alumnus Award.
    Abstract...
    In today's computer network systems lots of events are constantly written to log files. Unfortunately there is no common standard defining the structure of these event messages which are partly in human readable natural language form. Obviously, this lack of structure makes automatic processing a lot more difficult. This master thesis describes the architecture and implementation of the LoP-System, a system that attempts to create machine readable event structures from ordinary log file events by natural language processing. The thesis explains implementational details as well as the theoretical concepts used. The core of the system consists of a series of cascaded but independent components, partly enhanced with machine learning techniques. The raw input is first processed by a simple recursive descent parser which recognizes syntactical features (e.g. IP addresses) and is then passed on to a part-of-speech tagger based on a hidden Markov model. Applying regular expression patterns to the tagged words is used to combine them to basic word groups (e.g. noun groups), which are subsequently semantically analyzed. The final step is the construction of the output events by a rule based event constructor. All components are implemented in Haskell, a purely functional programming language. Some of the components developed during this thesis, especially the part-of-speech tagger, are general natural language processing tools and can be applied to other domains.

Code

I'm interested in parallel and distributed programming and related technologies such as software transactional memory. My language of choice is Haskell, a purely functional programming language with outstanding support for parallelism. Below is a (non-comprehensive) list of software that I have written.
  • I extended Cabal, for using a "world" file to keep track of installed packages. (Now part of the main distribution.)
  • data dispersal: an implementation of an (m,n)-threshold information dispersal scheme that is space-optimal.
  • secret sharing: an implementation of a secret sharing scheme that provides information-theoretic security.
  • dice-entropy: a library that provides cryptographically secure dice rolls implemented by bit-efficient rejection sampling.
  • TSkipList: a data structure with range-query support for software transactional memory.
  • stm-io-hooks: An extension of Haskell's Software Transactional Memory (STM) monad with commit and retry IO hooks.
  • Mathgenealogy: Visualize your (academic) genealogy! A program for extracting data from the Mathematics Genealogy project.
  • In my master thesis I developed a system for automatically constructing events out of log files produced by various system programs. One of the core components of my work was a part-of-speech (POS) tagger, which assigns word classes (e.g. noun, verb) to the previously parsed tokens of the log file. To cope with noisy input data, I modeled the POS tagger as a hidden Markov model. I developed (and proved the correctness of) a variant of the maximum likelihood estimation algorithm for training the Markov model and smoothing the state transition distributions.

Misc

  • Conferences at which you might have seen me: PODC 2008 (Toronto, Canada); SSS 2008 (Detroit, USA); OPODIS 2009 (Nimes, France); ALGOSENSORS 2010 (Bordeaux, France); DISC 2010; (Boston, USA) IPDPS 2011 (Anchorage, USA); FOMC 2011 (San Jose, USA); SODA 2012 (Kyoto, Japan); SIROCCO 2012 (Reykjavik, Iceland); ICDCN 2013 (Mumbai, India); ICALP 2013 (Riga, Latvia); SPAA 2013 (Montreal, Canada); PODC 2013 (Montreal, Canada); Shonan Workshop (Shonan Village, Japan); DISC 2015 (Tokyo, Japan).
  • Program committee membership: ICDCN 2016, SPAA 2016, SIROCCO 2016, ICDCN 2015, SIROCCO 2014, FOMC 2014
  • DBLP entry.
  • Google Scholar profile.
  • Profile on StackExchange.