Deadlock Models in Distributed Computation: Foundations, Design, and Computational Complexity∗

Valmir Carneiro Barbosa
COPPE, Universidade Federal do Rio de Janeiro
Rio de Janeiro, Brazil
[email protected]

Alan Diêgo A. Carneiro
Instituto de Computação, Universidade Federal Fluminense
Niterói, Brazil
[email protected]

Fábio Protti
Instituto de Computação, Universidade Federal Fluminense
Niterói, Brazil
[email protected]

Uéverton S. Souza
Instituto de Computação, Universidade Federal Fluminense
Niterói, Brazil
[email protected]

ABSTRACT
Distributed systems consist of a set of independent processors interconnected by a communication network that supports resource sharing. A deadlock occurs in a distributed system when a group of processes waits indefinitely for resources from each other. Distributed systems are usually represented by wait-for graphs, where the behavior of a process is determined by a deadlock model. In this paper, we revisit deadlock model concepts, and present a new deadlock model as a simpler alternative to the And/Or model. Using also computational complexity and circuit complexity aspects, we provide a novel analysis of the hierarchy of classical deadlock models, where we identify how expressive each model is from the point of view of polynomial computations. Finally, we present a generic graph structure to characterize deadlock situations.

CCS Concepts
•Theory of computation → Data structures design and analysis; Circuit complexity; •Software and its engineering → Deadlocks;

Keywords
And/Or graph, computational complexity, deadlock, distributed system, wait-for graph

∗Research partially supported by Brazilian agency CNPq.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
SAC 2016, April 04 - 08, 2016, Pisa, Italy
Copyright is held by the owner/author(s). Publication rights licensed to ACM.
ACM 978-1-4503-3739-7/16/04…$15.00
DOI: http://dx.doi.org/10.1145/2851613.2851880

1. INTRODUCTION
A deadlock situation is characterized by the permanent impossibility for a group of processes to progress with their tasks due to the occurrence of a condition that prevents at least one needed resource from being granted to each of the processes in that group [3]. Let $V$ denote a set of processes in a distributed computation. Informally, as described in [2], a deadlock is said to exist in this computation if a subset $S \subseteq V$ can be identified whose members are all blocked due to the occurrence of some condition that can only be relieved by members of the same subset $S$. Note that a necessary condition for the existence of a deadlock in a computation is the existence of cycles of dependency.

A useful abstraction to analyse deadlock situations is the wait-for graph $G = (V, E)$, where $E$ is a set of directed edges such that an edge exists in $E$ directed away from $v_i \in V$ towards $v_j \in V$ if $v_i$ is blocked for some condition that $v_j$ may relieve [1]. The graph $G$ changes dynamically as the computation progresses, and what determines the evolution of $G$ by allowing for changes in the set of its directed edges is the deadlock model that holds for the computation [2]. In essence, what a deadlock model does is to specify rules for nodes that are not sink nodes in $G$ to become sink nodes. (A sink is a node with out-degree zero.) As deadlocks are stable properties once they take hold of a group of processes, only the external intervention that eventually follows detection may break them. Whenever we refer to $G$ we mean the wait-for graph that corresponds to a "snapshot" of the distributed computation in the usual sense of a consistent global state [1, 4].

Additional concepts and notation. For $v_i \in V$, let $O_i \subseteq D_i$ be the set of immediate descendants of $v_i$ in $G$ (descendants that are one edge away from $v_i$) and $I_i \subseteq A_i$ its set of immediate ancestors in $G$ (ancestors that are one edge away from $v_i$), where $D_i$ and $A_i$ denote the sets of all descendants and all ancestors of $v_i$, respectively.
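The notation above can be made concrete with a small sketch. The class below is an illustration only: the name `WaitForGraph` and its methods are hypothetical, and $D_i$ is taken to be the set of all nodes reachable from $v_i$, which is an assumption about the notation rather than something the text spells out.

```python
from collections import defaultdict


class WaitForGraph:
    """Snapshot of a wait-for graph: an edge (u, v) means that u is
    blocked on a condition that v may relieve."""

    def __init__(self, edges):
        self.nodes = set()
        self.out = defaultdict(set)  # out[v] plays the role of O_i
        self.inc = defaultdict(set)  # inc[v] plays the role of I_i
        for u, v in edges:
            self.nodes.update((u, v))
            self.out[u].add(v)
            self.inc[v].add(u)

    def sinks(self):
        """Nodes with out-degree zero."""
        return {v for v in self.nodes if not self.out.get(v)}

    def descendants(self, v):
        """Assumed reading of D_i: nodes reachable from v by a
        directed path of length at least one."""
        seen, stack = set(), list(self.out.get(v, ()))
        while stack:
            u = stack.pop()
            if u not in seen:
                seen.add(u)
                stack.extend(self.out.get(u, ()))
        return seen


# Hypothetical snapshot: a waits on b and d; b and c wait on each other.
g = WaitForGraph([("a", "b"), ("b", "c"), ("c", "b"), ("a", "d")])
```

In this snapshot `d` is the only sink, and the cycle between `b` and `c` is the kind of dependency cycle that is necessary, though not by itself sufficient, for a deadlock.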

2. HIERARCHY OF DEADLOCK MODELS
The main deadlock models investigated in the literature are described below.

AND model – a process $v_i$ can only become a sink when its wait state is relieved by all processes in $O_i$ [5, 6].

OR model – to exit from its wait state, it suffices for a process $v_i$ to be relieved by one of the processes in $O_i$ [5, 6, 7].

X-Out-Of-Y model – there are two integers $x_i$ and $y_i$ associated with each process $v_i$. Also, $y_i = |O_i|$, meaning that process $v_i$ is, in principle, waiting for communication from every process in $O_i$. However, in order to be relieved from its wait condition, it suffices that such communication arrives from any $x_i$ of those $y_i$ processes [8].

AND-OR model – there are $t_i \ge 1$ subsets of $O_i$ associated with each process $v_i$. These subsets are denoted by $O_i^1, \dots, O_i^{t_i}$ and must be such that $O_i = O_i^1 \cup \dots \cup O_i^{t_i}$. In order for process $v_i$ to be relieved from its wait condition, it must receive grant messages from all the processes in at least one of $O_i^1, \dots, O_i^{t_i}$. For this reason, these $t_i$ subsets of $O_i$ are assumed to be such that none is contained in another [6, 2].

Disjunctive X-Out-Of-Y model – there are $u_i \ge 1$ pairs of integers, denoted by $(x_i^1, y_i^1), \dots, (x_i^{u_i}, y_i^{u_i})$, associated with each process $v_i$. These integers are such that $y_i^1 = |Q_i^1|, \dots, y_i^{u_i} = |Q_i^{u_i}|$, where $Q_i^1, \dots, Q_i^{u_i}$ are subsets of $O_i$ such that $O_i = Q_i^1 \cup \dots \cup Q_i^{u_i}$. In order to be relieved from its wait condition, $v_i$ must be granted access to shared resources by either $x_i^1$ of the $y_i^1$ processes in $Q_i^1$, or $x_i^2$ of the $y_i^2$ processes in $Q_i^2$, and so on. Of course, it makes no sense for $Q_i', Q_i'' \in \{Q_i^1, \dots, Q_i^{u_i}\}$ to exist such that $Q_i' \subseteq Q_i''$ and $x_i' \ge x_i''$, which is then assumed not to be the case [5].

In the literature, these five models are generally classified in a hierarchy in which a model generalizes the previous one in the sense that it contains as special cases all the possible wait conditions of the other. For example, the X-Out-Of-Y model generalizes the AND model with $x_i = y_i$ and the OR model with $x_i = 1$ for all $v_i \in V$. Several works claim that the AND-OR model is more general than the X-Out-Of-Y model, while the converse is not true. These works assume that the AND-OR model expresses a general X-Out-Of-Y condition by taking, for all $v_i \in V$, $t_i = \binom{y_i}{x_i}$ and $|O_i^1| = \dots = |O_i^{t_i}| = x_i$, while the converse is not known. To conclude this introduction, we remark that the AND-OR model and the Disjunctive X-Out-Of-Y model are considered equivalent to each other in the literature.
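As an illustration of the wait conditions just described, the sketch below writes each one as a predicate over the set of processes that have already sent a grant, and spells out the classical expansion of an X-Out-Of-Y condition into AND-OR subsets. All function names are hypothetical, and the predicates assume a non-sink process (non-empty $O_i$).

```python
from itertools import combinations


def and_relieved(out, granted):
    """AND model: every process in O_i has granted."""
    return out <= granted


def or_relieved(out, granted):
    """OR model: at least one process in O_i has granted."""
    return bool(out & granted)


def x_out_of_y_relieved(out, x, granted):
    """X-Out-Of-Y model: at least x of the y = |O_i| processes granted."""
    return len(out & granted) >= x


def and_or_relieved(subsets, granted):
    """AND-OR model: all processes of at least one subset granted."""
    return any(s <= granted for s in subsets)


def classical_expansion(out, x):
    """One AND-OR subset per x-element subset of O_i, i.e.
    C(y_i, x_i) subsets, each of size x_i."""
    return [frozenset(c) for c in combinations(sorted(out), x)]


# Hypothetical process waiting for 2 out of 4 neighbours.
out = {"a", "b", "c", "d"}
subsets = classical_expansion(out, 2)
```

For this small instance the expansion has 6 subsets; the number of subsets grows as $\binom{y_i}{x_i}$, which is what makes the classical transformation expensive.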

2.1 Power of Expression in Polynomial Time
In this subsection we add more formalism to the analysis of the hierarchy of deadlock models. We start by observing that if efficient transformation time is added as a criterion to be satisfied by a given deadlock model to express the wait conditions of a network described in another model, then the previously proposed hierarchy is inappropriate. Hence, we propose a restructuring of the hierarchy of deadlock models in order to satisfy the property that each model expresses the wait conditions of its special cases in polynomial time. Some concepts and definitions are needed.

Efficient computation time. In general, an algorithm designed to solve a problem is considered efficient if it runs in polynomial time, because non-polynomial algorithms are generally not applicable in practice, especially for large instances.

When analyzing the power of expression of a deadlock model and whether it contains as special cases all the possible wait conditions of another model, we are actually verifying whether there is a reduction algorithm that receives as input a wait-for graph $G$ in which computation works according to a specific deadlock model, and outputs an equivalent wait-for graph $H$ in which computation works according to another deadlock model and which contains all wait conditions expressed in $G$.

Such reductions can be very useful in practice: it is always interesting to know whether it is possible to translate a wait-for graph built upon a complex deadlock model into another equivalent wait-for graph operating on a simpler model. However, for such reductions to be effective, this computation must be performed in polynomial time.

At this point, we have the necessary background to introduce our definition of reduction between deadlock models.

Definition 1. Given two distinct deadlock models A and B, B generalizes A in polynomial time if there exists a polynomial time algorithm that receives a wait-for graph $G = (V, E)$ in which computation works according to the deadlock model A, and returns a wait-for graph $G' = (V', E')$ in which computation works according to the deadlock model B, such that:

1. there is an injective function $f : V \to V'$;

2. the wait-for graph $G'$ transitively contains all the possible wait conditions of $G$, that is:

(a) if for all processes in $S_1 \subseteq V$ to be relieved from their wait conditions it is necessary/sufficient that all processes in $S_2 \subseteq V$ have been relieved, then for all processes in $f(S_1) \subseteq V'$ to be relieved from their wait conditions it is necessary/sufficient that all processes in $f(S_2) \subseteq V'$ have been relieved;

(b) conversely, if for all processes in $f(S_1)$ to be relieved from their wait conditions it is necessary/sufficient that all processes in $f(S_2)$ have been relieved, then for all processes in $S_1$ to be relieved from their wait conditions it is necessary/sufficient that all processes in $S_2$ have been relieved.

Definition 1 assumes that dependency between processes is a transitive relation, i.e., if a process $v_i$ depends on $v_j$ and $v_j$ depends on $v_k$ then $v_i$ depends on $v_k$. In addition, note that Definition 1 allows $G'$ to possess some auxiliary nodes (processes) that did not exist in $G$, provided that the number of auxiliary processes created is polynomial.

According to Definition 1, if a deadlock model B generalizes in polynomial time a deadlock model A then the model B has at least the same power of expression as model A, and in polynomial time all wait conditions expressible by A can be expressed by B.

Lemma 1. The classical transformation from the X-Out-Of-Y model to the AND-OR model is not performed in polynomial time.

Sketch of proof. The key property of this proof is that, if $x_i, y_i = O(n)$ where $n = |V(G)|$, the running time of the classical transformation is $O(n^n)$.
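The sketch states an upper bound. A short worked bound, under the assumption of a worst-case instance in which some process has $y_i = 2m$ and $x_i = m$ (an instance chosen here for illustration, not taken from the paper), suggests why the number of subsets produced by the classical transformation cannot be bounded by a polynomial:

```latex
% The 2m+1 binomial coefficients C(2m, k) sum to 4^m,
% and the central one is the largest of them.
\[
  t_i \;=\; \binom{y_i}{x_i} \;=\; \binom{2m}{m}
  \;\ge\; \frac{1}{2m+1}\sum_{k=0}^{2m}\binom{2m}{k}
  \;=\; \frac{4^{m}}{2m+1}.
\]
% With y_i = 2m = n - 1 (n odd), this gives
\[
  t_i \;\ge\; \frac{2^{\,n-1}}{n}.
\]
```

Since the transformation must at least write down all $t_i$ subsets, its running time on such an instance grows exponentially in $n$, which is consistent with the $O(n^n)$ upper bound in the sketch.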

Now, inspired by the representation of Ryang [6], we define a new deadlock model based on the AND-OR model as follows.

Simplified AND-OR model – there are two types of processes, AND and OR. An AND process $v_i$ can only become a sink when its wait state is relieved by all the processes in $O_i$, while for an OR process $v_j$ to leave its wait state it suffices to be relieved by one of the processes in $O_j$.

At this point, we have all the elements to propose a restructuring of the hierarchy of deadlock models.


Lemma 2.

1. The OR model does not generalize the AND model.

2. The AND model does not generalize the OR model.

3. The Simplified AND-OR model generalizes in polynomial time the AND and OR models.

4. The AND-OR model generalizes in polynomial time the Simplified AND-OR model.

Sketch of proof. The proof is as follows: (1) and (2) if we suppose that the wait-for graph is a tree, it will never be possible to represent the correct wait condition in the other model in order to relieve the root; (3) a special case of the Simplified AND-OR model consists of labeling all processes with AND (or all with OR); (4) follows by showing that the Simplified AND-OR model is a special case of the AND-OR model.

The next result shows that the AND-OR model can be easily reduced to a structure that is simpler to handle.

Lemma 3. The Simplified AND-OR model generalizes in polynomial time the AND-OR model.

Sketch of proof. The proof consists in creating auxiliary nodes, one for each subset $O_i^j$, that operate in a similar way. All the wait conditions are preserved in the resulting wait-for graph.
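One plausible reading of this construction can be sketched as follows; the paper does not give the details, so the choices below are assumptions. Each original process becomes an OR process whose out-neighbours are auxiliary AND processes, one per subset $O_i^j$, and each auxiliary process waits on the members of its subset. The function name and the `(process, j)` naming of auxiliary nodes are hypothetical.

```python
def to_simplified_and_or(and_or):
    """Translate an AND-OR wait-for graph into a Simplified AND-OR one.

    and_or maps each process to the list of its subsets
    O_i^1, ..., O_i^{t_i}; a sink maps to an empty list.
    Returns (kind, out): kind[v] is "AND" or "OR", and out[v] is
    the set of out-neighbours of v in the new graph.
    """
    kind, out = {}, {}
    for v, subsets in and_or.items():
        kind[v] = "OR"          # v needs any one of its auxiliary nodes
        out[v] = set()
        for j, subset in enumerate(subsets, start=1):
            aux = (v, j)        # hypothetical name for the auxiliary node
            kind[aux] = "AND"   # aux needs every process of O_i^j
            out[aux] = set(subset)
            out[v].add(aux)
    return kind, out


# Hypothetical instance: v1 is relieved by {v2, v3} or by {v4}.
kind, out = to_simplified_and_or(
    {"v1": [{"v2", "v3"}, {"v4"}], "v2": [], "v3": [], "v4": []}
)
```

Under this reading the number of auxiliary processes equals the total number of subsets $\sum_i t_i$, so the output is polynomial in the size of the AND-OR description.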

Now, we deal with the X-Out-Of-Y model and the Disjunctive X-Out-Of-Y model.

Lemma 4.

1. The X-Out-Of-Y model generalizes in polynomial time the Simplified AND-OR model.

2. The Disjunctive X-Out-Of-Y model generalizes in polynomial time the X-Out-Of-Y model.

3. The X-Out-Of-Y model generalizes in polynomial time the Disjunctive X-Out-Of-Y model.

Sketch of proof. (1) and (2) follow by showing that they are special cases; (3) follows by creating auxiliary nodes for each subset $Q_i^j$.

Finally, we turn our attention to the following question:

(*) "Does the AND-OR model generalize in polynomial time the X-Out-Of-Y model?"

According to Lemma 1, the well-known transformation from the X-Out-Of-Y model to the AND-OR model demands exponential time in the worst case. Hence, to answer this question, we introduce some concepts from circuit complexity.

2.1.1 Circuit Complexity
According to [9], a boolean circuit with $n$ input bits is a directed acyclic graph in which every node (usually called gate in this context) is either: (a) an input node of in-degree 0 labeled by one of the $n$ input bits; (b) an AND gate; (c) an OR gate; (d) a NOT gate. One of these gates is designated as the output gate. Note that such a circuit computes a function of its $n$ inputs.

The size of a circuit is the number of gates it contains, and its depth is the maximal length of a path from an input gate to the output gate. Monotone boolean circuits are boolean circuits that use only AND and OR gates.

Observation 1. We can interpret a monotone boolean circuit as a distributed system operating in accordance with the Simplified AND-OR model, where the input nodes are sink nodes and an output '1' from a node is equivalent to the dispatch of grant messages from the associated process.

A threshold gate $g$ with $n$ inputs is described by a sequence $w_1, \dots, w_n$ of real numbers called weights and a threshold $t$. The threshold gate computes the threshold function

$$f_t(x_1, \dots, x_n) = \begin{cases} 1, & \text{if } \sum_{i=1}^{n} w_i x_i \ge t; \\ 0, & \text{otherwise.} \end{cases}$$

Observation 2. A monotone threshold circuit can be interpreted as a distributed system operating in accordance with the X-Out-Of-Y model. Conversely, the wait condition of a process in this model is, in fact, a threshold function.
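To make the correspondence in Observation 2 concrete, the sketch below evaluates a threshold gate and then expresses an X-Out-Of-Y wait condition as the special case with unit weights and threshold $x_i$. The unit-weight reading is an assumption made for illustration, and both function names are hypothetical.

```python
def threshold_gate(weights, t, inputs):
    """Return 1 if sum_i w_i * x_i >= t, and 0 otherwise."""
    if len(weights) != len(inputs):
        raise ValueError("weights and inputs must have the same length")
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= t else 0


def x_out_of_y_gate(x, inputs):
    """Wait condition of an X-Out-Of-Y process, assuming unit weights:
    inputs[k] is 1 when the k-th out-neighbour has sent its grant."""
    return threshold_gate([1] * len(inputs), x, inputs)
```

With $y$ inputs, `x_out_of_y_gate(y, inputs)` behaves as an AND gate and `x_out_of_y_gate(1, inputs)` as an OR gate, which mirrors the way the X-Out-Of-Y model specializes to the AND and OR models.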

Lemma 5. [10, 11] A threshold function $f$ on $n$ variables can be computed by a monotone boolean circuit of polynomial size and depth $O(\log n)$.

At this point, we have all the necessary elements to answer the question (*).

Theorem 6. The AND-OR model generalizes in polynomial time the X-Out-Of-Y model.

Sketch of proof. The result is obtained by replacing each process $p$ of $G$ by a network representing the monotone boolean circuit (of polynomial size and depth $O(\log n)$) given by Lemma 5.
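The replacement step can be illustrated with a deliberately simpler circuit than the one cited in Lemma 5. The sketch below is not the construction of [10, 11]: it uses a plain dynamic-programming recurrence, "at least $j$ of the first $k$ inputs" holds if "at least $j$ of the first $k-1$" holds or if input $k$ holds together with "at least $j-1$ of the first $k-1$". It yields a monotone network with at most $2xy$ AND/OR gates, but its depth is linear in $y$ rather than logarithmic. All names are hypothetical.

```python
def threshold_network(inputs, x):
    """Build AND/OR gates computing 'at least x of the inputs are 1'.

    inputs is a list of input names (strings). Returns (gates, output),
    where gates maps a gate name to (kind, left, right) and output is
    the name of the node carrying the result.
    """
    y = len(inputs)
    if not 1 <= x <= y:
        raise ValueError("need 1 <= x <= y")
    gates = {}

    def node(k, j):
        # Node for 'at least j of the first k inputs'; always 1 <= j <= k.
        if k == 1:
            return inputs[0]
        name = ("T", k, j)
        if name in gates:
            return name
        last = inputs[k - 1]
        if j == k:      # all of the first k inputs
            gates[name] = ("AND", node(k - 1, j - 1), last)
        elif j == 1:    # any of the first k inputs
            gates[name] = ("OR", node(k - 1, 1), last)
        else:
            both = ("A", k, j)
            gates[both] = ("AND", node(k - 1, j - 1), last)
            gates[name] = ("OR", node(k - 1, j), both)
        return name

    return gates, node(y, x)


def evaluate(gates, output, values):
    """Evaluate the network; values maps each input name to a bool."""
    cache = dict(values)

    def val(n):
        if n not in cache:
            kind, a, b = gates[n]
            cache[n] = (val(a) and val(b)) if kind == "AND" else (val(a) or val(b))
        return cache[n]

    return val(output)
```

Read through Observation 1, each gate of the returned network is a process of type AND or OR whose out-neighbours are its operands, so a single $x_i$-out-of-$y_i$ process is replaced by polynomially many Simplified AND-OR processes. Obtaining the $O(\log n)$ depth stated in Lemma 5 requires the constructions cited there.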

As we can see, the AND-OR and the X-Out-Of-Y models, in fact, have the same power of expression even when polynomial time computation is added as an additional restriction.

To conclude our analysis we observe that the circuit constructed in [10, 11] for a threshold function on $n$ variables has depth $O(\log n)$. It is well known in the literature that some threshold functions, such as the majority function, cannot be expressed by boolean circuits with constant depth and polynomial size. Thus, in our analysis of the power of expression of a deadlock model, if besides the polynomial time computation we also add preservation of the order of the network's depth as a required criterion, we obtain the hierarchy illustrated in Figure 1, which is quite distinct from the current organization of deadlock models in the literature.

[Figure 1 (image not reproduced). Levels shown: AND, OR; AND-OR | Simplified AND-OR; X-Out-Of-Y | Disjunctive X-Out-Of-Y.]

Figure 1: Hierarchy of power of expression in polynomial time preserving the order of the depth.

3. DEADLOCK CHARACTERIZATION BY MEANS OF GRAPH STRUCTURES
In this section we make use of a well-known data structure, the x-y graph, as a tool to exactly characterize deadlocks in the X-Out-Of-Y model.

In the literature, an x-y graph [12] is a generalization of and/or graphs, which appeared in the 1960s within the domain of Artificial Intelligence. x-y graphs are equivalent to the wait-for graphs in the X-Out-Of-Y model: every node $v_i$ of an x-y graph has a label $x_i$-$y_i$ to mean that $v_i$ depends on $x_i$ of its $y_i$ out-neighbours.

Definition 2. Given an x-y graph $G = (V, E)$ and a node $s \in V$, a solution subgraph of $G$ is an acyclic subgraph $H = (V', E')$ satisfying the following properties:

1. $s \in V'$.

2. For every non-sink node $v_i$ in $V'$, exactly $x_i$ of its $y_i$ out-edges belong to $E'$.

Lemma 7. Let $s$ be a node of a wait-for graph $G = (V, E)$ in the X-Out-Of-Y model. If $G$ has a solution subgraph containing $s$ then $s$ is not in deadlock.

Sketch of proof. Using Definition 2 and the rules by which a vertex becomes a sink in the X-Out-Of-Y model, we prove that the solution subgraph indicates a way for $s$ to become a sink.

Now we describe a graph structure that provides a necessary and sufficient condition for the existence of deadlock.

Definition 3. Given an x-y graph $G = (V, E)$ and a node $s \in V$, a non-solution subgraph of $G$ is a strongly connected subgraph $H = (V', E')$ satisfying the following properties:

1. $s \in V'$.

2. For every non-sink node $v_i$ in $V'$, at least $y_i - x_i + 1$ of its $y_i$ out-edges belong to $E'$.

Corollary 8. A wait-for graph $G$ in the X-Out-Of-Y model contains nodes in deadlock if and only if $G$ contains a non-solution subgraph.

Sketch of proof. The proof consists in showing that the definition of non-solution subgraph ensures that, if a deadlock exists in a wait-for graph, the minimum set of nodes that characterizes the deadlock is in fact a non-solution subgraph.

The definition of non-solution subgraph leads to a characterization of deadlock in a wait-for graph in the X-Out-Of-Y model.
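A simple way to experiment with these notions on a snapshot is a fixpoint computation, sketched below. It is not an algorithm from the paper: it repeatedly marks as relieved every process that already has at least $x_i$ relieved out-neighbours, starting from the sinks, and reports the processes that are never marked. It assumes that a relieved process eventually grants every process waiting on it; the function name is hypothetical.

```python
def deadlocked_nodes(out, x):
    """Return the processes that can never become sinks.

    out[v] is the set of out-neighbours of v (empty for a sink);
    x[v] is the number of grants v needs (not consulted for sinks).
    """
    relieved = {v for v in out if not out[v]}  # sinks are not waiting
    changed = True
    while changed:
        changed = False
        for v in out:
            if v not in relieved and len(out[v] & relieved) >= x[v]:
                relieved.add(v)
                changed = True
    return set(out) - relieved


# Hypothetical snapshot: a waits on b and c, b waits on a, c is a sink.
out = {"a": {"b", "c"}, "b": {"a"}, "c": set()}
blocked = deadlocked_nodes(out, {"a": 2, "b": 1})
```

In this snapshot `a` is 2-out-of-2, so the grant from `c` alone is not enough and `a` and `b` block each other. The subgraph induced by `a` and `b` is strongly connected and keeps at least $y_i - x_i + 1 = 1$ out-edge of each node, which matches Definition 3. If `a` were 1-out-of-2 instead, the grant from `c` would relieve it and no process would be deadlocked.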

4. CONCLUSIONS
In this work, a complete study on deadlock models has been made, and a new hierarchy of classical deadlock models is provided. In our analysis we consider the power of expression and polynomial computational time as fundamental aspects. To the best of our knowledge, this is a novel approach to this type of study.

We introduce a formal and appropriate definition to characterize the fact that a certain deadlock model generalizes another. We also propose a new deadlock model, the Simplified AND-OR model, which simplifies the classical AND-OR model keeping the same power of expression.

We show that the classical analysis of deadlock models is inaccurate with respect to computational time. We show that the AND-OR, Simplified AND-OR, X-Out-Of-Y, and Disjunctive X-Out-Of-Y models have the same power of expression in polynomial time. However, if we take into account the preservation of the order of the network's depth, we observe a slight change in the hierarchy, suggesting that the X-Out-Of-Y and Disjunctive X-Out-Of-Y models are the most expressive models.

Finally, we point out the relation between a well-known data structure and a characterization of deadlock in wait-for graphs.

5. REFERENCES
[1] V. C. Barbosa, An Introduction to Distributed Algorithms, MIT Press, 1996.

[2] V. C. Barbosa, M. R. F. Benevides, A graph-theoretic characterization of AND–OR deadlocks, Technical Report COPPE ES-472/98, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil, 1998.

[3] V. C. Barbosa, The combinatorics of resource sharing, in: M. Fialles, F. Gomes (Eds.), Models for Parallel and Distributed Computation: Theory, Algorithmic Techniques and Applications, Kluwer Academic Publishers, The Netherlands, 2002.

[4] K. M. Chandy, L. Lamport, Distributed snapshots: determining global states of distributed systems, ACM T. Comput. Syst., v. 3, 1, 63–75, 1985.

[5] A. D. Kshemkalyani, M. Singhal, Efficient detection and resolution of generalized distributed deadlocks, IEEE Trans. on Software Engineering, v. 20, 43–54, 1994.

[6] D. S. Ryang, K. H. Park, A two level distributed detection algorithm of AND/OR deadlocks, J. of Parallel and Distributed Computing, v. 28, 149–161, 1995.

[7] J. Misra, K. M. Chandy, A distributed graph algorithm: Knot detection, ACM Trans. on Programming Languages and Systems, v. 4, 678–686, 1982.

[8] G. Bracha, S. Toueg, Distributed deadlock detection, Distributed Computing, v. 2, 127–138, 1987.

[9] H. Vollmer, Introduction to Circuit Complexity, Berlin: Springer, 1999.

[10] L. G. Valiant, Short monotone formulae for the majority function, J. Algorithms, v. 5, 3, 363–366, 1984.

[11] S. Hoory, A. Magen, T. Pitassi, Monotone circuits for the majority function, in: Approximation, Randomization, and Combinatorial Optimization, Algorithms and Techniques, Springer Berlin Heidelberg, 410–425, 2006.

[12] U. S. Souza, F. Protti, M. Dantas da Silva, Revisiting the complexity of and/or graph solution, Journal of Computer and System Sciences, v. 79, 7, 1156–1163, 2013.
