5.2 Approximate algorithm for responsibility
If
is the selected contingency set for edge
,
must satisfy two constraints: (a) after removing all edges in
, the diffusion remains but the removal of
would make the diffusion fail; and (b)
must be the minimum set satisfying
. Based on these two constraints, we propose a greedy algorithm named Appresp; the main aspects of this algorithm are as follows:
(1) Get the covered set
and the uncovered set
.
(2) Choose edge
which satisfies these two rules: (a)
contains as many as possible of the sets in ST, and (b)
.
(3) Add
to
, remove
from SA, and remove
from ST.
(4) Repeat Steps (2) and (3) until ST gets empty, and compute the responsibility of
.
The key aspects of our algorithm are the two rules listed in Step 2. The first rule is a heuristic for ranking the edges to be added to the contingency set. The second rule is a constraint to ensure that
is still a cause of the diffusion (in accordance with Definition 3) after removing the calculated contingency set. This selection guarantees a feasible solution, provided that the propagation history contains no redundancy (Redundancy is defined by Ref. [
21
], i.e., a propagation trace
is redundant if there exists another trace
whose edge set is a subset of
’s edge set. After removing all redundancies, the remaining edges are causes (Definition 3). This is also the PTIME solution for the causality checking problem in propagation histories). We formalize this property in Proposition 4 and present the pseudo code in Algorithm 1.
Proposition 4 Appresp guarantees a feasible solution, provided that the propagation history contains no redundancy.
Proof We continue with the definitions of
,
,
, sc, ST, and SA from Algorithm 5.2. Suppose the propagation history does not contain any redundant traces. We calculate the responsibility of edge
as an example.
Case A (If Appresp returns a contingency set
). Suppose we get the initial covered set
and the uncovered set
. In this case, ST is covered by
. For simplicity, here "the removal of edge
" refers to removing all sets in
from both SA and ST. In accordance with our constraint rule, the removal of
makes ST empty but cannot render SA empty. In addition, if we first remove all edges from
, the further removal of
renders SA empty. Consequently, according to Definition 3,
is a feasible contingency set for
.
Case B (If Appresp cannot find a contingency set). Suppose we have gotten temporary results
,
, and
, when no edge satisfies our constraint rule, i.e., for each left edge
, we get
. Suppose
is the edge set of trace
in
and
is the edge set of trace
in
. For each edge
in
, we will find
contains
. Thus, we get
, i.e.,
is redundant. This is opposite to our hypothesis of non-redundancy.
Therefore, Appresp guarantees a feasible solution for a propagation history without redundancy.
Complexity of Appresp. Suppose the propagation history has
traces and
edges, the average length of traces is
, and the corresponding contingency set size is
. On average, the time complexity of Appresp is
. Note that, both
and
are usually small numbers, i.e., Appresp has linear time complexity (We verify the small values of
in our experiments, and
is also small according to the concept of six degrees of separation[
24
]).
Proof Given the input edge
, to calculate its responsibility we need to loop the following steps
times.
(1) Line 6 of Algorithm 1 performs two tasks. Firstly, it calculates
for each edge
; we can use an inverted index to speed up this calculation (the time complexity is
). It then selects edge
, which satisfies the two rules (the time complexity is at most
, since SA is usually small).
(2) Line 9 removes all sets in
from ST. The time complexity is
, since on average we need to repeat this removal
times to empty ST.
(3) Line 10 does a similar task on SA to that performed on ST in Line 9. Therefore, the time complexity is also
.
Therefore, the overall time complexity is
=
.
Example 8 (Example of Algorithm 1). Continuing with the same propagation history
in Example 1, we use edge
as an example and employ Algorithm 1 to compute its responsibility. We first get the covered set
and the uncovered set
. Then, we find that
cannot be selected because
. We can select either of
or
, since both satisfy our constraint rule and have the same priority according to our heuristic rule. Suppose we choose
first and then choose
in the next round. Finally, we get the contingency set
, so the responsibility of
is
.
Unlike the approach of our Appresp method, Qin et al.[
27
] directly adopted a greedy strategy to solve the corresponding SCP problem, i.e., they ignore the constraint rule in our method. However, without this rule, the greedy strategy cannot guarantee a feasible solution for the responsibility problem (as shown in Example 7). We will show this property and compare the two methods in presenting the results of our experiments.