Abstract
Set reconciliation between two nodes is widely used in network applications. The basic idea is that each member of a node pair has an object set and seeks to deliver its unique objects to the other member. The Standard Bloom Filter (SBF) and its variants, such as the Invertible Bloom Filter (IBF), are effective approaches to solving the set reconciliation problem. The SBF-based method requires each node to represent its objects using an SBF, which is exchanged with the other node. A receiving node queries the received SBF against its local objects to identify the unique objects. Finally, each node exchanges its unique objects with the other node in the node pair. For the IBF-based method, each node represents its objects using an IBF, which is then exchanged. A receiving node subtracts the received IBF from its local IBF so as to decode the different objects between the two sets. Intuitively, it would seem that the IBF-based method, with only one round of communication, entails less communication overhead than the SBF-based method, which incurs two rounds of communication. Our research results, however, indicate that neither of these two methods has an absolute advantages over the others. In this paper, we aim to provide an in-depth understanding of the two methods, by evaluating and comparing their communication overhead. We find that the best method depends on parameter settings. We demonstrate that the SBF-based method outperforms the IBF-based method in most cases. But when the number of different objects in the two sets is below a certain threshold, the IBF-based method outperforms the SBF-based method.