Threshold union: generalizing unions and intersections

Part I

So, yesterday I was wondering about operations on a set of sets (I’ll explain the Tweet content below, so you can just ignore it for now).

For example, consider that we have 3 sets:

  • $A = \{a, b, c\}$
  • $B = \{a, b, d\}$
  • $C = \{a\}$

and we build a set $S = \{A, B, C\}$ with the previous sets.

The union of all sets has all elements:

$$\bigcup S = \{a, b, c, d\}$$

The intersection of all sets has the elements common to all sets:

$$\bigcap S = \{a\}$$

Now, we can generalize these definitions and have:

  • an element belongs to $\bigcup S$ if it belongs to at least 1 set in $S$
  • an element belongs to $\bigcap S$ if it belongs to at least $|S|$ sets in $S$ 1

Once we have this, we can define $\bigcup_n$:

  • an element belongs to $\bigcup_n S$ if it belongs to at least $n$ sets in $S$

and $\bigcup$ and $\bigcap$ can be written in terms of $\bigcup_n$:

  • $\bigcup S = \bigcup_1 S$
  • $\bigcap S = \bigcup_{|S|} S$

So, my question was: What’s the name of $\bigcup_n$?

After much googling and no success, it came to mind that maybe this was not about operations on sets, but instead on multisets:

Unlike a set, a multiset, (and quoting Wikipedia) “allows for multiple instances for each of its elements”.

In a multiset we keep count of the number of instances of each element - this number is called multiplicity. I’ll use $e_m$ to represent an element $e$ with multiplicity $m$.

So, with sets $A$, $B$ and $C$, we can build a multiset $\{a_3, b_2, c_1, d_1\}$.

Using these concepts, we can redefine $\bigcup_n$:

  • an element belongs to $\bigcup_n S$ if its multiplicity in the multiset is at least $n$

Part II

Today, I was back on my quest. This concept should be something classical, with like 100 years. I was convinced that with this new insight (the multiset), I would finally find the name of $\bigcup_n$.

What was I not expecting, was to find it in a recent (2005) multi-party computation (MPC) paper. The paper is called Privacy-Preserving Set Operations and is authored by Lea Kissner (@LeaKissner) and Dawn Song (@dawnsongtweets).

The high-level idea of MPC is to have a set of players computing some function on some inputs. The inputs are to be kept private, and only the function output should be revealed.

So, if the inputs are the sets $A, B, C$ in $S$, and we’re computing their intersection, then only $\bigcap S = \{a\}$ should be known in the end.

In Section 6.2, I found the following2:

We define the Threshold Set-Union problem as follows: all players learn which elements appear in the combined private input of the players at least a threshold number $t$ times. For example, assume that $a$ appears in the combined private input of the players $15$ times. If $t = 10$, then all players learn $a$. However, if $t = 16$, then no player learns $a$.

With this, I declare my quest over, and I’ll be calling $\bigcup_n$ threshold union.

What do you think? Is there a better name? Has this concept been named elsewhere?

  1. $|S|$ represents the size of the set. In the example, $|S| = 3$. ^
  2. Although, there’s no mention of multiset in this definition, nor in the title, the paper contains several and I guess that’s why it came up in my Google search. ^
Vitor Enes
PhD Student