Generics and Sets

JosAH

Recognized Expert MVP

Join Date: Mar 2007

Posts: 11453
#1

Generics and Sets

Sep 17 '07, 08:47 AM

Greetings,

Introduction

This week I'll write a bit about generics (those funny angular brackets). I need
an example and decided to use sets and some of their operations. This weeks'
article discusses one set class and two interesting operations on the set:
combinations and set partitioning. First a description of the algorithms is
given and then we'll have some fun with the generic implementation of them.

Combinations

For a given set S = [x, y, z] there are eight sets representing all possible
combinations of those three distinct elements:

[]

[x]

[y]

[z]

[x, y]

[x, z]

[y, z]

[x, y, z]

Given that set S, how can we find all these combinations? Here's how: suppose
we can find all combinations of set S'= [y, z] which are: [], [y], [z] and [y, z]
Of course these subsets are combinations of the original set S as well, but
there's no 'x' to be found in them (of course not, because I left it out).

Simply add that element 'x' to all the subsets that make up the combinations
of set [y, z]; This is the result:

[x], [x, y], [x, z] and [x, y, z]

Together with the other four subsets we have constructed all eight combinations
of set S= [x, y, z]

But how did we find all combinations of set S'= [y, z]? We apply a similar
recursive reasoning here: suppose we can find all combinations of set S''= [z];
they are [] and [c]; they are part of all combinations of set S'= [y, z] and
we add element 'y' to it: [y], [y, z].

This may seem futile but here's how we find all combinations of set S''= [z]:
we find all combinations of the empty set, which is the empty set itself: [],
and we add element [z] to all the combinations (just one), which makes [z].

For a set with n elements there are 2^n combinations. This can easily be seen:
the empty set (zero elements) has one combination: the empty set itself. If
we add an element to the set we end up with twice the number of combinations:
the original combinations plus the original combinations with that new element
added to every combination. A set of one element has two combinations, a set of
two elements has four combinations etc. etc.

The recursive algorithm described above generates them all.

Partitions

A partition of a set S are one or more disjunct non-empty subsets and the union
of those sets form the set S again. For the set S= [x, y, z] the partitions are:

[x, y, z]

[x], [y, z]

[x, y], [z]

[x, z], [y]

[x], [y], [z]

How do we find all partitions of a set? We apply a similar recursive reasoning:
suppose we can find all partitions of set S'= [y, z] which is:

[y, z]

[y], [z]

First we add the set [x] to every partition of set S':

[x], [y, z]

[x], [y], [z]

Next we add element 'x' to every set in the partitions of S', one by one:

[x, y, z]

[x, y], [z]

[y], [x, z]

We end up with the same five partitions again. Try to find all partitions of
set S'= [y, z] given the partitions of set S''= [z]. The partitions of a single
element set is the single element set itself, so all partitions of set S'' is
the set S''= [z] itself.

By definition all partitions of an empty set is the empty set itself.

Sets

As you might have noticed by now this article is about sets. The Java programming
language has Set implementations available. I'm going to use the HashSet
implementation for the example in this article. I am going to change the behaviour
of the implementation a bit though: I want 'COW' sets. 'COW' is an acronym for
'Copy On Write'. COW implies that the set copies itself when it needs to be
altered; the copy is going to be altered instead. Let's get our hands dirty.
Here are the constructors of our (generic) set:

[code=java]
public class S<T> extends HashSet<T> {

private static final long serialVersionUI D = -887517932819850 7416L;

public S(Collection<T> collection) { super(collectio n); }

public S(T arg) { this.add(arg); }

public S(T ... args) { super(Arrays.as List(args)); }

...
[/code]

I added the serialVerionUID to keep Eclipse's mouth shut. This class has three
constructors: a form of a copy constructor taking a Collection<T> as its
argument; a constructor that takes a single element T as its argument and
a constructor that takes a (possibly empty) list of T arguements.

The last constructor allows us to write new S<T>() i.e. no elements at all
which results in the empty set.

I included the second constructor again to keep Eclipse's mouth shut. I need
sets of sets as in:

[code=java]
S<S<T>> set= new S<S<T>>(new S<T>());
[/code]

Suppose I didn't include the second contructor then the Java compiler has to
deduce that given the last constructor a generic array of type S<T> needs to be
constructed to put that single argument in. A type S<T> is not a type T so
Eclipse warns about its own creativity. If I hadn't included that second
constructor everything would still be fine but I just don't like warnings.

The last constructor simply turns the varargs array to a List (which is a
collection too) so a superclass constructor can handle it all.

The Set interface defines 'add()' and 'addAll()' methods that return true or false
indicating whether or not the element(s) could be added. The same applies to the
'remove' and 'removeAll' methods. I don't want that behaviour, I want methods
that do the same but return the new set instead. Here goes:

[code=java]
private S<T> unionHere(T arg) {

add(arg);

return this;
}

public S<T> union(T arg) {

return new S<T>(this).unio nHere(arg);
}

public S<T> union(Collectio n<T> collection) {

S<T> copy= new S<T>(this);

copy.addAll(col lection);

return copy;
}

public S<T> difference(T arg) {

S<T> copy= new S<T>(this);

copy.remove(arg );

return copy;
}

public S<T> difference(Coll ection<T> collection) {

S<T> copy= new S<T>(this);

copy.removeAll( collection);

return copy;
}
[/code]

Note that I kept the first method private: I don't want the user to fiddle with
the set itself; if they want to do that they can fiddle with the superclass
methods that change the set itself.

The public methods above simply copy the set and apply the wanted operation on
the copy instead and return the copied set instead of returning a boolean value.

Note that all methods are generic: they don't care about what type T actually is.

So far so good, but I haven't done much yet. Here are the other set operations:
intersection, difference and symmetric difference. First a few examples of the
operations; A= [x, y, z], B=[y, z, t]

intersection: [y, z]

difference: [x]

symmetric difference: [x, t]

The intersection of A and B are the elements that occur in both A and B. The
difference of A and B are those elements that occur in A but not in B. The
symmetric difference of A and B are those elements in A or B bot not in both.

The symmetric difference plus the intersection of two sets is the union of those
two sets. The symmetric difference of sets A and B is the union of the difference
of A and B and the difference of B and A.

Here's the source code:

[code=java]
public S<T> difference(Coll ection<T> collection) {

S<T> copy= new S<T>(this);

copy.removeAll( collection);

return copy;
}

public S<T> symDifference(C ollection<T> collection) {

return difference(coll ection).union(n ew S<T>(collection ).difference(th is));
}

public S<T> intersection(Co llection<T> collection) {

S<T> copy= new S<T>(this);

copy.retainAll( collection);

return copy;
}
[/code]

There's not much more to tell about those methods than that they first make a
copy of the set and apply the operation on the copy of the set.

Combinations

The implementation of the 'combinations' method closely follows the description
in the paragraph above. The method returns a set of sets. The sets being the
individual combinations of the original set.

First the method constructs a set that contains the empty set. If the original
set isn't empty it walks over each element, removes it and generates all
combinations of the reduced set. It copies the combinations to a 'result' set
and copies them again after it had added that current element to the sets again.

Here's the code:

[code=java]
public S<S<T>> combinations() {

S<S<T>> result= new S<S<T>>(new S<T>());

for (T current : this) {
S<S<T>> comb= difference(curr ent).combinatio ns();
result.addAll(c omb);
for (S<T> set : comb)
result.unionHer e(set.union(cur rent));
}

return result;
}
[/code]

It comes in quite handy that my methods return the modified set itself instead
of a true/false value. Not the generic type S<S<T>>, i.e. a set or sets of type T.

The method itself is defined in a generic definition of class S. The generic
type is T, so a (partial) specialization of S<S<T>> is valid too. In other
words: if, say, a definition of the union method for a type T is valid, then
automatically it is valid for a generic type S<T>. The above code makes use of
Java's generic behaviour.

I'm going to (ab)use that feature even more for the next algorithm implementation:

Partitions

This method uses the same implementation strategy: it recursively invokes itself
with a smaller set and uses the return value to collect the partitions of the
current set in an set of partitions. Here it is:

[code=java]
public S<S<S<T>>> partitions() {

S<S<S<T>>> result= new S<S<S<T>>>();

if (size() < 2)
return result.unionHer e(new S<S<T>>(this)) ;

T current= iterator().next ();
S<S<S<T>>> part= difference(curr ent).partitions ();

for (S<S<T>> set : part) {
result.unionHer e(set.union(new S<T>(current))) ;

for (S<T> elem : set) {
S<S<T>> copy= set.difference( elem);
result.unionHer e(copy.unionHer e(elem.union(cu rrent)));
}
}

return result;
}
[/code]

A partition is a set of sets, so all partitions together is a set of sets of
sets; the last sets contain elements of type T.

For sets containing at most one element a set containing a set that contains
that (almost) empty set is returned. Otherwise the reasoning as explained in
the paragraph above is applied and the filled 'result' set is returned.

As you can see, in a generic definition of a method for type <T> a type S<T>,
S<S<T>> and even S<S<S<T>>> are defined automatically because an S<T>, S<S<T>>
and an S<S<S<T>>> are pure generic types themselves as well.

This makes it possible to find intersections, combinations or whatever not just
for sets of, say, Strings, i.e. S<String> but also for sets of sets of Strings,
i.e. S<S<String>> or whatever elements in a set you want.

Examples

After all this programming I'd like to see some results; I tested the above
methods like this:

[code=java]
public static void main(String[] args) {

S<String> s= new S<String>("x", "y", "z");
S<String> t= new S<String>("y", "z", "t");

System.out.prin tln("union : "+s.union(t ));
System.out.prin tln("intersecti on : "+s.intersectio n(t));
System.out.prin tln("difference : "+s.difference( t));
System.out.prin tln("symDiffere nce: "+s.symDifferen ce(t));
System.out.prin tln("combinatio ns : "+s.combination s());
System.out.prin tln("partitions : "+s.partitions( ));
}
[/code]

And this was the (correct) output:

Code:

union : [t, z, y, x] intersection : [z, y] difference : [x] symDifference: [t, x] combinations : [[], [z, x], [z, y], [z], [y], [z, y, x], [x], [y, x]] partitions : [[[z, x], [y]], [[z, y], [x]], [[z], [y], [x]], [[z, y, x]], [[z], [y, x]]]

Concluding remarks

The chapter described a bit of set juggling and fiddling with the generics
feature of Java. I'm sure you'll have to read this little article a couple of
times to fully appreciate the generic feature and the recursive implementation
of both operations.

I hope to see you next week and,

kind regards,

Jos
Tags: None
Shubhendu Sinha

New Member

Join Date: Jun 2011

Posts: 1
#2

Jun 27 '11, 09:14 AM

partitions() algorithm

Hi Jos

Thanks for your post. I wish to implement partitions() in c++ to find all partitions of a set. Can you suggest me the best way. If you could also send me the source of partitions algorithm, i could try to write that in c++.
Comment

Previous template Next

Generics and Sets

Comment