Introduction | Initialization | Closing | Adding | Deleting | Fetching | Parameters | Advanced | Obscure

SET and BAG

Sets and bags is data structures that hold an unlimited number of elements. The only difference between them is that the elements in a set are unique, while a bag might hold a multiple copy of the same element. Each has its uses. I shall refer to the set only in this discussion, highlighting any differences with a bag if applicable.

A set is a collection of elements. Operations that are of interest with sets are UNION of two sets which computes the union of the two sets (it combines them) to produce a new set that contains the elements of the original two sets; CONTAINS, which check if a set contains a particular element; INTERSECT, which computes the intersection of the two sets (those elements that are contained in both sets); XINTERSECT, which computes the inverse intersection of the two sets (those elements that are not contained in both sets); and DIFFERENCE, which computes the difference betweeen two sets, defined as those elements of the first set that are not contained in the second set.

Introduction

LibDS has a single data type to represent both a set and a bag: the SET, although you can use BAG as an alias (beware though, LibDS does not actually check if a BAG is really a BAG, this is just a semantic convenience). Since a set needs to keep track of whether the elements it contains are unique, a LibDS SET needs as comparison function to be passed to it at init time. This function is type SETCMPFUN, which is declared as int(*)(void*,void*).   The convention in this case is that the comparison function returns 1 if the compared elements are equal, and 0 otherwise (i.e. we do not require that elements in a set can be ordered). In addition a SET must be told to ensure disjointness (force that elements that are added to the SET are not in it yet). NOTE: the elements are not compared by LibDS! You need to define an own comparison function, and the LibDS SET itself has no notion of keys, as does the AVLTREE or the HEAP. This is left entirely to the caller's discretion.

Note on efficiency: if the elements of the set can be ordered, then you can make LibDS use a more efficient representation, by providing a flag at set creation time (see below), and using a comparison function that returns -1 if the first parameter is smaller than the second, 0 if both parameters are equal, and 1 if the first parameter is larger than the second. Beware of some special cases, as explained below.

The difference between a SET and a BAG is determined by a single parameter, ensure_disjoint in the creation routine given below.

Initialization

SET setMake(void)

Create a new set. The set will not check for disjointness so it is in fact a BAG. Furthermore, no comparison function is assigned to the new set.

SET setNew(SETCMPFUN f,int ensure_disjoint,int ord)

Creates a new set. The comparison function is f, and if you want to ensure disjointness (i.e. use this as a SET rather than a BAG), set ensure_disjoint to true. If this is the case, then you can also set ord to anything else but 0, to let LibDS know that the elements can be ordered (like numbers or strings). In this case, LibDS expects f to behave as the C function strcmp(): that is, return -1 if the first element is smaller than the second, 1 if the first element is larger than the second, and zero otherwise. If ensure_disjoint is 0, then the ord flag is ignored, and f should behave as the set comparison function mentioned in the Introduction to this page.

Closing

void setClose(SET set)

Close the set, releasing all memory it was using. User data is left untouched.

void setCloseWithFunction(SET set,void (*fun)(void*))

Close the set and call fun on each data in the set.

Adding

int setAdd(SET set,void *data)

Adds this element to the set. If the set was created with ensure_disjoint set to true (i.e. a true SET), then the element will not be added if it is already in the set. If ensure_disjoint is false, then the element is blindly added to the set (this is the BAG behavior). Returns 0 on success, -1 on failure.

Deleting

void* setRemove(SET set,void * key)

Removes the element that matches the given key. Returns the given element, or NULL on error (if element was not found in the given set).

Fetching

int setContains(SET set,void *elt)

Returns true if this set contains the element elt. This will only work if a comparison function has been given.

void * setFind(SET set,void *elt)

Find the element in this set that is equal to the elt passed. If nothing is found, NULL is returned. This will only work if a comparison function has been passed to the SET.

void * setFirst(SET set)

Returns the first element in this set. "First" in this case can be arbitrary. Returns NULL if nothing is in the set.

void * setNext(SET set)

Returns the arbitrary next element in the set. If setFirst() has not been called prior to this function, then its effect is identical to setFirst().

Parameters

int setSize(SET set)

Returns the number of items currently in the set.

int setEmpty(SET set)

Returns 0 if the set still has items in it, 1 if it is empty.

Advanced functionality

These are the most SET-ish operations of all the ones described on this page.  The presuppose that the two sets are compatible: this means, they contain the same kind of elements. The restriction is not that big, however: all this means is that the comparison functions of the sets are capable on operating on all types of elements present in either set.

In all cases, a new set is created and returned. Upon failure, these functions return NULL.

SET setAppend(SET s1,SET s2)
SET setUnion(SET s1,SET s2)

NOTE: setAppend() is just an alias for setUnion().
This function makes a new set that is the union of the two sets given. The second set is left untouched, while the first set now contains the elements of both s1 and s2. If any elements is s2 were already in s1 and s1 was created with ensure_disjoint set to true, no duplicates will result, else elements of s2 are added blindly to s1, which might lead to duplicates. The representative of the new set is the representative of s1 prior to this operation. NOTE: This is in fact equivalent to simpy adding all the elements of s2 into s1 and then getting rid of s2.

SET setUnion1(SET s1,SET s2)

Same as setUnion() only now s1 and s2 are left untouched, and a completely new set is created and returned. The new set inherits the parameters of s1.

SET setIntersect(SET s1,SET s2)

Computes the intersection of the two sets. This will only work if at least s2 has a comparison function. This is so because this function checks to see if an element in s1 is contained within s2, and only then assigns it to the resulting set. Consequently, s2 must be able to compare elements. As usual, NULL is returned upon failure.

SET setXIntersect(SET s1,SET s2)

Computes the exclusive intersection of the two sets. This means, the resulting set contains all the elements of the two sets that are not contained in both sets.  I.e. if A contains 1,2,3,4 and B containes 3,4,5,6,7, after setXIntersect() the resulting set will contain 1,3,5,6,7. In contrast setIntersect() will produce a set that contains 3,4.

SET setDifference(SET s1,SET s2)

Compute the difference between s1 and s2. This is defined as those elements of s1 that are not part of s2.

Obscure functionality

None whatsoever.

Mon Jan  7 12:12:53 EST 2002
Sat Jun 22 14:14:15 CEST 2002