Welcome to the Java Programming Forums


The professional, friendly Java community. 21,500 members and growing!


The Java Programming Forums are a community of Java programmers from all around the World. Our members have a wide range of skills and they all have one thing in common: A passion to learn and code Java. We invite beginner Java programmers right through to Java professionals to post here and share your knowledge. Become a part of the community, help others, expand your knowledge of Java and enjoy talking with like minded people. Registration is quick and best of all free. We look forward to meeting you.


>> REGISTER NOW TO START POSTING


Members have full access to the forums. Advertisements are removed for registered users.

Results 1 to 2 of 2

Thread: How to use Sets

  1. #1
    Administrator copeg's Avatar
    Join Date
    Oct 2009
    Location
    US
    Posts
    5,318
    Thanks
    181
    Thanked 833 Times in 772 Posts
    Blog Entries
    5

    Default How to use Sets

    How to use a Set

    Put simply, a Set is a Collection that contains no duplicate elements. Sets are a fantastic utility to have in the arsenal, allowing one to find, sort, and iterate over unique values quickly and easily.

    A little background: a Collection is an interface which is used to groups object. The interfaces Set and List both extend from Collection, and thus the most commonly used Collection implementation - ArrayList (or its interface List) - is somewhat related to a Set, with mutual functions inherited from the Collection interface. That being said, a Set and List differ in some key ways.

    On to a demonstration:
        public static void main(String[] args){
            String[] strings = {"1","2", "3", "4" , "1", "2"};
            //create the Collection objects
            java.util.List<String> list = new ArrayList<String>();
            Set<String> set = new HashSet<String>();
            //Add the same values to each object
            for ( String s : strings ){
               list.add(s);
               set.add(s);
            }
            //print out the contents of each collection
            System.out.println("List Contents:");
            printCollection(list);
            System.out.println("Set Contents:");
            printCollection(set);
         }
        /**
        *Prints the values of the parameter collection to the command line.
        */
        public static void printCollection(Collection<String> c){
            Iterator<String> it = c.iterator();
            while ( it.hasNext() ){
                System.out.println(it.next());
            }
     
        }

    Results:
    List Contents:
    1
    2
    3
    4
    1
    2
    Set Contents:
    3
    2
    1
    4
    This code adds the same values to both a List and a Set, and outputs the values contained within those Collections. Two important things to note:

    1) as defined by the implementation the Set has no duplicate values, as opposed to the List which contains all values placed within the Collection.

    2) the order of the Set output does not equal the order of its input. Based inherently upon how it is implemented, a Set (at least a HashSet - see below) has no guarantee of the order when the values are retrieved (see below).


    Guarantee Set Order:

    One advantage to using a List lies in the fact that order is maintained. The basic implementation of a Set - the HashSet - cannot guarantee any order. This limitation is fixed with a LinkedHashSet, an implementation of the Set interface backed by a LinkedList. This object guarantees no duplicates, whose iterator iterates over its objects in the order of entry. I will leave it up to the reader to test the demonstration of a LinkedHashSet by substituting the HashSet in the example above with a LinkedHashSet.

    How to define equality:

    Equality in a Set is defined by two values: the equals() and hashCode() methods of an Object. These methods could be a subject of an entirely separate tutorial, suffice it to say these two methods define object equality. Typically for custom made object, these methods default to those defined in object - this behavior defining equality based upon object references - and not their values. The following example extends off the previous, using a custom object as a demonstration:
    public static void main(String[] args){
            Tester[] strings = {new Tester("1"),new Tester("2"), new Tester("3"), new Tester("4") , new Tester("1"), new Tester("2")};
            //create the Collection objects
            java.util.List<Tester> list = new ArrayList<Tester>();
            Set<Tester> set = new LinkedHashSet<Tester>();
            //Add the same values to each object
            for ( Tester s : strings ){
               list.add(s);
               set.add(s);
            }
            //print out the contents of each collection
            System.out.println("List Contents:");
            printCollection(list);
            System.out.println("Set Contents:");
            printCollection(set);
         }
        /**
        *Prints the values of the parameter collection to the command line.
        */
        public static void printCollection(Collection<Tester> c){
            Iterator<Tester> it = c.iterator();
            while ( it.hasNext() ){
                Tester s = it.next();
                System.out.println(s.value);
            }
     
        }
        /**
        *A Custom class with only a single variable for demonstration.
        */
        private static class Tester {
            final String value;
            public Tester(String s){
                value = s;
            }
            public String getValue(){
                return value;
            }
        }

    Results:
    List Contents:
    1
    2
    3
    4
    1
    2
    Set Contents:
    1
    2
    3
    4
    1
    2
    First, notice the Set contains no unique values (relative to the value variable in Tester). This is because the the hashCode()/equals() is defaulting to the implementation in Object - testing for unique instances, and not the values of those instances. To solve this, one can override these methods to define equality:
            @Override
            public int hashCode() {
                final int prime = 31;
                int result = 1;
                result = prime * result + ((value == null) ? 0 : value.hashCode());
                return result;
            }
            @Override
            public boolean equals(Object obj) {
                if (this == obj)
                    return true;
                if (obj == null)
                    return false;
                if (getClass() != obj.getClass())
                    return false;
                Tester other = (Tester) obj;
                if (value == null) {
                    if (other.value != null)
                        return false;
                } else if (!value.equals(other.value))
                    return false;
                return true;
            }
    This addition results in unique values in the Set, based upon the value variable in the class Tester.

    If I want a unique collection, why not test if a List contains the object before adding it again?

    you mean like this?

            List<String> list = new ArrayList<String>();
            for ( String s : Strings){
                if ( list.contains(s) ){
                    continue;
                }
                list.add(s);
     
            }

    Yes, but lets push this to the limit a bit. In the example below, identical values are placed in both a List and a Set. These Collections are then searched for values with time for both operations being tallied and printed.
    final int MAX = 10000;
            java.util.List<Integer> list = new ArrayList<Integer>();
            Set<Integer> set = new HashSet<Integer>();
            for ( int i = 0; i < MAX; i++ ){
                list.add(i);
                set.add(i);
            }
     
            long start = System.currentTimeMillis();
            //calculate the time for a list search
            for ( int i = 0; i < MAX; i++ ){
                if ( list.contains(i) ){
                    //do nothing for the demo
                }
            }
     
     
            long end = System.currentTimeMillis();
            System.out.println("Time for List: " + (end-start));
            start = System.currentTimeMillis();
            //calculate the time for the set search
            for ( int i = 0; i < MAX; i++ ){
                if ( set.contains(i)  ){
                    //do nothing for the demo
                }
            }
            end = System.currentTimeMillis();
            System.out.println("Time for Set: " + (end-start));

    System dependent of course, but a quick output on my current system:
    Time for List: 161
    Time for Set: 4

    The reasoning behind the differences in time values is based the differences in how values are stored. While the details of this is beyond the scope of this tutorial, but suffice it to say the lookup in a Set is MUCH faster. A simple analogy (and only an analogy) to the difference: searching an array list is like looping over an array to find a value, whereas searching a set is similar to inherently knowing the index of the value in the array (based upon its 'hash' - eg the hashCode() method). For more information, see WIKI - HashTable

    Yes this is pushing things to a limit, but more than once I have noticed huge benefits in runtime performance by using a Set when it is more appropriate to do so.

    Uses
    Unique lines in a file, unique url's in a webpage, unique values in a dataset. Both Set's and List's have their own uses in different situations: Set's for unique values, and Lists for sequential, and duplicate values.

  2. The Following User Says Thank You to copeg For This Useful Post:

    JavaPF (September 29th, 2010)


  3. #2
    mmm.. coffee JavaPF's Avatar
    Join Date
    May 2008
    Location
    United Kingdom
    Posts
    3,336
    My Mood
    Mellow
    Thanks
    258
    Thanked 294 Times in 227 Posts
    Blog Entries
    4

    Default Re: How to use Sets

    Very clear and informative tutorial copeg. Thanks
    Please use [highlight=Java] code [/highlight] tags when posting your code.
    Forum Tip: Add to peoples reputation by clicking the button on their useful posts.

Similar Threads

  1. [SOLVED] Maps and adding to sets as values
    By mds1256 in forum Collections and Generics
    Replies: 3
    Last Post: March 26th, 2010, 09:12 AM
  2. Bulk operations on sets/ map problem
    By kyuss in forum What's Wrong With My Code?
    Replies: 0
    Last Post: March 14th, 2010, 12:48 PM
  3. [SOLVED] Sets and creating a new set containing a common number
    By mds1256 in forum Collections and Generics
    Replies: 2
    Last Post: February 26th, 2010, 06:00 PM
  4. Lists of Sets and Sets of Lists
    By Newoor in forum Collections and Generics
    Replies: 2
    Last Post: December 8th, 2009, 08:13 PM