Thread: Creating A Count Matrix In Java

  1. #1
    Junior Member
    Join Date
    Jul 2009
    Posts
    1
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Creating A Count Matrix In Java

    Hi all,

    I do some programming here and there, but it has been a long time since I've sat down and done anything with Java and its data structures.

    I have a large number of .txt files (19 right now, but it could grow to around 160). Each .txt file lists the genes of one plant species along with their locations. Here is how one of these .txt files starts:

    Name: Zea Mays
    FileName: NC_001666
    bp: 160719
    Genes: rps12 (90173..91321)
    rps7 (92413..94170)
    ndhB -(95619..96101)


    The file continues with more lines of genes and their respective locations. I need a Java program that reads each of these .txt files and builds a matrix with every gene name that appears anywhere in the files (e.g. rps12) down the left and every species FileName (e.g. NC_001666) across the top. Each intersection should contain the number of times that gene appears in that file. So if rps12 appears 4 times in NC_001666.txt, a 4 goes at the intersection of rps12 and NC_001666.

    Like I said before, it's been so long since I've worked with Java that I really have no idea how to start this or which data structures would be most useful. If anyone could give me some help, that would be great. Thanks!

    -statsman5


  2. #2
    Super Moderator helloworld922
    Join Date
    Jun 2009
    Posts
    2,895
    Thanks
    23
    Thanked 619 Times in 561 Posts
    Blog Entries
    18

    Re: Creating A Count Matrix In Java

    This would be easy with a set modified to keep a count for each element. I'd recommend a hash table: it should be the fastest approach and probably the easiest. Once you have all your data read in, you can convert it to pretty much any data structure that fits your needs.
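
As a rough illustration of the hash table idea, here is a minimal sketch that keeps a count per gene name in a java.util.HashMap. The class name GeneTally and the method tally are made up for this example, and the input is simply a list of names; reading the files is left out.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the thread's code.
final class GeneTally {
    /** Counts how many times each gene name occurs in the given list. */
    static Map<String, Integer> tally(List<String> geneNames) {
        Map<String, Integer> counts = new HashMap<>();
        for (String name : geneNames) {
            // store 1 for a new name, otherwise add 1 to the existing count
            counts.merge(name, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Gene names from the sample file, with rps12 repeated for illustration.
        Map<String, Integer> counts = tally(Arrays.asList("rps12", "rps7", "ndhB", "rps12"));
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            System.out.println(entry.getKey() + " occurred " + entry.getValue() + " time(s)");
        }
    }
}
```

A HashMap does not keep its keys in any particular order; if sorted output matters, a TreeMap can be dropped in instead.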

  3. #3
    Super Moderator helloworld922
    Join Date
    Jun 2009
    Posts
    2,895
    Thanks
    23
    Thanked 619 Times in 561 Posts
    Blog Entries
    18

    Re: Creating A Count Matrix In Java

    Lucky you, I got bored.
    import java.io.File;
    import java.util.ArrayList;
    import java.util.Scanner;
     
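    /**
     * Counts how often each gene name occurs in a single gene-list file,
     * using a small hand-rolled hash table with chaining.
     * 
     * @author Andrew
     */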
    public class GeneCounter
    {
        String                     fileName;
        ArrayList<ArrayList<Gene>> geneHash;
     
        /**
         * A simple test handler
         * 
         * @param args
         * @throws Exception
         */
        public static void main (String[] args) throws Exception
        {
            Scanner reader = new Scanner(System.in);
            boolean quit = false;
            ArrayList<GeneCounter> listOfGenes = new ArrayList<GeneCounter>();
            int size = 0;
            while (!quit)
            {
                System.out.println("Input next file to add to matrix, or quit: ");
                String input = reader.nextLine();
                if (input.equals("quit"))
                {
                    quit = true;
                }
                else
                {
                    listOfGenes.add(new GeneCounter(input));
                    listOfGenes.get(size).printOut();
                    size++;
                }
            }
        }
     
        /**
         * Builds a geneCounter for a file
         * 
         * @param fileName
         */
        public GeneCounter (String fileName) throws Exception
        {
            this.fileName = fileName;
     
            Scanner file = new Scanner(new File(fileName));
            // skip the three header lines (Name, FileName, bp) and the "Genes:" label
            file.nextLine();
            file.nextLine();
            file.nextLine();
            file.next();
            // determine a good hash table size
            int numGenes = 0;
            while (file.hasNext())
            {
                numGenes++;
                file.nextLine();
            }
            file.close();
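            // addHash is called with a table size of numGenes / 10 + 1,
            // so only that many buckets are ever used
            geneHash = new ArrayList<ArrayList<Gene>>();
            for (int i = 0; i < numGenes / 10 + 1; i++)
            {
                geneHash.add(new ArrayList<Gene>());
            }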
            // read in genes and hash them
            file = new Scanner(new File(fileName));
            // header
            file.nextLine();
            file.nextLine();
            file.nextLine();
            file.next();
            while (file.hasNext())
            {
                addHash(file.next(), numGenes / 10 + 1);
                // skip the rest of the line (the location range), if there is any
                if (file.hasNextLine())
                {
                    file.nextLine();
                }
            }
            file.close();
        }
     
        /**
         * Prints out a simple statistics list for this gene counter
         */
        public void printOut ()
        {
            for (int i = 0; i < geneHash.size(); i++)
            {
                for (int j = 0; j < geneHash.get(i).size(); j++)
                {
                    System.out.println(geneHash.get(i).get(j).name + " occurred " + geneHash.get(i).get(j).occurences + " time(s)");
                }
            }
        }
     
        /**
         * Adds a gene to the hash table. A duplicate just increases that gene's occurrence count.
         * 
         * @param name the gene name to add
         * @param hashSize the number of buckets to hash into
         */
        public void addHash (String name, int hashSize)
        {
            int index = hash(new Gene(name), hashSize);
            for (int i = 0; i < geneHash.get(index).size(); i++)
            {
                // try to find the match
                if (geneHash.get(index).get(i).name.equals(name))
                {
                    geneHash.get(index).get(i).increaseOccurance();
                    return;
                }
            }
            // didn't find it, add
            geneHash.get(index).add(new Gene(name));
        }
     
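        /**
         * Hashes a gene and returns its bucket index.
         * 
         * @param element the gene to hash
         * @param tableSize the number of buckets
         * @return an index in the range 0 to tableSize - 1
         */
        public int hash (Gene element, int tableSize)
        {
            int hash = 0;
            for (int i = 0; i < element.name.length(); i++)
            {
                hash = ((int) element.name.charAt(i)) + hash * 2;
            }
            // floorMod keeps the index non-negative even if hash overflowed to a negative value
            return Math.floorMod(hash, tableSize);
        }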
     
        /**
         * Simple gene representation class
         * 
         * @author Andrew
         * 
         */
        private static class Gene
        {
            String name;
            int    occurences;
     
            public Gene (String name)
            {
                this.name = name;
                this.occurences = 1;
            }
     
            public void increaseOccurance ()
            {
                occurences++;
            }
        }
    }
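
The class above prints counts for one file at a time. For the matrix described in the question (every gene down the left, every FileName across the top), here is a rough sketch that uses java.util maps instead of a hand-rolled table. The class name GeneMatrix and its methods are made up for illustration. It assumes every file follows the sample layout: a "FileName:" line, then a "Genes:" line carrying the first gene, then one gene per line with the gene name as the first whitespace-separated token.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; assumes the file layout shown in the question.
final class GeneMatrix {
    // gene name -> (FileName -> occurrences); TreeMap keeps the rows sorted by gene name
    private final Map<String, Map<String, Integer>> counts = new TreeMap<>();
    // column labels, in the order the files were added
    private final List<String> columns = new ArrayList<>();

    /** Adds the lines of one file to the matrix. */
    void addFile(List<String> lines) {
        String column = null;
        boolean inGenes = false;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.startsWith("FileName:")) {
                column = line.substring("FileName:".length()).trim();
                columns.add(column);
            } else if (line.startsWith("Genes:")) {
                inGenes = true;
                // the first gene shares a line with the "Genes:" label
                line = line.substring("Genes:".length()).trim();
            }
            if (inGenes && column != null && !line.isEmpty()) {
                String gene = line.split("\\s+")[0]; // first token is the gene name
                counts.computeIfAbsent(gene, g -> new TreeMap<>())
                      .merge(column, 1, Integer::sum);
            }
        }
    }

    /** Returns the count for one cell, or 0 if the gene never appears in that file. */
    int count(String gene, String column) {
        return counts.getOrDefault(gene, Collections.<String, Integer>emptyMap())
                     .getOrDefault(column, 0);
    }

    /** Renders the matrix as tab-separated text with a header row. */
    String toText() {
        StringBuilder sb = new StringBuilder("gene");
        for (String c : columns) {
            sb.append('\t').append(c);
        }
        sb.append('\n');
        for (String gene : counts.keySet()) {
            sb.append(gene);
            for (String c : columns) {
                sb.append('\t').append(count(gene, c));
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        GeneMatrix matrix = new GeneMatrix();
        for (String path : args) { // pass the .txt files as command-line arguments
            matrix.addFile(Files.readAllLines(Paths.get(path)));
        }
        System.out.print(matrix.toText());
    }
}
```

Keeping the counts in nested maps means a gene that is missing from a file needs no placeholder entry; the zero is filled in only when a cell is read. The tab-separated output should open directly in a spreadsheet if that is where the matrix is headed.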
