Hello, I have a simple program that seems to take a lot of memory, about 700MB. After the program starts, its memory usage grows slowly until it reaches 700-1000MB, and then it stops growing. What my program does is open a webpage, analyze its data, extract certain strings from its text and store them in a file. It stores each string on a separate line, and the biggest of those files is about 4MB. I temporarily store these strings in a set, manipulate it, compare it with other sets, read the strings from files and write them to files. The whole code is too big to share, so I will just post some important parts of it and hopefully you can figure out where the problem is. Alternatively, if there is a way for me to monitor which function or part of the program is taking all this space, that would be fine too.
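For the monitoring part, one rough way to see how much heap is in use at a given point is to sample `Runtime` before and after a step. This is only a sketch: the class name `HeapUsage`, the helper `usedHeapBytes` and the sample loop are made up for illustration and are not part of the program described here. The figure includes garbage that has not been collected yet, so it is an upper bound on live data rather than an exact measurement.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of the original program.
class HeapUsage {
    /** Approximate number of heap bytes currently in use by this JVM. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        // totalMemory() is the heap currently reserved by the JVM,
        // freeMemory() is the unused part of that reservation.
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedHeapBytes();

        // Stand-in for one step of the real program (e.g. filling a set).
        Set<String> sample = new HashSet<>();
        for (int i = 0; i < 100_000; i++) {
            sample.add("string-" + i);
        }

        long after = usedHeapBytes();
        // The difference can even be negative if a garbage collection
        // happened in between, so treat it as a rough indicator only.
        System.out.println("entries: " + sample.size());
        System.out.println("used before: " + before / 1024 + " KB");
        System.out.println("used after:  " + after / 1024 + " KB");
    }
}
```

Calling something like `usedHeapBytes()` around `readPage` and around the file-loading code might show which step the growth follows.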
This is how I read strings from a file:
    // fill positives
    Set<String> positives = new HashSet<String>();
    try {
        Scanner scanner = new Scanner(new File("positives.txt"));
        while (scanner.hasNextLine()) {
            String line = scanner.nextLine();
            positives.add(line);
        }
        scanner.close();
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }
This is how I write strings from a set to a file:
    // write positives to file
    f1 = new FileWriter("positives.txt");
    newLine = System.getProperty("line.separator");
    for (String s : positives) {
        f1.write(s + newLine);
    }
    f1.close();
This is how I remove from one set the strings that are also in another set. Keep in mind that there is never more than 4MB of strings in any of these sets:
    // remove from candidates all processed
    candidates.removeAll(checkedTemp);
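To make the effect of that call concrete, here is a tiny self-contained sketch with made-up contents. The class name `SetDifferenceDemo`, the helper `removeProcessed` and the sample values are assumptions for illustration; only `removeAll` and the two set names come from the snippet above.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not part of the original program.
class SetDifferenceDemo {
    /** Removes every processed string from candidates, in place, and returns candidates. */
    static Set<String> removeProcessed(Set<String> candidates, Set<String> processed) {
        candidates.removeAll(processed);
        return candidates;
    }

    public static void main(String[] args) {
        Set<String> candidates = new HashSet<>(Arrays.asList("a", "b", "c"));
        Set<String> checkedTemp = new HashSet<>(Arrays.asList("b", "x"));

        removeProcessed(candidates, checkedTemp);

        // "b" was in both sets and is gone; "x" was never a candidate.
        System.out.println(candidates.size()); // prints 2
    }
}
```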
This is how I get strings from a site:
    public static void readPage(String url, int pageNumber) {
        String temp;
        try {
            URL oracle;
            oracle = new URL(url + String.valueOf(pageNumber));
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(oracle.openStream()));
            String inputLine;
            int line = 0;
            while ((inputLine = in.readLine()) != null) {
                temp = extractStrings(inputLine);
                if (temp != null) {
                    candidates.add(temp);
                    candidatesCounter++;
                }
                line++;
            }
            in.close();
        } catch (Exception e) {
            System.out.print("error: ");
            System.out.println(e);
        }
    }
This is how I process the webpage. It reads about 40 pages full of text and uses the StringUtils library to find strings that are in a certain place (an example would be: find all strings that start with /ri and end with .jpg).
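As an illustration of the kind of extraction described, here is a minimal sketch that uses `java.util.regex` from the standard library as a stand-in for StringUtils, so that it runs without extra dependencies. The class name `PathExtractor`, the method `extractAll` and the sample input are hypothetical; this is not the actual `extractStrings` method from the program.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the real extraction code.
class PathExtractor {
    // Shortest run of characters that starts with "/ri" and ends with ".jpg".
    private static final Pattern RI_JPG = Pattern.compile("/ri.*?\\.jpg");

    /** Returns every substring of line that starts with /ri and ends with .jpg. */
    static List<String> extractAll(String line) {
        List<String> found = new ArrayList<>();
        Matcher m = RI_JPG.matcher(line);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String line = "<img src=\"/ri/a.jpg\"> text <img src=\"/ri/b/c.jpg\">";
        for (String s : extractAll(line)) {
            System.out.println(s);
        }
    }
}
```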