I would like to know how to write a program that removes stop words and performs stemming on a given input file, e.g. the minutes of a meeting. I'm new to Java, so I don't have any ideas on how to program this in Java.
"ideas on how to program in java"
What parts of your program design are you having problems coding in Java?
Is your problem with designing a computer program, or with coding a program design in Java?
If you don't understand my answer, don't ignore it, ask a question.
"input file for reading"
One of the easiest classes to use for reading a file is the Scanner class. It has many useful methods that let you read a file in different ways.
"how to find the stop words and stem"
The String class has many methods for searching a String.
"i'm having trouble with coding."
Can you describe in more detail what you are trying to code?
Also an example might help. Post some text that contains what you are trying to find and describe what in that text you want to find.
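As a rough illustration of the Scanner suggestion above, here is a minimal sketch that reads one text file word by word. The class name FileWords, the method readWords and the file name minutes.txt are made up for this example and do not come from the original poster's project.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class FileWords {
    // Reads every whitespace-separated token from one text file.
    static List<String> readWords(File file) throws FileNotFoundException {
        List<String> words = new ArrayList<>();
        // try-with-resources closes the Scanner (and the file) automatically
        try (Scanner in = new Scanner(file)) {
            while (in.hasNext()) {
                words.add(in.next());
            }
        }
        return words;
    }

    public static void main(String[] args) throws FileNotFoundException {
        // "minutes.txt" is a placeholder; pass a real path as the first argument
        File file = new File(args.length > 0 ? args[0] : "minutes.txt");
        for (String word : readWords(file)) {
            System.out.println(word);
        }
    }
}
```

Scanner's next() splits on whitespace only, so punctuation stays attached to the words. Stripping it (for example with String.split or String.replaceAll) would be a separate step.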
I'm doing a project in opinion mining. The basic idea is to identify human behavior based on the interactions in a meeting, e.g. proposing an idea, commenting, acknowledging. I plan to give the minutes of a few meetings as input, and the first module has to preprocess the data using stop word removal, stemming and POS tagging. Say, for example, "This is a short sentence":
This DT
is VBZ
short JJ
sentence NN.
I want to use POS tagging mainly to identify the names of people in the meetings. I hope you get an idea of what I'm trying to explain.
Is this what you are trying to do: take a sentence of words, separate the words, and assign a tag to each word?
In your example, the tags were the UPPERCASE letters after each word. DT is the tag for the word "This":
"This is a short sentence"
This DT
is VBZ
short JJ
sentence NN.
Where do the tags come from?
Can you tell me what code to write in order to read data from a folder?
See the Scanner class for methods that can be used to read data from a single file.
If you want to read all the files in a folder, you need to use the File class to get a list of the files in the folder and then use the Scanner class to read each file in the list.
This is the code I have for stop word removal. Can you help me read the data from files in a folder instead of giving it directly in the program?
package split;

import java.io.*;
import java.util.*;

public class Split {
    // Stop words loaded from stopwords.txt, one word per line
    static Set<String> stopwords = new HashSet<String>();

    public static void main(String[] args) {
        for (String word : Split.words("The rain in spain falls mainly on the plains, except when it's not exactly working that way! And I need+some= way. How~ will \"(this)\" \"work\"?"))
            System.out.println(word);
    }

    // Loads the stop-word list; the relative path is resolved against the working directory
    public static void addStopwords() {
        try (BufferedReader br = new BufferedReader(new FileReader("stopwords.txt"))) {
            String line;
            // readLine() returns null at end of file, which is more reliable than ready()
            while ((line = br.readLine()) != null) {
                stopwords.add(line);
            }
        } catch (IOException e) {
            System.out.println(e);
        }
    }

    // Splits a line into lower-cased words, drops stop words, and stems the rest
    public static ArrayList<String> words(String line) {
        if (stopwords.isEmpty()) addStopwords();
        ArrayList<String> result = new ArrayList<String>();
        String[] words = line.split("[ \t\n,\\.\"!?$~()\\[\\]\\{\\}:;/\\\\<>+=%*]");
        for (int i = 0; i < words.length; i++) {
            if (!words[i].isEmpty()) {
                String word = words[i].toLowerCase();
                if (!stopwords.contains(word)) {
                    // Stemmer is a separate class not shown in this post; stem() is assumed to return a String
                    result.add(Stemmer.stem(word));
                }
            }
        }
        return result;
    }
}
Last edited by jps; September 3rd, 2013 at 12:28 AM. Reason: code tags
"can you help me to read the data from files in a folder"
I see that the code is using the BufferedReader class's methods to read a file.
What problems are you having reading the data?
Please edit your post and wrap the code with code tags. Be sure the code is properly formatted. Nested statements should be indented; statements should not all start in the first column.
Sir, what I want is to read information from different word documents which are stored individually in a folder, e.g. the folder XYZ will have 4 word documents: a.txt, b.txt, c.txt, d.txt. Can you please help me write code so I can read from this folder?
Did you miss post#8?
The File class has methods that return a list of the files in a folder. You can use that list to read the files in the folder one at a time.
The steps are:
get a list of the files in the folder
begin loop
get next file in the list
read data from that file
end loop
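The steps above could be sketched roughly as follows, using File.listFiles() to get the list and a Scanner to read each file. The class name FolderReader, the method readLines, the ".txt" filter and the default folder name XYZ (taken from the question) are assumptions for illustration only.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

class FolderReader {
    // Reads every line of every .txt file directly inside the given folder.
    static List<String> readLines(File folder) throws FileNotFoundException {
        List<String> lines = new ArrayList<>();
        File[] files = folder.listFiles();   // get a list of the files in the folder
        if (files == null) {                 // null means: not a folder, or it could not be read
            return lines;
        }
        Arrays.sort(files);                  // listFiles() gives no order guarantee, so sort by path
        for (File file : files) {            // begin loop: get next file in the list
            if (!file.isFile() || !file.getName().endsWith(".txt")) {
                continue;                    // skip sub-folders and non-text files
            }
            try (Scanner in = new Scanner(file)) {   // read data from that file
                while (in.hasNextLine()) {
                    lines.add(in.nextLine());
                }
            }
        }                                    // end loop
        return lines;
    }

    public static void main(String[] args) throws FileNotFoundException {
        // "XYZ" is the example folder name from the question; pass another path as the first argument
        File folder = new File(args.length > 0 ? args[0] : "XYZ");
        for (String line : readLines(folder)) {
            System.out.println(line);
        }
    }
}
```

Each line returned here could then be passed to a method such as Split.words(line) from the earlier post in place of the hard-coded sentence.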
OK sir. Can you code it in Java?
I'm also getting an exception on this line: BufferedReader br = new BufferedReader(new FileReader("stopwords.txt"))
It is: java.io.FileNotFoundException: stopwords.txt (The system cannot find the file specified). I'm using NetBeans; where do I save the txt file?
"The system cannot find the file"
The program cannot find the file in the location where it is looking. To find out where that is, create a File object for the file you are trying to read and print that File object's absolute path, which will show you where the program is looking for the file.
I don't know what your IDE does with the location of files when a program tries to read a file.
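A small sketch of that check, assuming the same relative file name stopwords.txt as in the posted code. The class name WhereIsMyFile and the helper resolve are invented for this example.

```java
import java.io.File;

class WhereIsMyFile {
    // Returns the full path that a relative file name resolves to.
    static String resolve(String name) {
        return new File(name).getAbsolutePath();
    }

    public static void main(String[] args) {
        File f = new File("stopwords.txt");
        // Relative paths are resolved against the JVM's working directory (the "user.dir" property)
        System.out.println("Looking for: " + f.getAbsolutePath());
        System.out.println("Exists: " + f.exists());
        System.out.println("Working directory: " + System.getProperty("user.dir"));
    }
}
```

Running this from inside the IDE shows the folder the program actually searches. Saving stopwords.txt in that folder, or passing a full path to FileReader, should make the FileNotFoundException go away.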
I do not know how to code in Java. Can someone help me read data from a folder instead of from this sentence:
"for(String word : Split.words("The rain in spain falls mainly on the plains, except when it's not exactly working that way! And I need+some= way. How~ will \"(this)\" \"work\"?"))
System.out.println(word);"
See the Scanner class for methods to read data from a file. There are many examples of code using the Scanner class's methods here on the forum. Do a Search.
What have you tried? What problems are you having with it?