Thread: Website content extractor

  1. #1
    Junior Member
    Join Date
    May 2012
    Posts
    3
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Website content extractor

    I have been trying to get a running program to display text from a site that is mainly text, in this case hackaday.com. I just can't find the right methods. Can anyone get it working AND explain it to me?

    import javax.swing.JLabel;
    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;
    import javax.swing.JFrame;

    public class SeeWebsite
    {
        public static void main(String[] argv) throws Exception
        {
            URL url = new URL("http://www.hackaday.com");
            // Only prints the URL as a string; nothing is requested from the server.
            System.out.println(url.toExternalForm());
        }
    }


  2. #2
    Norm (Super Moderator)
    Join Date
    May 2010
    Location
    Eastern Florida
    Posts
    25,139
    Thanks
    65
    Thanked 2,720 Times in 2,670 Posts

    Re: Website content extractor

    What is printed when you execute your program?
    Why do you expect your code to get the contents of the HTML file from the server?
    If you don't understand my answer, don't ignore it, ask a question.

  3. #3
    Junior Member
    Join Date
    May 2012
    Posts
    3
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Website content extractor

    All it prints is "http://www.hackaday.com".

    I don't. Not yet. That's why I posted here. Also, I'd like to be able to view the text as the browser shows it, not as HTML.

  4. #4
    Norm (Super Moderator)
    Join Date
    May 2010
    Location
    Eastern Florida
    Posts
    25,139
    Thanks
    65
    Thanked 2,720 Times in 2,670 Posts

    Re: Website content extractor

    One way would be to use the HttpURLConnection class to connect to a server and read what the server returns. That would be the HTML.
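A minimal sketch of the HttpURLConnection approach described above. The class name PageFetcher and the helpers readAll and fetch are made up for illustration, and the response is assumed to be UTF-8 text; a real page may declare a different encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;

public class PageFetcher {

    // Reads everything from the stream into one String (assumes UTF-8 text).
    static String readAll(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        }
        return sb.toString();
    }

    // Opens an HTTP connection to the address and returns the response body.
    static String fetch(String address) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(address).toURL().openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(10_000); // milliseconds
        conn.setReadTimeout(10_000);    // milliseconds
        try {
            return readAll(conn.getInputStream());
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java PageFetcher <url>");
            return;
        }
        System.out.print(fetch(args[0]));
    }
}
```

Running `java PageFetcher https://hackaday.com` should print the page's raw HTML. HttpURLConnection does not follow redirects that switch between http and https, so an http address may return a short redirect response instead of the page.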
    If you want a browser-like display (simple HTML only), use the JEditorPane class.
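A rough sketch of the JEditorPane suggestion, assuming the HTML is already available as a String. HtmlViewer and buildPane are hypothetical names, and the sample markup is invented for the demo. JEditorPane also has a setPage method that loads a URL directly, but its HTML support is basic, so a modern site may render poorly.

```java
import java.awt.Dimension;
import java.awt.GraphicsEnvironment;
import javax.swing.JEditorPane;
import javax.swing.JFrame;
import javax.swing.JScrollPane;
import javax.swing.SwingUtilities;

public class HtmlViewer {

    // Builds a read-only pane that renders the given HTML string.
    static JEditorPane buildPane(String html) {
        JEditorPane pane = new JEditorPane("text/html", html);
        pane.setEditable(false);
        return pane;
    }

    public static void main(String[] args) {
        // A window needs a display; do nothing on a headless machine.
        if (GraphicsEnvironment.isHeadless()) {
            System.out.println("No display available; skipping the window.");
            return;
        }
        String html = "<html><body><h1>Hello</h1>"
                + "<p>Simple <b>HTML</b> only.</p></body></html>";
        // Swing components should be created on the event dispatch thread.
        SwingUtilities.invokeLater(() -> {
            JFrame frame = new JFrame("HtmlViewer");
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            JScrollPane scroll = new JScrollPane(buildPane(html));
            scroll.setPreferredSize(new Dimension(600, 400));
            frame.add(scroll);
            frame.pack();
            frame.setVisible(true);
        });
    }
}
```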

  5. #5
    Junior Member
    Join Date
    May 2012
    Posts
    3
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Re: Website content extractor

    Yes, but how would I convert the HTML to plain text, as it would be shown in a browser?

  6. #6
    Norm (Super Moderator)
    Join Date
    May 2010
    Location
    Eastern Florida
    Posts
    25,139
    Thanks
    65
    Thanked 2,720 Times in 2,670 Posts

    Re: Website content extractor

    Are you asking how to parse an HTML page and extract text into Strings? I think there are third-party packages that do some of that. Try asking Google.
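For simple pages, one possible approach that needs no third-party package is the HTML parser that ships with Swing. This is a sketch under that assumption, not what the posts above prescribe; HtmlToText and extractText are hypothetical names. The parser targets old HTML and can stumble on modern markup, which is where a third-party library such as jsoup may do better.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

public class HtmlToText {

    // Parses the HTML and returns only the text pieces, one per line.
    static String extractText(Reader html) throws IOException {
        StringBuilder text = new StringBuilder();
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleText(char[] data, int pos) {
                // Called for each run of text found between tags.
                text.append(data).append('\n');
            }
        };
        // true = ignore any charset declared inside the document.
        new ParserDelegator().parse(html, callback, true);
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        String html = "<html><body><h1>Title</h1>"
                + "<p>Some <b>bold</b> text.</p></body></html>";
        System.out.print(extractText(new StringReader(html)));
    }
}
```

Each text run lands on its own line, so inline tags such as `<b>` split a sentence across lines; joining the pieces back into paragraphs would take extra handling of the start and end tag callbacks.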
