Thread: Scraping data from the web

  1. #1
    Junior Member
    Join Date
    Aug 2019
    Posts
    6
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Question Scraping data from the web

    Hi, thanks in advance to whoever can help me.

    I am building a Java app with Eclipse that will return data to the user depending on what the user types in a text field. The interface is almost done; now I am trying to find the best way to do this.

    I understand that an HTML form is one way to do this: depending on what the user inputs in the field, the script would go to the chosen site, enter that input in the form and submit it. Once the result page is found, I could scrape certain data with Jsoup.

    I am uncertain how to store what the user has input and then send it to the site once either "Enter" or a button has been pressed. Once either of those actions has happened, I need the script to enter the input in a form on the site.

    Any help would be greatly appreciated as I am new to Java.

    Greetings
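One way to handle the "Enter or button" part of the question is to attach the same listener to both the text field and the button. The following is a minimal Swing sketch, not a finished design: the class name `TickerForm`, the method names, and the button label are made up for illustration, and the base URL is the one used in the code posted further down this thread.

```java
import java.awt.event.ActionListener;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;
import javax.swing.JButton;
import javax.swing.JFrame;
import javax.swing.JPanel;
import javax.swing.JTextField;
import javax.swing.SwingUtilities;

public class TickerForm {
    // Base URL as it appears in the code posted later in this thread.
    static final String BASE_URL = "https://finviz.com/quote.ashx?t=";

    /** Trims the raw field text and appends it, URL-encoded, to the base URL. */
    public static String buildUrl(String rawInput) {
        String ticker = rawInput.trim();
        return BASE_URL + URLEncoder.encode(ticker, StandardCharsets.UTF_8);
    }

    /** Builds a panel whose text field (Enter) and button (click) share one listener. */
    public static JPanel createPanel(Consumer<String> onSubmit) {
        JTextField tickerField = new JTextField(10);
        JButton fetchButton = new JButton("Fetch");

        // The same listener runs for Enter in the field and for a button click.
        ActionListener submit = event -> {
            String text = tickerField.getText();
            if (!text.isBlank()) {
                onSubmit.accept(buildUrl(text));
            }
        };
        tickerField.addActionListener(submit);
        fetchButton.addActionListener(submit);

        JPanel panel = new JPanel();
        panel.add(tickerField);
        panel.add(fetchButton);
        return panel;
    }

    public static void main(String[] args) {
        SwingUtilities.invokeLater(() -> {
            JFrame frame = new JFrame("Ticker lookup");
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            // Placeholder action: a real app would start the download here.
            frame.add(createPanel(url -> System.out.println("Would fetch: " + url)));
            frame.pack();
            frame.setVisible(true);
        });
    }
}
```

The field's text is read only when the listener fires, so nothing has to be stored ahead of time. The callback receives the finished URL and can hand it to whatever code does the fetching.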

  2. #2
    Super Moderator Norm's Avatar
    Join Date
    May 2010
    Location
    Eastern Florida
    Posts
    25,165
    Thanks
    65
    Thanked 2,725 Times in 2,675 Posts

    Default Re: Scraping data from the web

    Your program appears to be a replacement for a browser program - it gets URL from user, submits it to site, user fills in returned form and submits, returned page is displayed.
    I don't know of any easy way for your program to "fill in the form" and submit it. Some websites will prevent your program from working.
    If you don't understand my answer, don't ignore it, ask a question.

  3. #3
    Junior Member
    Join Date
    Aug 2019
    Posts
    6
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Default Re: Scraping data from the web

    Thanks for the reply.

    It wouldn't be a browser replacement. The user would type in a stock ticker symbol; the program would submit it to one or two websites, collect certain data, and return it to the GUI window.

  4. #4
    Super Moderator Norm's Avatar
    Join Date
    May 2010
    Location
    Eastern Florida
    Posts
    25,165
    Thanks
    65
    Thanked 2,725 Times in 2,675 Posts

    Default Re: Scraping data from the web

    Isn't that what a browser does? Get URL from user, get page from server using that URL, get user input on that page, submit that page to server, get data from server and display to user.

    What will your program do that is different?

    Do the websites have a programming API that your program can communicate with? Or will the communications be via existing HTML?

  5. #5
    Junior Member
    Join Date
    Aug 2019
    Posts
    6
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Default Re: Scraping data from the web

    I see what you're saying. Perhaps I didn't explain it well.

    The goal of the application is to allow the user to pull 3 data points from 2 different sites simultaneously. The data will be displayed side by side and the user will be able to compare those numbers at a glance.

    Instead of having to open a website, do the research, take notes and compare, I want to automate this process so that the user can have a little window, input the stock symbol and fetch that data automatically. This is all to save time and effort for traders.

    The websites are yahoo finance and finviz.
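Fetching from two sites "simultaneously" and showing the results side by side could be sketched with `CompletableFuture`. This is only an illustration under assumptions: the class `SideBySide`, the method `fetchBoth`, and the two stand-in fetcher functions are hypothetical, and the real fetchers would have to download and parse each page.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

public class SideBySide {
    /** Runs both lookups concurrently and joins the two results into one line. */
    public static String fetchBoth(String ticker,
                                   Function<String, String> firstSite,
                                   Function<String, String> secondSite) {
        CompletableFuture<String> first =
                CompletableFuture.supplyAsync(() -> firstSite.apply(ticker));
        CompletableFuture<String> second =
                CompletableFuture.supplyAsync(() -> secondSite.apply(ticker));
        // thenCombine waits for both futures before building the combined text.
        return first.thenCombine(second, (a, b) -> a + " | " + b).join();
    }

    public static void main(String[] args) {
        // Stand-in fetchers; real ones would download and parse a page.
        String line = fetchBoth("nflx",
                t -> "site A value for " + t,
                t -> "site B value for " + t);
        System.out.println(line);
    }
}
```

In a GUI program the blocking `join()` call should not run on the event dispatch thread; one option is to update the window from a `thenAccept` callback instead.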

  6. #6
    Super Moderator Norm's Avatar
    Join Date
    May 2010
    Location
    Eastern Florida
    Posts
    25,165
    Thanks
    65
    Thanked 2,725 Times in 2,675 Posts

    Default Re: Scraping data from the web

    "pull 3 data points from 2 different sites simultaneously"
    Do those sites have APIs that you can write code to interface with?
    Or are you trying to replicate what a browser does?

  7. #7
    Junior Member
    Join Date
    Aug 2019
    Posts
    6
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Default Re: Scraping data from the web

    I believe Yahoo Finance has shut down its API, which is why I want to replicate what the browser does.

  8. #8
    Super Moderator Norm's Avatar
    Join Date
    May 2010
    Location
    Eastern Florida
    Posts
    25,165
    Thanks
    65
    Thanked 2,725 Times in 2,675 Posts

    Default Re: Scraping data from the web

    "I want to replicate what the browser does."
    That can be very hard and may not be allowed by the server you are connecting to.
    Do you know HTML? Have you looked at what the server at those sites returns for an HTTP GET request?
    There will probably be lots of javascript methods that build the text to be displayed.
    Then the code needs to build and return to the server the contents of the form's data in maybe a POST request.
    If that works, then you'll need to parse what is returned.
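The GET-then-POST sequence described above can be sketched with the JDK's own `java.net.http` classes (Java 11 or later). Everything specific here is an assumption for illustration: the class `FormRequests`, the URL `https://example.com/search`, and the field names `symbol` and `range` do not come from either site.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

public class FormRequests {
    /** Encodes name/value pairs as application/x-www-form-urlencoded text. */
    public static String encodeForm(Map<String, String> fields) {
        return fields.entrySet().stream()
                .map(e -> URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                        + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
                .collect(Collectors.joining("&"));
    }

    /** Builds a plain GET request for the page that contains the form. */
    public static HttpRequest buildGet(String url) {
        return HttpRequest.newBuilder(URI.create(url)).GET().build();
    }

    /** Builds a POST request that carries the form fields in its body. */
    public static HttpRequest buildPost(String url, Map<String, String> fields) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(encodeForm(fields)))
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical form fields; a real form defines its own names.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("symbol", "nflx");
        fields.put("range", "1 year");

        HttpRequest post = buildPost("https://example.com/search", fields);
        System.out.println(post.method() + " " + post.uri());
        System.out.println(encodeForm(fields));
        // Sending would use HttpClient.newHttpClient().send(post, HttpResponse.BodyHandlers.ofString()).
    }
}
```

The sketch only builds the requests. Whether a given site accepts them, and whether its terms allow automated access, has to be checked separately.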

  9. #9
    Junior Member
    Join Date
    Aug 2019
    Posts
    6
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Default Re: Scraping data from the web

    I'm very new to Java in general, so any help will be greatly appreciated.

    I'm trying to scrape certain values from a site using Jsoup and I have had success to a certain extent. I am able to scrape a whole row, but I can't seem to figure out how to target one specific cell.

    Here is the url: https://finviz.com/quote.ashx?t=nflx
    Here is a snapshot of the cell I want, the value.

    https://ibb.co/2ZKph4C

    Here are the values returned from my code, which is the whole row.
    https://ibb.co/BjK8v8X

    Here is the code:
    https://ibb.co/X5swBVZ

    Lastly, here is the inspection of the element in my web browser. The value seems to be contained in <b> </b>. How do I access it?

    https://ibb.co/xFFNTtw
    Thank you
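One possible way to target a single cell is to look for the cell that holds the label and then read its next sibling. Jsoup's `text()` returns the text of nested elements too, so a value wrapped in `<b>` needs no special handling. The sketch below parses a small inline table rather than the live page; the markup is simplified, and the label texts `Shs Outstand` and `Shs Float` are assumptions about the page, as are the class and method names. It needs the Jsoup library on the classpath.

```java
import java.util.Optional;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class CellByLabel {
    // Simplified stand-in for the real page; labels are assumed, not verified.
    public static final String SAMPLE_HTML =
            "<table class=\"snapshot-table2\">"
            + "<tr><td>Shs Outstand</td><td class=\"snapshot-td2\"><b>437.59M</b></td></tr>"
            + "<tr><td>Shs Float</td><td class=\"snapshot-td2\"><b>430.62M</b></td></tr>"
            + "</table>";

    /** Finds the cell whose text equals label and returns the text of the cell after it. */
    public static Optional<String> valueFor(Document doc, String label) {
        for (Element cell : doc.select("td")) {
            if (cell.text().trim().equals(label)) {
                Element next = cell.nextElementSibling();
                // text() includes the text inside the nested <b> element.
                return next == null ? Optional.empty() : Optional.of(next.text());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(SAMPLE_HTML);
        System.out.println(valueFor(doc, "Shs Float").orElse("not found"));
    }
}
```

Looking a value up by its label tends to survive layout changes better than counting rows or columns, since it does not depend on the cell's position.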

  10. #10
    Super Moderator Norm's Avatar
    Join Date
    May 2010
    Location
    Eastern Florida
    Posts
    25,165
    Thanks
    65
    Thanked 2,725 Times in 2,675 Posts

    Default Re: Scraping data from the web

    Please copy and paste the text and code here, not as images.

  11. #11
    Junior Member
    Join Date
    Aug 2019
    Posts
    6
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Default Re: Scraping data from the web

    sorry, here it is

    [code]
    import java.io.IOException;
    import java.util.Scanner;
    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;

    public class WebScrape {

    	public static void main(String[] args) {

    		// User Input Scanner
    		Scanner scanner = new Scanner(System.in);

    		// Ask for ticker + read user input
    		System.out.println("Ticker: ");
    		String userInput = scanner.next();

    		// Main url
    		final String url = "https://finviz.com/quote.ashx?t=" + userInput;

    		// Get data
    		try {
    			final Document document = Jsoup.connect(url).get();
    			for (Element row : document.select("table.snapshot-table2 tr")) {
    				// Read the 10th cell of the row once and skip rows where it is empty
    				final String data = row.select("td.snapshot-td2:nth-of-type(10)").text();
    				if (data.isEmpty()) {
    					continue;
    				}
    				System.out.println(data);
    			}
    		} catch (IOException ex) {
    			// Jsoup.connect(...).get() throws IOException on network or HTTP errors
    			ex.printStackTrace();
    		}
    	}
    }
    [/code]

    The program runs fine and extracts the whole column from the table, like this:

    Ticker:
    nflx
    437.59M
    430.62M
    5.12%
    3.18
    388.40
    231.23 - 386.80
    -24.65%
    26.04%
    34.42
    0.86
    6.94M
    5,939,911


    What I am trying to do is extract the second value from what the program returns. In other words, instead of the whole row or column, return a single value.
    Last edited by AD2; August 24th, 2019 at 07:36 AM.
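If the goal is only "the second value from what the program returns", one simple approach is to collect the non-empty cell texts into a list instead of printing them, then read the entry at index 1. The sketch below uses plain strings as a stand-in for the texts the Jsoup loop reads, so it runs without the library; the class and method names are made up for illustration, and the sample values are the first few lines of the output shown above.

```java
import java.util.ArrayList;
import java.util.List;

public class PickOne {
    /** Keeps only the non-empty cell texts, in the order they were read. */
    public static List<String> nonEmpty(List<String> cellTexts) {
        List<String> values = new ArrayList<>();
        for (String text : cellTexts) {
            if (!text.isEmpty()) {
                values.add(text);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        // Stand-in for the texts the Jsoup loop reads from each row.
        List<String> column = nonEmpty(List.of("437.59M", "430.62M", "", "5.12%", "3.18"));
        // Lists are zero-based, so the second value is at index 1.
        System.out.println(column.get(1)); // prints 430.62M
    }
}
```

Picking by position breaks if the site adds or reorders rows, so it is worth checking the list size before calling `get`.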

  12. #12
    Super Moderator Norm's Avatar
    Join Date
    May 2010
    Location
    Eastern Florida
    Posts
    25,165
    Thanks
    65
    Thanked 2,725 Times in 2,675 Posts

    Default Re: Scraping data from the web

    Please edit your post and wrap your code with code tags:

    [code]
    **YOUR CODE GOES HERE**
    [/code]

    to get highlighting and preserve formatting.

    Can you also post the data that is the problem, plus something that shows what the program does with that data and where the results are bad?

    --- Update ---

    "extract the second value from what the program returns. Or instead of the whole row or column, return a single value."
    Where is the second row read?
    How would the program identify the single value to return?


    There are missing packages when I try to compile the code.

  13. #13
    Junior Member
    Join Date
    Dec 2019
    Posts
    10
    Thanks
    2
    Thanked 0 Times in 0 Posts

    Default Re: Scraping data from the web

    Hello,

    As far as I understand, you want to read data from a website/page.
    I had the same question. I have a web server running on an ESP32 with a BME280 sensor, which gives temperature, humidity and air pressure.
    I read the page with the help of the OkHttp library; an example can be found on the internet: https://www.bing.com/videos/search?q...6FORM%3DHDRSC3

    With string manipulation and unique letter combinations I was able to extract the temperature, humidity and air pressure data.
    [code]
    package com.example.okhttprequestexample;

    import androidx.appcompat.app.AppCompatActivity;
    import android.os.Bundle;
    import android.widget.TextView;
    import java.io.IOException;
    import okhttp3.Call;
    import okhttp3.Callback;
    import okhttp3.OkHttpClient;
    import okhttp3.Request;
    import okhttp3.Response;

    public class MainActivity extends AppCompatActivity {

        private TextView mTextViewResult;

        @Override
        protected void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.activity_main);

            mTextViewResult = findViewById(R.id.text_view_tempb);

            OkHttpClient client = new OkHttpClient();

            // Address of the ESP32 web server on the local network
            String url = "http://192.168.2.11";

            Request request = new Request.Builder()
                    .url(url)
                    .build();

            // enqueue() runs the request on a background thread
            client.newCall(request).enqueue(new Callback() {
                @Override
                public void onFailure(Call call, IOException e) {
                    e.printStackTrace();
                }

                @Override
                public void onResponse(Call call, Response response) throws IOException {
                    if (response.isSuccessful()) {
                        final String myResponse = response.body().string();

                        // Views may only be updated on the UI thread
                        MainActivity.this.runOnUiThread(new Runnable() {
                            @Override
                            public void run() {
                                // The value sits at a fixed character distance before its label
                                int index_tempb = myResponse.indexOf("Temperature");
                                // mTextViewResult.setText(myResponse);
                                String tempb = myResponse.substring(index_tempb - 102, index_tempb - 97);
                                System.out.println(tempb);
                                mTextViewResult.setText(tempb);

                                int index_tempi = myResponse.indexOf("Humidity");
                                String tempi = myResponse.substring(index_tempi - 98, index_tempi - 93);
                                // mTextViewResult1.setText(tempi);
                                // Note: this replaces the temperature shown in the same TextView
                                mTextViewResult.setText(tempi);
                            }
                        });
                    }
                }
            });
        }
    }
    [/code]
    Good luck,

    ilioSS
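The fixed offsets in the code above (`index_tempb-102` and so on) only work while the page layout stays exactly the same. A somewhat more forgiving variant of the same string-manipulation idea is to find the label and then take the last number that appears before it. The sketch below assumes, as in the post, that the value comes before its label; the class `SensorPage`, the method name, and the sample page text are hypothetical, since the real ESP32 page is not shown in the thread.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SensorPage {
    // Matches integers and decimals such as 21, -3.5 or 48.20.
    private static final Pattern NUMBER = Pattern.compile("-?\\d+(?:\\.\\d+)?");

    /** Returns the last number that appears before the first occurrence of label. */
    public static Optional<String> lastNumberBefore(String page, String label) {
        int labelIndex = page.indexOf(label);
        if (labelIndex < 0) {
            return Optional.empty(); // label is not on the page
        }
        Matcher matcher = NUMBER.matcher(page.substring(0, labelIndex));
        String last = null;
        while (matcher.find()) {
            last = matcher.group();
        }
        return Optional.ofNullable(last);
    }

    public static void main(String[] args) {
        // Hypothetical page text with made-up readings.
        String page = "<p><span>21.37</span> C <i>Temperature</i></p>"
                + "<p><span>48.20</span> % <i>Humidity</i></p>";
        System.out.println(lastNumberBefore(page, "Temperature").orElse("n/a"));
        System.out.println(lastNumberBefore(page, "Humidity").orElse("n/a"));
    }
}
```

This also avoids the `StringIndexOutOfBoundsException` that `substring` throws when `indexOf` returns -1. Digits inside tag attributes between the value and the label would still confuse it, so an HTML parser remains the safer choice for complex pages.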
