@Norm - parsing HTML
Java's Standard Library has this class:
ParserDelegator (Java Platform SE 7)
It can do HTML parsing for you. However, most Java programmers I know tell me to steer clear of it and use one of the many third-party HTML parsers available: TagSoup, JTidy, NekoHTML, Jsoup, etc.
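For reference, here is a rough sketch of what the standard library route looks like. It assumes you only want the href of every link; the class name ParserDelegatorDemo and the method extractLinks are made up for the example. You hand ParserDelegator a callback and it calls you back as it meets each tag, so you never get a document tree to query.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.swing.text.MutableAttributeSet;
import javax.swing.text.html.HTML;
import javax.swing.text.html.HTMLEditorKit;
import javax.swing.text.html.parser.ParserDelegator;

// Hypothetical demo class, not part of any library.
public class ParserDelegatorDemo {

    // Collects the href attribute of every <a> start tag in the given HTML.
    static List<String> extractLinks(String html) throws IOException {
        final List<String> links = new ArrayList<String>();

        // The parser is event driven: it reports each start tag to this callback.
        HTMLEditorKit.ParserCallback callback = new HTMLEditorKit.ParserCallback() {
            @Override
            public void handleStartTag(HTML.Tag tag, MutableAttributeSet attrs, int pos) {
                if (tag == HTML.Tag.A) {
                    Object href = attrs.getAttribute(HTML.Attribute.HREF);
                    if (href != null) {
                        links.add(href.toString());
                    }
                }
            }
        };

        // The third argument tells the parser to ignore any charset declared in the document.
        new ParserDelegator().parse(new StringReader(html), callback, true);
        return links;
    }

    public static void main(String[] args) throws IOException {
        String html = "<html><body><a href=\"http://example.com/\">Example</a></body></html>";
        System.out.println(extractLinks(html));
    }
}
```

Having to write a callback class just to pull out a few attributes is a fair illustration of why people tend to reach for something else.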
And out of those, Jsoup seems to be the popular choice at the moment because of its flexible and easy-to-use API. You can traverse a document using the standard DOM model or use CSS (jQuery-like) selectors.
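To give a feel for the two styles, here is a minimal sketch that pulls link targets out of a page both ways. It assumes the Jsoup jar is on your classpath; the class name JsoupDemo and the two helper methods are just names I picked for the example.

```java
import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

// Hypothetical demo class; requires the third-party Jsoup library.
public class JsoupDemo {

    // CSS selector style: "a[href]" matches every <a> element that has an href attribute.
    static List<String> linksBySelector(String html) {
        Document doc = Jsoup.parse(html);
        List<String> hrefs = new ArrayList<String>();
        for (Element a : doc.select("a[href]")) {
            hrefs.add(a.attr("href"));
        }
        return hrefs;
    }

    // DOM style: look elements up by tag name, then inspect their attributes.
    static List<String> linksByDom(String html) {
        Document doc = Jsoup.parse(html);
        List<String> hrefs = new ArrayList<String>();
        for (Element a : doc.getElementsByTag("a")) {
            if (a.hasAttr("href")) {
                hrefs.add(a.attr("href"));
            }
        }
        return hrefs;
    }

    public static void main(String[] args) {
        String html = "<html><body><a href=\"one.html\">One</a><a name=\"top\">Top</a></body></html>";
        System.out.println(linksBySelector(html));
        System.out.println(linksByDom(html));
    }
}
```

Both methods should give you the same list; which one you use is mostly a matter of taste, although the selector version usually ends up shorter once the queries get more involved.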
Hope this helps.