Originally Posted by
KevinWorkman
What about them doesn't suit your requirements?
Thanks for the response. I didn't notice that I had left out some additional requirements:
1. The program must be written in Java;
2. If the table contains spans, the value of a spanned cell should be put only once in the array, in the upper-left corner of the block of array cells that corresponds to the spanned cell in the HTML table; the other cells of that block should be null (this may be hard to follow, see the example below).
So JS_Extractor is not suitable, as it is written in PHP and doesn't handle nested tables. The Java HTML table parser Simbiosis doesn't even handle spans.
Today I tried HTTPUnit, but the results are disappointing too. Simple tables are parsed correctly, but complex ones are not.
E.g.
Table code:
HTML Code:
<html>
<body>
<table border="2" width="20%" height="20%">
<tr bgcolor="red">
<td colspan="2" rowspan="2">
<span>1</span>
</td>
<td>
<span>2</span>
</td>
<td>
<span>3</span></td>
<td>
<span>4.1</span>
</td>
<td>
<span>5.1</span>
</td>
<td>
<span>6 last</span>
</td>
</tr>
<tr bgcolor="green">
<td rowspan="2">
<span>1</span>
</td>
<td>
<span>2.4x</span>
</td>
<td>
<span>3.3x</span>
</td>
<td>
<span>4</span>
</td>
<td>
<span>5 last</span>
</td>
</tr>
<tr bgcolor="ffcc00">
<td>
<span>1x</span>
</td>
<td>
<span>2</span>
</td>
<td>
<span>3</span>
</td>
<td>
<span>4</span>
</td>
<td>
<span>5.8</span>
</td>
<td>
<span>6 last</span>
</td>
</tr>
<tr bgcolor="yellow">
<td><span>1</span></td>
<td><span>2</span></td>
<td><span>3</span></td>
<td><span>4</span></td>
<td><span>5</span></td>
<td><span>6</span></td>
<td><span>7 last</span></td>
</tr>
</table>
</body>
</html>
This is the table as shown in Chrome:
The array I want to get as a result:
Here you can see what I meant in requirement 2: the "1" from the first spanned area and the "1" from the second are each put in the upper-left corner of their spanned area, while the rest of the cells in that area are null.
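To make requirement 2 concrete, here is a minimal sketch of the placement rule I have in mind, in plain Java. It does not parse HTML at all: it assumes some parser has already given me, for every row, the list of cells with their text, colspan and rowspan. The names SpanGrid, Cell, td, toGrid and exampleRows are made up for this illustration and do not come from any library, and the sketch assumes every colspan and rowspan is at least 1.

```java
import java.util.Arrays;
import java.util.List;

public class SpanGrid {

    /** Hypothetical holder for one <td>: its text and its span attributes. */
    public static final class Cell {
        final String text;
        final int colspan;
        final int rowspan;

        Cell(String text, int colspan, int rowspan) {
            this.text = text;
            this.colspan = colspan;
            this.rowspan = rowspan;
        }
    }

    public static Cell td(String text) {
        return new Cell(text, 1, 1);
    }

    public static Cell td(String text, int colspan, int rowspan) {
        return new Cell(text, colspan, rowspan);
    }

    /** Expands spans: the value goes into the upper-left slot, the rest stays null. */
    public static String[][] toGrid(List<List<Cell>> rows) {
        int height = rows.size();
        // Safe upper bound for the width: the sum of all colspans in the table.
        int maxWidth = 0;
        for (List<Cell> row : rows) {
            for (Cell cell : row) {
                maxWidth += cell.colspan;
            }
        }
        String[][] values = new String[height][maxWidth];
        boolean[][] occupied = new boolean[height][maxWidth];
        int width = 0;
        for (int r = 0; r < height; r++) {
            int col = 0;
            for (Cell cell : rows.get(r)) {
                // Skip slots already covered by a rowspan coming from a row above.
                while (occupied[r][col]) {
                    col++;
                }
                // Clip the rowspan so it never runs past the last row of the table.
                int endRow = Math.min(height, r + cell.rowspan);
                for (int rr = r; rr < endRow; rr++) {
                    for (int cc = col; cc < col + cell.colspan; cc++) {
                        occupied[rr][cc] = true;
                    }
                }
                values[r][col] = cell.text; // upper-left corner only
                col += cell.colspan;
                width = Math.max(width, col);
            }
        }
        // Trim every row to the width that was actually used.
        String[][] grid = new String[height][];
        for (int r = 0; r < height; r++) {
            grid[r] = Arrays.copyOf(values[r], width);
        }
        return grid;
    }

    /** The four rows of the example table above, typed in by hand. */
    public static List<List<Cell>> exampleRows() {
        return Arrays.asList(
            Arrays.asList(td("1", 2, 2), td("2"), td("3"), td("4.1"), td("5.1"), td("6 last")),
            Arrays.asList(td("1", 1, 2), td("2.4x"), td("3.3x"), td("4"), td("5 last")),
            Arrays.asList(td("1x"), td("2"), td("3"), td("4"), td("5.8"), td("6 last")),
            Arrays.asList(td("1"), td("2"), td("3"), td("4"), td("5"), td("6"), td("7 last")));
    }

    public static void main(String[] args) {
        for (String[] row : toGrid(exampleRows())) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

For the example table this gives a 4 x 7 array: the first row starts with "1", null, "2", the second row starts with null, null, "1", and in the third row the null sits in the third column, under the second spanned cell. What I am missing is a Java parser that gives me the cells and spans reliably, including for nested tables.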
The result given by HTTPUnit:
As you can see, even if we set requirement 2 aside, there is an error in the third row.
And this is a rather simple example without nested tables; with them the output is terribly wrong.
What can you recommend in this case?