How to read a table from a URL?
Hi, I need some help. How can I read a table from a URL?
The following sequence constructs a URL object, opens a URL connection, sets up a buffered stream reader, and reads the content line by line:
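url = java.net.URL('http://www.mathworks.com'); % construct the URL object
is = openStream(url); % open the connection as an input stream
isr = java.io.InputStreamReader(is); % decode bytes to characters
br = java.io.BufferedReader(isr); % buffer the stream for line-wise reading
s = char(readLine(br)); % reads one line per call; can be repeated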
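The "can be repeated" step above could be written as a loop that runs until the reader is exhausted. This is only a minimal sketch: it reads from an in-memory snippet through java.io.StringReader (an assumption made here so the example runs offline; the snippet content and the names `html`, `raw`, and `lines` are illustrative), and it assumes that MATLAB returns an empty `[]` when `readLine` returns a Java null at the end of the stream.

```matlab
% Illustrative HTML fragment standing in for the page content (hypothetical).
html = sprintf('<tbody>\n<tr>\n<td>1</td>\n</tr>\n</tbody>');

% Same reader chain as above, but over a string instead of a URL stream.
br = java.io.BufferedReader(java.io.StringReader(html));

lines = {};
raw = readLine(br);
% Assumption: end of stream (Java null) arrives in MATLAB as [] (numeric),
% while a real line, even an empty one, arrives as text.
while ~isnumeric(raw)
    lines{end+1} = char(raw); %#ok<AGROW>
    raw = readLine(br);
end
close(br);

disp(numel(lines))
```

With a live page, the same loop would be applied to the `br` built from `openStream(url)`. If the `<td>` rows still do not show up, they are not in the text the server sent, and reading more lines will not recover them.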
I think BufferedReader is only suited to reading content line by line. When the web page contains a table, this code works but does not return all the elements of the table; in particular, the rows inside tbody are missing.
Example (content returned by the Java reader):
<div class="table-responsive no-padding-top"> : start of table, displayed in Matlab (e.g. command window)
<table width=... > : table formatting, displayed in Matlab
<thead>: start of table header, displayed in Matlab
<tr>: entire row related to table header, displayed in Matlab
<th> ... </th>: 1st element of header, displayed in Matlab
<th> ... </th>: 2nd element of header, displayed in Matlab
...
</tr>, displayed in Matlab
</thead>: end of header/description of column names, displayed in Matlab
<tbody>: full table with its contents, "<tbody>" displayed in Matlab
<tr>: 1st row of table, *not displayed* in Matlab
<td>...</td>: 1st cell of 1st row, *not displayed* in Matlab
<td>...</td>: 2nd cell of 1st row, *not displayed* in Matlab
</tr>: end of 1st row, *not displayed* in Matlab
<tr>: 2nd row of table, *not displayed* in Matlab
<td>...</td>: 1st cell of 2nd row, *not displayed* in Matlab
<td>...</td>: 2nd cell of 2nd row, *not displayed* in Matlab
</tr>: end of 2nd row, *not displayed* in Matlab
</tbody>: end of table contents, "</tbody>" displayed in Matlab
</table>: end of table object, displayed in Matlab
How can I read the rows inside the table body (tbody)?
Many thanks for your support!
Thomas
More Answers (1)
Toshiaki Takeuchi
on 24 Oct 2023
You can use readtable, which can import an HTML table directly from a URL: https://www.mathworks.com/help/matlab/ref/readtable.html
url = "https://www.mathworks.com/help/matlab/text-files.html";
T = readtable(url,TableSelector="//TABLE[contains(.,'readtable')]", ...
ReadVariableNames=false)
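To try the same approach without a network connection, here is a small sketch that writes a table with thead and tbody sections to a temporary file and reads it back with readtable. The file contents, the names `html` and `file`, and the search text 'alpha' are made up for illustration. It assumes a release in which readtable supports HTML input (R2021b or later, as far as I recall) and name=value argument syntax.

```matlab
% Build a small HTML document with <thead> and <tbody> (illustrative data).
html = join([ ...
    "<html><body><table>"
    "<thead><tr><th>Name</th><th>Value</th></tr></thead>"
    "<tbody>"
    "<tr><td>alpha</td><td>1</td></tr>"
    "<tr><td>beta</td><td>2</td></tr>"
    "</tbody>"
    "</table></body></html>"], newline);

% Write it to a temporary .html file.
file = [tempname '.html'];
fid = fopen(file, 'w');
fprintf(fid, '%s', html);
fclose(fid);

% Select the table whose text contains 'alpha', in the same way the
% TableSelector XPath expression is used in the answer above.
T = readtable(file, FileType="html", ...
    TableSelector="//TABLE[contains(.,'alpha')]");
disp(T)

delete(file);
```

When a page has several tables and no convenient text to match, the TableIndex option of readtable may be simpler than an XPath selector.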