How do you extract from a website table?
Christopher Taylor on 3 Jun 2022
Answered: Christopher Creutzig on 7 Jun 2022
I'm trying to extract data from the table on this page (http://www.newyorkschools.com/districts/nyc-district-11.html).
I've tried to use webread, but it isn't quite working for me. I'm attempting to extract the school names and grade levels and then place them into an Excel file. (I'm helping a friend who is starting a STEM program.)
How do you think I should do this?
data = webread(url)
selector = 'School Name'
subtrees = findElement(tree,selector)
Christopher Creutzig on 7 Jun 2022
The problem with this page is that it is not using an HTML <table> for the data you are looking for. Otherwise, you would be able to simply use readtable(url) or maybe readtable(url,TableIndex=2).
Also, the selector needs to follow what is found in the HTML source, which again is not made easy in this particular case. What goes in there depends entirely on the page's markup, which MATLAB does not control.
Here's something to get you started:
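url = "http://www.newyorkschools.com/districts/nyc-district-11.html";
data = webread(url);
tree = htmlTree(data);   % htmlTree and findElement require Text Analytics Toolbox
% select the <div> elements directly under the element with id "myTabContent"
tabs = findElement(tree,"#myTabContent > div");
schools = tabs(1);
% select elements by CSS class within the first tab
rows = findElement(schools,".p_div");
schoolnames = findElement(schools,".pp-col-40");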
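From there, one way to get to an Excel file might look like the sketch below. It repeats the steps above so it runs on its own, turns the matched elements into text with extractHTMLText, and writes them out with writetable. The output file name schools.xlsx and the column name SchoolName are placeholders I made up, and I'm assuming ".pp-col-40" still matches only the school-name cells. The grade levels would need their own selector taken from the page source, so they are left out here.

```matlab
% Sketch only: requires Text Analytics Toolbox and network access.
url = "http://www.newyorkschools.com/districts/nyc-district-11.html";
tree = htmlTree(webread(url));

% Same selectors as above; they depend on the page's current markup.
tabs = findElement(tree,"#myTabContent > div");
schoolnames = findElement(tabs(1),".pp-col-40");

% Convert each matched element to plain text and trim stray whitespace.
names = strtrim(extractHTMLText(schoolnames));
names = names(:);                      % force a column vector

% Build a one-column table and write it to a spreadsheet.
% "SchoolName" and "schools.xlsx" are hypothetical names.
T = table(names,'VariableNames',{'SchoolName'});
writetable(T,"schools.xlsx");
```

If you later find a selector for the grade-level cells, you could extract them the same way and add them as a second table variable, provided both arrays end up with the same number of rows.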
More Answers (1)
Seth Furman on 6 Jun 2022
Try the approach suggested in the following MATLAB Answers post.