Extracting data from pdf files
Show older comments
Hi,
I have around 300 pdf files with 19 pages each. I want to extract from each of them a fraction of a table on page 4 in order to build a research data set. Is i possible to do so using matlab? if so,which toolboxes and functions I need. I have matlab 2013a.
Accepted Answer
More Answers (1)
Christopher Creutzig
on 27 Apr 2021
0 votes
JFTR, since R2017b, extractFileText('filename.pdf','Pages',4) from Text Analytics Toolbox gives you the text on ("physical") page 4 of the PDF, from which you can then extract the parts you need with string operations (extractBetween, regexp, etc.).
Categories
Find more on Characters and Strings in Help Center and File Exchange
Products
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!