Does MATLAB support Parquet partitions?
I have a large data set written using parquet partitioning. The partition variable is called 'mdRun', and I have 10 parquet files created in 10 directories as follows:
.../events/mdRun=0/events-0.parquet
.../events/mdRun=1/events-0.parquet
and so on. I created these files using pyarrow Hive partitioning.
Using pyarrow, I can read the data for a single partition by passing the filter argument, so that only the parquet file stored in the matching directory is read. As a nice side effect, the mdRun column is not stored in the parquet file itself, but it is automatically included in the result when I read one or more partition files.
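To make the behavior described above concrete, here is a minimal Python sketch of the idea behind Hive-style partition filtering. It uses only the standard library and is not pyarrow's actual implementation; the helper names `partition_values` and `select_files`, and the relative `events/` root, are hypothetical and chosen only for illustration.

```python
from pathlib import PurePosixPath


def partition_values(path):
    """Return the Hive-style key=value pairs encoded in a path's directories."""
    values = {}
    for part in PurePosixPath(path).parent.parts:
        key, sep, value = part.partition("=")
        if sep and key:  # only directories of the form key=value count
            values[key] = value
    return values


def select_files(paths, key, wanted):
    """Keep only the files whose partition directory matches key == wanted."""
    wanted = str(wanted)  # directory names carry the value as text
    return [p for p in paths if partition_values(p).get(key) == wanted]


# Hypothetical layout mirroring the one in the question (root is assumed).
files = [f"events/mdRun={i}/events-0.parquet" for i in range(10)]

matches = select_files(files, "mdRun", 3)
print(matches)                        # ['events/mdRun=3/events-0.parquet']
print(partition_values(matches[0]))   # {'mdRun': '3'}
```

The point of the sketch is that the filter can be resolved from directory names alone, without opening any file, and that the mdRun value attached to the rows comes from the path rather than from the file contents. In this sketch the recovered values are strings; a real reader would typically also convert them to an appropriate type.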
Is it possible to read a partitioned parquet dataset in MATLAB in the same way?
Thank you!
Answers (1)
Sudarshan
on 2 Jan 2023
Hi Jerry,
As far as I know, reading a partitioned Parquet dataset in this way is not supported in MATLAB R2022b. This request has already been forwarded to the relevant team.
However, MATLAB R2022b does support reading and writing individual Parquet files. The MATLAB documentation covers the available Parquet functions (such as parquetread, parquetwrite, and parquetinfo), the data type mappings between Parquet and MATLAB, and how to read Parquet files, all of which may help in your case.
I hope that this helps!