MATLAB and Simulink Training

Processing Big Data with MATLAB

Course Details

This one-day course focuses on adapting existing algorithms to work with a collection of data files or with a single file that is too big to fit in memory. Learn to represent big data in MATLAB®, adjust existing code to work efficiently with it, and scale up the analysis using your own computing resources or the cloud. Topics include:
 
  • Creating datastores to read from data sources
  • Representing and manipulating big data using tall arrays
  • Importing custom or special data formats, such as Apache Parquet™, and applying custom functions to tall arrays or datastores
  • Working with clusters of computers and cloud environments

Day 1 of 1


Prototyping Big Data Algorithms

Objective: Apply existing algorithms to data sets that do not fit into memory.

  • Importing data using datastores
  • Creating tall arrays
  • Running algorithms on tall arrays
  • Optimizing code for tall arrays
  • Reading data from cloud environments
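
As an illustration of the datastore and tall array workflow listed above, here is a minimal sketch. It assumes a small, hypothetical CSV file written to a temporary folder so that it runs on its own; the variable names x and y and the file name sample.csv are invented for the example and are not course materials.

```matlab
% Run tall array calculations in the local MATLAB session (no parallel pool).
mapreducer(0);

% Write a small sample CSV so the sketch is self-contained (hypothetical data).
folder = tempname;
mkdir(folder);
T = table((1:10)', ((1:10)').^2, 'VariableNames', {'x', 'y'});
writetable(T, fullfile(folder, 'sample.csv'));

% A datastore describes the files without loading them into memory.
ds = tabularTextDatastore(fullfile(folder, '*.csv'));

% A tall array built on the datastore defers evaluation.
tt = tall(ds);
meanY = mean(tt.y);        % still unevaluated at this point

% gather triggers the actual pass through the data.
result = gather(meanY);
```

The same pattern should carry over to a folder of many large files, since only the call to gather reads the data.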

Handling Custom Data and Algorithms

Objective: Import custom-formatted data and apply algorithms that are not implemented for tall arrays.

  • Importing special data formats such as Apache Parquet
  • Importing custom formatted data using file datastores and custom datastores
  • Partially importing single files
  • Applying transformations, reductions, and moving window operations to tall arrays
  • Transforming datastores
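
The file datastore and datastore transformation topics above might look like the following sketch. It assumes three small, hypothetical MAT-files (part1.mat to part3.mat, each holding a variable named data) created in a temporary folder; the reader function and the doubling transformation are placeholders chosen only for illustration.

```matlab
% Create three small MAT-files so the sketch is self-contained (hypothetical data).
folder = tempname;
mkdir(folder);
for k = 1:3
    data = k * ones(4, 1);
    save(fullfile(folder, sprintf('part%d.mat', k)), 'data');
end

% fileDatastore calls the custom read function once per file.
readData = @(filename) getfield(load(filename), 'data');
fds = fileDatastore(fullfile(folder, '*.mat'), 'ReadFcn', readData);

% transform wraps the datastore; the function is applied when data is read.
tds = transform(fds, @(x) 2 * x);

% Process the data one file at a time instead of loading everything at once.
total = 0;
while hasdata(tds)
    chunk = read(tds);
    total = total + sum(chunk);
end
```

Reading in a loop keeps only one file's worth of data in memory at a time, which is the point of using a datastore for files in a custom format.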

Working with Clusters and Clouds

Objective: Run big data algorithms on a cluster of computers or on cloud environments.

  • Local and remote clusters
  • Cluster discovery and connection
  • Setup of a cluster on a cloud environment
  • File access considerations

Level: Intermediate

Duration: 1 day

Languages: English, Korean
