kfoldPredict

Predict responses for observations in cross-validated kernel regression model

Description

YHat = kfoldPredict(CVMdl) returns the cross-validated predicted responses of the cross-validated kernel regression model CVMdl. For every fold, kfoldPredict predicts the responses for the validation-fold observations using a model trained on the training-fold observations.

YHat = kfoldPredict(CVMdl,PredictionForMissingValue=prediction) uses the prediction value as the predicted response for observations with missing values in the predictor data. By default, kfoldPredict uses the median of the observed response values in the training-fold data. (since R2023b)

Examples

Simulate sample data:

rng(0,'twister'); % For reproducibility
n = 1000;
x = linspace(-10,10,n)';
y = 1 + x*2e-2 + sin(x)./x + 0.2*randn(n,1);

Cross-validate a kernel regression model.

CVMdl = fitrkernel(x,y,'CrossVal','on');

By default, fitrkernel implements 10-fold cross-validation. CVMdl is a RegressionPartitionedKernel model. It contains the property Trained, which is a 10-by-1 cell array holding 10 RegressionKernel models, each trained by the software on the training-fold observations of one fold.

Predict responses for observations that fitrkernel did not use in training the folds.

yHat = kfoldPredict(CVMdl);

yHat is a numeric vector. Display the first five predicted responses.

yHat(1:5)
ans = 5×1

    0.8869
    0.7744
    0.8915
    0.8040
    0.8870
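Because yHat holds one out-of-fold prediction per observation, it can be compared directly with the observed responses. The following is a minimal sketch, not part of the original example, that repeats the setup above and computes the residuals and a pooled cross-validated mean squared error. The variable names residuals and mseCV are illustrative.

```matlab
rng(0,'twister'); % For reproducibility
n = 1000;
x = linspace(-10,10,n)';
y = 1 + x*2e-2 + sin(x)./x + 0.2*randn(n,1);

% Cross-validate the model and get the out-of-fold predictions
CVMdl = fitrkernel(x,y,'CrossVal','on');
yHat = kfoldPredict(CVMdl);

% Residuals and pooled mean squared error over all observations
residuals = y - yHat;
mseCV = mean(residuals.^2)
```

This pooled value is computed over all observations at once, so it may differ slightly from a loss that is averaged fold by fold.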

Input Arguments

Cross-validated kernel regression model, specified as a RegressionPartitionedKernel model object. You can create a RegressionPartitionedKernel model by using fitrkernel and specifying any one of the cross-validation name-value pair arguments, for example, CrossVal.

To obtain the predictions, kfoldPredict uses the same data that was used to cross-validate the kernel regression model (see the X input argument on the fitrkernel page).
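As a rough illustration of how the stored data and the per-fold models fit together, the following sketch reproduces the out-of-fold predictions by hand. It assumes that CVMdl exposes the KFold, Partition, and Trained properties and that the validation-fold indices can be obtained with the cvpartition function test. The names yManual and maxDiff are illustrative, and the loop is not necessarily how kfoldPredict is implemented internally.

```matlab
rng(0,'twister'); % For reproducibility
n = 1000;
x = linspace(-10,10,n)';
y = 1 + x*2e-2 + sin(x)./x + 0.2*randn(n,1);
CVMdl = fitrkernel(x,y,'CrossVal','on');

% For each fold, predict the validation-fold observations with the
% model that was trained on the remaining (training-fold) observations
yManual = zeros(n,1);
for k = 1:CVMdl.KFold
    idx = test(CVMdl.Partition,k);           % logical index of validation fold k
    yManual(idx) = predict(CVMdl.Trained{k},x(idx,:));
end

% Compare with the built-in result; the difference is expected to be negligible
yHat = kfoldPredict(CVMdl);
maxDiff = max(abs(yManual - yHat))
```

Each observation belongs to exactly one validation fold, so every element of yManual is filled once.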

Since R2023b

Predicted response value to use for observations with missing predictor values, specified as "median", "mean", or a numeric scalar.

Value | Description
"median" | kfoldPredict uses the median of the observed response values in the training-fold data as the predicted response value for observations with missing predictor values.
"mean" | kfoldPredict uses the mean of the observed response values in the training-fold data as the predicted response value for observations with missing predictor values.
Numeric scalar | kfoldPredict uses this value as the predicted response value for observations with missing predictor values.

Example: "mean"

Example: NaN

Data Types: single | double | char | string
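The options above can be compared side by side on data that contains a missing predictor value. The following is a minimal sketch, assuming R2023b or later and the simulated data from the Examples section. Setting x(5) to NaN is an illustrative choice, and the variable names are hypothetical.

```matlab
rng(0,'twister'); % For reproducibility
n = 1000;
x = linspace(-10,10,n)';
y = 1 + x*2e-2 + sin(x)./x + 0.2*randn(n,1);
x(5) = NaN; % Introduce one missing predictor value (illustrative)

CVMdl = fitrkernel(x,y,'CrossVal','on');

% Default behavior: median of the training-fold responses
yHatMedian = kfoldPredict(CVMdl);
% Mean of the training-fold responses
yHatMean = kfoldPredict(CVMdl,PredictionForMissingValue="mean");
% A fixed numeric value
yHatConst = kfoldPredict(CVMdl,PredictionForMissingValue=0);

% Predictions for the observation with the missing predictor value
[yHatMedian(5) yHatMean(5) yHatConst(5)]
```

Only the prediction for the observation with the missing predictor value should depend on the option. Specifying NaN instead of 0 marks that observation explicitly, so it can be excluded from later computations.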

Output Arguments

Cross-validated predicted responses, returned as an n-by-1 numeric array, where n is the number of observations in the predictor data used to create CVMdl (see the X input argument on the fitrkernel page).

Version History

Introduced in R2018b