For reference, I wrote a memory in-place version of the above code, following the approach suggested on Stack Overflow but generalized to n datasets. It seems like the right solution!
function cellDataIn = gpuApplyInvKmatrix(kMatrix,cellDataIn)
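%GPUAPPLYINVKMATRIX Solve kMatrix \ data for a cell array of datasets.
%   cellDataIn is a cell array of equally sized datasets (any number of
%   dimensions); kMatrix must have one row per dataset. Should be GPU
%   friendly. The variable cellDataIn is reused throughout so that MATLAB
%   can operate in place.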
sizeADataset = size(cellDataIn{1});
cellDataIn = cellfun(@(x) x(:),cellDataIn,'UniformOutput',false);
cellDataIn = [cellDataIn{:}].';   % one dataset per row; .' avoids conjugating complex data
cellDataIn = kMatrix \ cellDataIn;
cellDataIn = num2cell(cellDataIn,2);
cellDataIn = cellfun(@(x) reshape(x,sizeADataset),cellDataIn,'UniformOutput',false);
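
As a sanity check, here is a minimal CPU-only sketch of the same flatten, stack, solve, reshape steps on made-up data. The 3x3 matrix, the 4x5 dataset size, and the variable names (trueData, measured, recovered, maxErr) are all hypothetical and chosen only for illustration; nothing here was run on a GPU.

```matlab
% Hypothetical example: 3 datasets of size 4x5 mixed by a 3x3 matrix.
rng(0);
kMatrix  = [2 1 0; 1 3 1; 0 1 4];          % assumed invertible mixing matrix
trueData = {rand(4,5), rand(4,5), rand(4,5)};

% Build the mixed datasets: measured{i} = sum_j kMatrix(i,j) * trueData{j}
measured = cell(1,3);
for i = 1:3
    measured{i} = zeros(4,5);
    for j = 1:3
        measured{i} = measured{i} + kMatrix(i,j) * trueData{j};
    end
end

% Same steps as the function above: flatten, stack as rows, solve once, reshape.
sz        = size(measured{1});
stacked   = cellfun(@(x) x(:), measured, 'UniformOutput', false);
stacked   = [stacked{:}].';                % 3-by-20, one dataset per row
solved    = kMatrix \ stacked;             % a single solve covers every element
recovered = cellfun(@(x) reshape(x, sz), num2cell(solved, 2), 'UniformOutput', false);

% Largest absolute difference between recovered and original datasets.
maxErr = max(cellfun(@(a,b) max(abs(a(:) - b(:))), recovered(:), trueData(:)));
```

If the approach is right, maxErr should be at the level of floating-point round-off. Note that the result comes back as a column cell array (one cell per row of the solved matrix), regardless of the orientation of the input cell.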