First, why does integral2 not allow array-valued integration, the way integral does with its 'ArrayValued' option? I can only conjecture, but I think the reason is sound: integral2 is an adaptive routine. If your various kernels (all stuffed into one vector) are different enough that the adaptive rule would want to subdivide the domain differently for each, then it has a problem. As such, they decided not to allow that option.
So where does that leave you?
You can use GPU tools to do the work, which requires the Parallel Computing Toolbox and an NVIDIA GPU. (They make GeForce, so I assume you are ok there.) Can you do that without modifying any code at all? Probably not, since you will need to explicitly invoke the GPU for this specific operation.
Perhaps better is to use parfor. You need the Parallel Computing Toolbox anyway to use the GPU. integral2 will surely not be using multiple cores effectively on its own, as adaptive integration is NOT the sort of thing that is easily accelerated automatically. However, a simple parfor loop around the calls to integral2, one iteration per kernel, will allow you to use all the cores on your machine at once.
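As a minimal sketch of that parfor idea: I don't know your actual kernels, so the family exp(-k*(x^2 + y^2)), the parameter vector k, and the unit-square limits below are all made up for illustration. Swap in your own.

```matlab
% Hypothetical family of kernels, one per parameter value k(i).
% (Assumed for illustration only; replace with your own integrands.)
k = linspace(1, 5, 8);
Q = zeros(size(k));

% Each iteration is independent of the others, so parfor can hand them
% to separate workers. Q is sliced by the loop index.
parfor i = 1:numel(k)
    ki = k(i);
    f = @(x, y) exp(-ki*(x.^2 + y.^2));   % integrand for this kernel
    Q(i) = integral2(f, 0, 1, 0, 1);      % adaptive rule, run per kernel
end
```

Since each kernel gets its own call, the adaptive rule is free to make different choices for each one. The gain depends on how expensive each integral is relative to the overhead of starting the pool and shipping work to the workers.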
Other options? Without knowing the nature of your integration kernels, I might wonder if you could break the double integral up, doing one of the two integrations with a fixed (non-adaptive) Gaussian quadrature rule of some sort. Of course, that would significantly change your code, and I cannot even guess if it is an option. But if it is, it could speed things up considerably.
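To make that last idea concrete, here is a rough sketch, again with made-up kernels exp(-k*(x^2 + y^2)) on the unit square, since I have not seen yours. The inner integral over y is done with a fixed 20-point Gauss-Legendre rule (nodes and weights from the standard Golub-Welsch eigenvalue construction), and the outer integral over x uses integral with 'ArrayValued', so all kernels are handled in one call. The node count n = 20 is an arbitrary choice that you would need to check against your own integrands.

```matlab
% Hypothetical kernel parameters (assumed for illustration only).
k = linspace(1, 5, 8);                     % 1-by-m row vector

% Gauss-Legendre nodes t and weights w on [-1, 1] via Golub-Welsch.
n = 20;                                    % number of nodes (assumed adequate)
j = 1:n-1;
b = j ./ sqrt(4*j.^2 - 1);                 % off-diagonal of the Jacobi matrix
[V, D] = eig(diag(b, 1) + diag(b, -1));
t = diag(D);                               % n-by-1 nodes
w = 2 * V(1, :).^2;                        % 1-by-n weights

% Map the rule from [-1, 1] to the y-interval [c, d].
c = 0; d = 1;
y  = (d - c)/2 * t + (d + c)/2;
wy = (d - c)/2 * w;

% Inner integral over y for scalar x, for all kernels at once:
% (x.^2 + y.^2) is n-by-1, times k (1-by-m) expands to n-by-m,
% and wy * (...) collapses the y-dimension to a 1-by-m row.
g = @(x) wy * exp(-(x.^2 + y.^2) .* k);

% Outer integral over x, adaptive, vector-valued integrand.
Q = integral(g, 0, 1, 'ArrayValued', true);
```

Note that the outer adaptive rule now uses one set of subdivisions for all kernels, so this only makes sense if the kernels behave similarly enough in x, and if a fixed rule is accurate enough in y. It uses implicit expansion, so it needs R2016b or later.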