I have set up MATLAB Parallel Server on our cluster. The MATLAB Job Scheduler is running on the headnode, and is able to talk to all of the workers on the compute nodes.
If I run MATLAB as a client on the headnode, I can pass all of the cluster profile validation tests. However, if I run the same tests on a different client machine (outside of the cluster), all of the tests pass except for the "Parallel pool test (parpool)". It fails after about 6 minutes with the following error:
Error Report: Failed to initialize the interactive session.
Error using parallel.internal.pool.InteractiveClient>iThrowIfBadParallelJobStatus (line 789)
The interactive communicating job errored with the following message: Client unable to connect to worker. Check whether a firewall is blocking communication between the worker machine and the MATLAB client machine.
The headnode is set up to NAT traffic from the compute nodes out of the cluster, so I am not sure why this isn't working. What is different about this test that makes it fail when the others pass? It seems to me that in the previous tests the client only talks to the MJS, whereas in this test the workers need to connect directly to the client (according to the error message). That should be working, since I can ssh from the worker machine to the client without issue. If the converse is true and the client has to connect directly to the workers, I don't see how this could ever work in a cluster setup like this one, where the compute nodes sit behind the headnode.
Alternatively, some ports may be blocked by filtering on our network switches. Which ports do the workers need in order to talk to the client?
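Since a successful ssh only shows that the ssh port gets through, I could test individual ports from a worker node once I know which ones matter. Below is a minimal sketch of such a check in Python. The function name `can_connect` is my own, and any host name or port number passed to it would be a placeholder, not a value taken from the MATLAB documentation. It also needs something listening on that port on the client side, otherwise it reports a failure either way.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Run on a worker node, a call such as `can_connect("my-client.example.org", 12345)` (both arguments hypothetical) would return `False` if the connection is refused, filtered, or times out, which would at least show whether a given port is reachable in the worker-to-client direction.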
Thank you for any help!