Hi, thanks for that.
We have a good understanding of how Trident works with HPC Server for embarrassingly parallel problems, but perhaps not so much for the case where data needs to be passed between machines.
I think if we were to use MPI in our workflows, we would probably use one of the two libraries you mention above. The thing I don't quite understand is how a Trident workflow could actually be executed under MPI. It would be preferable to have
the workflow run with the lightweight executor on the compute nodes, so that all libraries etc. are fetched from the central registry and monitoring is still performed by Trident. However, to run an MPI-enabled application the program must be launched via
the mpiexec application. It is not clear how a workflow should be invoked using both of these applications (mpiexec and LightweightExecutor).
For example, we currently run two instances of a workflow in parallel on two nodes of a cluster in the following way:
LightweightExecutor.exe -wf aaaaaaaa-bbbb-cccc-dddddddddddd -input parameters1.xml, and
LightweightExecutor.exe -wf aaaaaaaa-bbbb-cccc-dddddddddddd -input parameters2.xml
So if we wanted these workflow instances to communicate using MPI, firstly I assume the activity doing the communicating would need to be MPI-enabled, and secondly we would need to launch the workflow instances via mpiexec.
I am thinking that launching the workflow with a single-program mpiexec invocation (e.g. mpiexec -n 2, where every rank gets identical arguments) may not work in this case, since each process needs to be parameterized differently (with a different workflow parameters file), and I don't believe that launch mode supports this. However,
an MPMD approach may work if each workflow instance is considered a separate program? I guess I am after more clarity on how to use mpiexec and LightweightExecutor together.
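To make the MPMD idea concrete, I imagine something like the following, using mpiexec's colon syntax to give each rank its own arguments (this is only a sketch reusing the placeholder workflow ID and parameter files from above; I haven't verified that LightweightExecutor behaves correctly when launched this way):

```shell
REM Hypothetical MPMD launch: one mpiexec call starts two ranks of the same
REM executable, but the colon separates per-rank program blocks, so each
REM workflow instance can receive a different parameters file.
mpiexec -n 1 LightweightExecutor.exe -wf aaaaaaaa-bbbb-cccc-dddddddddddd -input parameters1.xml : -n 1 LightweightExecutor.exe -wf aaaaaaaa-bbbb-cccc-dddddddddddd -input parameters2.xml
```

If that works, both instances would share a single MPI_COMM_WORLD, which is presumably what the MPI-enabled activity would need in order to communicate.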
DryadLINQ looks really interesting... we have been familiar with it for some time and would like to test-drive it sometime in the future!