# Introduction to MPI

This section adapts [mpitutorial.com] materials using [IPython Parallel] and [mpi4py] to run MPI code in Jupyter notebooks.
We won't go into detail on using IPython Parallel, but we will cover the key bits for getting started.

[mpitutorial.com] materials are used under the MIT License.

[IPython Parallel]: https://ipyparallel.readthedocs.io

[mpitutorial.com]: https://mpitutorial.com

[mpi4py]: https://mpi4py.readthedocs.io/en/stable/

In [1]:
import ipyparallel as ipp

# create a cluster
cluster = ipp.Cluster(engines="mpi", n=2)
# start that cluster and connect to it
rc = cluster.start_and_connect_sync()

Starting 2 engines with <class 'ipyparallel.cluster.launcher.MPIEngineSetLauncher'>


What did that do? Among other things, it launched the engines with a command equivalent to:

```
mpiexec -n 2 python -m ipyparallel.engine --mpi
```

In [2]:
cluster.engine_set.args

['mpiexec',
 '-n',
 '2',
 '/home/dokken/src/mambaforge/envs/mpi-tutorial/bin/python',
 '-m',
 'ipyparallel.engine',
 '--mpi']
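
To make the correspondence with the `mpiexec` command explicit, the argument list above can be reproduced by hand. This is only an illustrative sketch: `engine_command` is a hypothetical helper, not part of IPython Parallel, and the real launcher may assemble its arguments differently.

```python
import sys


def engine_command(n, python=sys.executable):
    """Build an mpiexec argument list like the one shown above.

    Hypothetical helper for illustration only; not an ipyparallel API.
    """
    return ["mpiexec", "-n", str(n), python, "-m", "ipyparallel.engine", "--mpi"]


# Join the pieces to recover the shell command from the previous section
print(" ".join(engine_command(2, python="python")))
```

By default the helper uses `sys.executable`, which is why the list above shows the full path of the environment's Python rather than plain `python`.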

If we 'activate' the client,
it registers [magics with IPython](https://ipython.readthedocs.io/en/stable/interactive/magics.html), so we can use `%%px` to run cells on the _engines_
instead of in the local notebook.

In [3]:
rc.activate()

<DirectView all>

Now we have `%%px` available:

In [4]:
%%px
import os
pid = os.getpid()
pid

Out[0:1]: 330874

Out[1:1]: 330875

A cell passed to `%%px` is run on _all engines at once_.
This is the equivalent of running `mpiexec -n 2 python myscript.py` for noninteractive MPI.

From now on, notebooks will start with a brief boilerplate to start and register the cluster,
so we can use `%%px`.

## Rank and size

Our very first MPI code tests `%%px` by getting the MPI world communicator, `MPI.COMM_WORLD`,
which contains every process started by `mpiexec`.

The **rank** is the integer id of the current process within the communicator, from 0 to size - 1,
while the **size** is the number of processes in the communicator.

In [5]:
%%px
# Find out rank, size
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.rank
size = comm.size

print(f"I am rank {rank} / {size}")

[stdout:0] I am rank 0 / 2


[stdout:1] I am rank 1 / 2
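
A common use of `rank` and `size` is to decide which share of a problem each process handles. The sketch below is plain Python with no MPI dependency, so it can be tried anywhere. `block_range` is a hypothetical helper name, and the contiguous block split it uses is just one possible convention.

```python
def block_range(rank, size, n_items):
    """Return the half-open range [start, stop) of items owned by `rank`.

    Items are split into contiguous blocks; the first `n_items % size`
    ranks each get one extra item. Hypothetical helper for illustration.
    """
    base, extra = divmod(n_items, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


# Simulate what each of 2 ranks would compute for 5 items
for r in range(2):
    print(r, block_range(r, 2, 5))
```

Inside a `%%px` cell, each engine would instead call `block_range(rank, size, n_items)` with its own `rank` and `size` from the communicator.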


In IPython Parallel, the state of each engine is _persistent_.
This means variables defined in earlier cells, such as `rank` and `pid`, can be used in subsequent cells:

In [6]:
%%px
print(f"Rank {rank} has PID {pid}")

[stdout:0] Rank 0 has PID 330874


[stdout:1] Rank 1 has PID 330875


To translate a notebook written with `%%px` to a script for `mpiexec`, you would concatenate all the `%%px` cells into a single `.py` file.
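
The concatenation can also be done programmatically. The following is a minimal sketch using only the standard library, assuming the nbformat 4 JSON layout (a top-level `cells` list whose entries have `cell_type` and `source`). The helper name `px_cells_to_script` is made up for this example, and only cells whose first line is exactly `%%px` are picked up.

```python
import json


def px_cells_to_script(notebook_json):
    """Concatenate the bodies of all `%%px` code cells into one script.

    Assumes nbformat 4: "source" is either a string or a list of lines.
    Hypothetical helper for illustration only.
    """
    nb = json.loads(notebook_json)
    chunks = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):
            source = "".join(source)
        lines = source.splitlines()
        # Keep only cells that start with a bare %%px line, and drop that line
        if lines and lines[0].strip() == "%%px":
            chunks.append("\n".join(lines[1:]).strip("\n"))
    return "\n\n".join(chunks) + "\n"


# Tiny in-memory notebook mirroring the kinds of cells used above
demo = json.dumps({
    "cells": [
        {"cell_type": "code", "source": "rc.activate()"},
        {"cell_type": "code", "source": ["%%px\n", "import os\n", "pid = os.getpid()"]},
        {"cell_type": "markdown", "source": "## Rank and size"},
        {"cell_type": "code", "source": "%%px\nprint(pid)"},
    ]
})
print(px_cells_to_script(demo))
```

Note that a bare expression at the end of a cell, such as `pid`, displays a value in the notebook but prints nothing in a script, so it needs to be wrapped in `print(...)` after conversion.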


Now we can stop the cluster if we want.
It should get cleaned up automatically when the notebook exits, but it's good to be explicit.

In [7]:
cluster.stop_cluster_sync()

Stopping controller
Controller stopped: {'exit_code': 0, 'pid': 330799, 'identifier': 'ipcontroller-1704460714-uyr9-330712'}
Stopping engine(s): 1704460716
engine set stopped 1704460716: {'exit_code': 0, 'pid': 330869, 'identifier': 'ipengine-1704460714-uyr9-1704460716-330712'}
