MitoEM (Instance Segmentation)
==============================

.. include:: _intro.rst

This tutorial reproduces 3D mitochondria **instance segmentation** on the
`MitoEM `_ dataset released by `Wei et al. `__ in 2020. The recipe lives under
``tutorials/mitoEM/`` with three dataset-specific entry points (``R.yaml`` for
MitoEM-Rat, ``H.yaml`` for MitoEM-Human, ``HR.yaml`` for joint training), all
sharing ``common.yaml``. The pipeline is multi-task: three heads predict
short-range affinities, long-range affinities (radius 5), and a skeleton-aware
Euclidean distance transform (EDT), which a distance-watershed step then
decodes into instances. Evaluation uses Adapted Rand and Variation of
Information.

Goal
----

The pipeline pins the following setup (encoded in
``tutorials/mitoEM/common.yaml`` and inherited by ``R.yaml`` / ``H.yaml`` /
``HR.yaml``):

- **Input** ``[32, 256, 256]`` patches at native MitoEM resolution
  ``30 × 8 × 8`` nm.
- **Model** MedNeXt-M, kernel size 3, ``checkpoint_style: outside_block``,
  three output heads:

  - ``aff_r1`` — 3-channel short-range affinity at offsets
    ``(0, 0, 1) / (0, 1, 0) / (1, 0, 0)``;
  - ``aff_r5`` — 3-channel long-range affinity at offsets
    ``(0, 0, 5) / (0, 5, 0) / (5, 0, 0)``;
  - ``sdt`` — 1-channel skeleton-aware EDT head.

- **Loss** per-channel BCE on each affinity head plus SmoothL1
  (``tanh: true``) on the EDT head, balanced by ``uncertainty``
  loss-balancing.
- **Augmentation** ``aug_em_neuron_fast`` profile with rotations on all three
  axes.
- **Optimization** ``warmup_cosine_lr`` profile, 200 epochs × 1000 steps,
  ``accumulate_grad_batches=4``, ``precision=bf16-mixed``.
- **Inference** sliding window 32 × 256 × 256, ``sw_batch_size=1``, 50 %
  overlap, bump blending, replicate-padding mode; the head is set to
  ``aff_r1`` for the saved primary output.
- **Decoder** ``decode_distance_watershed`` over the EDT channel
  (``distance_channels=[6]``, ``distance_threshold=[0.5, 0]``,
  ``min_seed_size=100``, ``min_instance_size=50``).
- **Metric** ``adapted_rand`` + ``voi``.
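To make the affinity heads concrete, the sketch below derives binary affinity targets from an instance label volume at the configured offsets. This is a minimal NumPy illustration of the usual convention (a voxel pair is "affine" when both voxels carry the same nonzero instance id); the ``affinity_targets`` helper is hypothetical and not part of the recipe's code.

```python
import numpy as np

def affinity_targets(labels, offsets):
    """Binary affinity maps for a 3D instance label volume: channel c is 1
    where a voxel and the voxel at -offsets[c] share the same nonzero id."""
    D, H, W = labels.shape
    aff = np.zeros((len(offsets), D, H, W), dtype=np.float32)
    for c, (dz, dy, dx) in enumerate(offsets):
        src = labels[dz:, dy:, dx:]              # voxel at position p
        nbr = labels[:D - dz, :H - dy, :W - dx]  # voxel at position p - offset
        aff[c, dz:, dy:, dx:] = (src == nbr) & (src > 0)
    return aff

# Offsets as pinned in common.yaml: short-range (r1) and long-range (r5)
OFFSETS_R1 = [(0, 0, 1), (0, 1, 0), (1, 0, 0)]
OFFSETS_R5 = [(0, 0, 5), (0, 5, 0), (5, 0, 0)]
```

Stacking the r1 channels, the r5 channels, and the EDT gives the 7-channel target layout referenced by ``distance_channels=[6]`` below.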
1 - Get the data
^^^^^^^^^^^^^^^^

The MitoEM dataset is publicly available at the `project page `_ and the
`MitoEM Challenge `_. On the lab cluster it is staged at:

.. code-block:: text

   /projects/weilab/dataset/mito/mitoEM/
       EM30-R/                            # rat
           im_train.h5, mito_train-v2.h5
           im_val.h5,   mito_val-v2.h5
           im_test.h5,  mito_test-v2.h5
       EM30-H/                            # human (same layout)

Each split is a 4096 × 4096 × {400|100|500} HDF5 stack at 30 × 8 × 8 nm. The
``train.data.root_path`` field in ``common.yaml`` points at this directory;
override it at the CLI if you stage the data elsewhere. The test labels for
MitoEM challenge submission are not publicly released; ``mito_test-v2.h5``
here refers to the locally maintained v2 labels used for offline development.

2 - Run training
^^^^^^^^^^^^^^^^

Pick the dataset variant and run:

.. code-block:: bash

   conda activate pytc

   # MitoEM-Rat
   python scripts/main.py --config tutorials/mitoEM/R.yaml

   # MitoEM-Human
   python scripts/main.py --config tutorials/mitoEM/H.yaml

   # Joint (rat + human in the same training run)
   python scripts/main.py --config tutorials/mitoEM/HR.yaml

The config sets ``system.num_gpus: -1`` and ``system.num_workers: -1``, so
PyTC fans out across every visible GPU. Training schedule:

- ``max_epochs=200``, ``n_steps_per_epoch=1000`` → 200 k optimizer steps
  total.
- ``accumulate_grad_batches=4`` with ``batch_size=1`` per GPU → effective
  batch size 4 × num_gpus.
- ``checkpoint.monitor=val_loss_total`` with ``mode=min``, ``save_top_k=3``.
- Image previews on the ``aff_r1`` head every 10 epochs.

Outputs land in ``outputs/mitoem30{r,h,hr}_mednext_sdt_multitask//``. Monitor
with TensorBoard:

.. code-block:: bash

   just tensorboard mitoem30r_mednext_sdt_multitask

3 - Inference, decoding, evaluation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Run the combined ``test`` mode:

.. code-block:: bash

   python scripts/main.py --config tutorials/mitoEM/R.yaml \
       --mode test \
       --checkpoint outputs/mitoem30r_mednext_sdt_multitask//checkpoints/last.ckpt

What happens, in order:
1. **Inference**. Sliding window 32 × 256 × 256, 50 % overlap, bump
   blending, ``padding_mode=replicate``. The primary head ``aff_r1`` is
   selected at save time, and a per-channel sigmoid is applied. The raw
   7-channel multi-head prediction is saved as ``test_im_prediction.h5``.
2. **Decoding**. ``decode_distance_watershed`` runs on the EDT channel
   (channel 6): voxels with distance > 0.5 seed the watershed, regions grow
   until the distance reaches 0, and seeds < 100 voxels and instances
   < 50 voxels are filtered out. The fast EDT path is enabled with
   ``edt_parallel=8``.
3. **Evaluation**. Adapted Rand and Variation of Information against the
   test labels; results are written next to the segmentation.

To swap in the ``aff_r5`` head as the primary inference output:

.. code-block:: bash

   python scripts/main.py --config tutorials/mitoEM/R.yaml \
       --mode test --checkpoint \
       inference.model.head=aff_r5

4 - Submitting to the MitoEM Challenge
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The Grand Challenge accepts segmentation HDF5 volumes. After ``--mode test``
produces the segmentation under ``outputs/.../results_step=/``, follow the
formatting rules at https://mitoem.grand-challenge.org/ and submit.
Performance on the challenge test split is only computable on the Grand
Challenge website because public ground truth is not released for that split.

Per-volume offline evaluation on the **validation** split (provided in
``EM30-{R,H}/mito_val-v2.h5``) uses the same ``adapted_rand + voi`` metrics
described above; just point ``test.data.test`` at the val volumes.

5 - Reference behavior
^^^^^^^^^^^^^^^^^^^^^^

A few sanity-check signals:

- **Training loss** has three components (``aff_r1``, ``aff_r5``, ``sdt``)
  with uncertainty-balanced weights. The ``train_loss_term_*_weighted``
  scalars logged in TensorBoard are the most informative — uncertainty
  balancing typically pushes the ``aff_r5`` term down faster than
  ``aff_r1`` because the long-range task is harder.
- **Validation loss** is checked at every epoch boundary; the best-3
  checkpoints by ``val_loss_total`` are kept.
- **Inference** on the 4096 × 4096 × 500 test volume is the dominant cost;
  expect roughly 1-2 hours on a single A100/H100 with ``sw_batch_size=1``.
- **Decoder threshold** (``distance_threshold[0]``) is the primary knob for
  over- / under-segmentation. The default 0.5 is a reasonable starting
  point; a lower value (e.g. 0.3) yields more seeds.
- **Adapted Rand** below ~0.05 on the validation split is in the ballpark of
  the published MitoEM-Rat baseline. The challenge uses AP-75 (average
  precision at IoU 0.75), which is computed by the Grand Challenge
  submission system.
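The decoding step of section 3 can be approximated in plain NumPy: threshold the distance map for seeds, drop tiny seed components, then grow labels outward in order of decreasing distance until the growth threshold is reached. This is only a toy 6-connected sketch of the idea, not PyTC's ``decode_distance_watershed`` (which uses a fast parallel EDT path); the function and its defaults here are illustrative, while the tutorial config corresponds to ``seed_thresh=0.5``, ``grow_thresh=0``, ``min_seed_size=100``, ``min_instance_size=50``.

```python
import heapq
import numpy as np

def distance_watershed(dist, seed_thresh=0.5, grow_thresh=0.0,
                       min_seed_size=2, min_instance_size=2):
    """Toy seeded watershed on a predicted distance map (6-connectivity)."""
    def neighbors(p):
        for ax in range(dist.ndim):
            for d in (-1, 1):
                q = list(p)
                q[ax] += d
                if 0 <= q[ax] < dist.shape[ax]:
                    yield tuple(q)

    # 1) seeds: connected components of dist > seed_thresh, small ones dropped
    seg = np.zeros(dist.shape, dtype=np.int32)
    visited = np.zeros(dist.shape, dtype=bool)
    mask = dist > seed_thresh
    next_id = 0
    for start in zip(*np.nonzero(mask)):
        if visited[start]:
            continue
        next_id += 1
        stack, comp = [start], [start]
        visited[start] = True
        while stack:
            for q in neighbors(stack.pop()):
                if mask[q] and not visited[q]:
                    visited[q] = True
                    stack.append(q)
                    comp.append(q)
        if len(comp) >= min_seed_size:
            for q in comp:
                seg[q] = next_id

    # 2) grow seeds in decreasing-distance order until dist <= grow_thresh
    heap = [(-dist[p], p) for p in zip(*np.nonzero(seg))]
    heapq.heapify(heap)
    while heap:
        _, p = heapq.heappop(heap)
        for q in neighbors(p):
            if seg[q] == 0 and dist[q] > grow_thresh:
                seg[q] = seg[p]
                heapq.heappush(heap, (-dist[q], q))

    # 3) remove instances smaller than min_instance_size
    ids, counts = np.unique(seg, return_counts=True)
    for i, c in zip(ids, counts):
        if i and c < min_instance_size:
            seg[seg == i] = 0
    return seg
```

On a 1D profile with two distance peaks separated by a zero-distance valley, the sketch yields two instances that stop at the valley, which is exactly why ``distance_threshold[0]`` controls over- vs. under-segmentation: lowering it merges seeds that a higher threshold keeps apart.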
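For quick offline sanity checks against the ~0.05 ballpark, Adapted Rand error can be computed in a few lines of NumPy. The sketch below uses the common pair-counting definition (1 minus the F-score of pairwise precision and recall, as in e.g. ``skimage.metrics.adapted_rand_error``) and ignores ground-truth background (id 0); both conventions are assumptions here and may differ in detail from PyTC's built-in ``adapted_rand`` metric.

```python
import numpy as np

def adapted_rand_error(seg, gt):
    """Adapted Rand error = 1 - F-score of pairwise precision/recall.
    Counts ordered voxel pairs excluding self-pairs (hence the -n terms);
    ground-truth background (id 0) is ignored -- an assumed convention."""
    keep = gt > 0
    s = seg[keep].ravel().astype(np.int64)
    g = gt[keep].ravel().astype(np.int64)
    n = s.size
    # contingency counts over (seg id, gt id) co-occurrences
    _, pair_counts = np.unique(np.stack([s, g]), axis=1, return_counts=True)
    sum_p2 = float((pair_counts ** 2).sum()) - n
    sum_s2 = float((np.unique(s, return_counts=True)[1] ** 2).sum()) - n
    sum_g2 = float((np.unique(g, return_counts=True)[1] ** 2).sum()) - n
    precision, recall = sum_p2 / sum_s2, sum_p2 / sum_g2
    return 1.0 - 2.0 * precision * recall / (precision + recall)
```

A perfect segmentation scores 0; merging two equal-sized ground-truth objects into one predicted object scores 0.5 under this definition, which makes the metric's sensitivity to false merges easy to see.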