NISB
======

.. include:: _intro.rst

This tutorial reproduces the BANIS ``train_base_long`` neuron-segmentation result on the NISB dataset using ``tutorials/neuron_nisb/base_banis.yaml``. The target is a 9 nm, 6-channel affinity model (MedNeXt-S / kernel 3) trained for 200 k steps on 128-cube patches, decoded with affinity-threshold connected components, and evaluated with the NERL skeleton metric.

Reference:

- Codebase: ``StructuralNeurobiologyLab/banis`` — the upstream BANIS pipeline this configuration mirrors. Affinity offset semantics (``affinity_mode: banis``, source-voxel storage, ``edge_offset: 0``), patched-inference geometry (128³ windows, 50 % overlap, ``snap_to_edge`` boundary handling, distance-transform blending), and the ``train_base_long`` schedule are all matched against this repo.

If you need a different starting point, the same directory ships several variants:

- ``base_banis.yaml`` — the canonical reproduction (this page).
- ``base_banis_v1.yaml`` / ``v2.yaml`` / ``v3.yaml`` — successive variations of the base config (different model size / target shape).
- ``*_erosion2.yaml`` — the same configs with label erosion 2 to widen instance borders.
- ``base_banis_chunk.yaml`` / ``base_banis_crop.yaml`` — chunked / cropped data variants for low-memory machines.
- ``common.yaml`` — MedNeXt-B with affinity + signed distance transform at 40 nm; an alternative starting point, *not* the BANIS reproduction.

Goal
----

Match the BANIS ``train_base_long`` setup faithfully:

- 9 nm isotropic-XY input, ``[128, 128, 128]`` patches.
- 6-channel affinity target: short-range ``(1, 0, 0) / (0, 1, 0) / (0, 0, 1)`` plus long-range at 10 voxels, in the source-voxel BANIS convention.
- MedNeXt-S, kernel size 3, ``WeightedBCEWithLogitsLoss``, no deep supervision.
- AdamW @ ``lr=1e-3``, ``weight_decay=0.01``, cosine to ``min_lr=0`` over 200 000 steps, ``precision=16-mixed``, ``batch_size=4`` per GPU.
- Sliding-window inference with 128-cube windows, 50 % overlap, distance-transform blending, snap-to-edge boundary handling. Output is ``float16``.
- Decoder: ``decode_affinity_cc`` (numba) at threshold 0.75, using the short-range affinities (channels 0–2) only.
- Metric: ``nerl`` against ``skeleton.pkl``.

Each of these is encoded directly in ``tutorials/neuron_nisb/base_banis.yaml``; do not change them in passing.

1 - Get the data
^^^^^^^^^^^^^^^^^^

The NISB ``base`` split is staged on the lab cluster at:

.. code-block:: text

   /projects/weilab/dataset/nisb/base/
       train/seed{0..N}/data.zarr
       val/seed100/data.zarr
       test/seed101/data.zarr  (+ skeleton.pkl)

Each ``data.zarr`` is XYZ-ordered (matching upstream BANIS); the config keeps it that way, and the decoder consumes channels in axis-0, axis-1, axis-2 order. ``skeleton.pkl`` is required for NERL evaluation on ``test/seed101``.

If you are working off-cluster, edit the ``data.train.path`` / ``data.val.path`` / ``data.test.path`` entries in ``base_banis.yaml`` (under the ``train:``, ``test:``, and ``tune:`` sections) to point at your local copy.
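Before editing those entries, you can sanity-check a local mirror with a short inspection script. This is a sketch only: it assumes each ``data.zarr`` opens as a zarr group (zarr-python v2 API), and the array key names inside are not specified by this tutorial, so list them first and adjust accordingly.

.. code-block:: python

   # Inspect a local mirror of the NISB layout before editing the config paths.
   # Assumes data.zarr opens as a zarr group; adjust paths to your copy.
   import pickle

   import zarr

   root = zarr.open("/projects/weilab/dataset/nisb/base/val/seed100/data.zarr", mode="r")
   for name, arr in root.arrays():
       print(name, arr.shape, arr.dtype)  # shapes should read XYZ-ordered

   # skeleton.pkl holds the ground-truth skeletons consumed by the NERL metric.
   with open("/projects/weilab/dataset/nisb/base/test/seed101/skeleton.pkl", "rb") as f:
       skeletons = pickle.load(f)
   print(type(skeletons))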
2 - Run training
^^^^^^^^^^^^^^^^^^

.. code-block:: bash

   conda activate pytc
   python scripts/main.py --config tutorials/neuron_nisb/base_banis.yaml

The config sets ``system.num_gpus: -1`` and ``system.num_workers: -1``, so PyTC will fan out across every visible GPU and use the auto-planned worker count. Override at the CLI if needed:

.. code-block:: bash

   python scripts/main.py --config tutorials/neuron_nisb/base_banis.yaml \
       system.num_gpus=4 data.dataloader.batch_size=2

The training schedule is **step-based**, not epoch-based:

- ``max_steps: 200000``, ``n_steps_per_epoch: 5000``, ``val_check_interval: 5000`` — validation every 5 k steps.
- ``val_steps_per_epoch: 100`` — matches BANIS ``limit_val_batches=100``.
- Cosine LR over 200 k steps to 0.
- Checkpoints saved every 50 k steps to ``outputs/nisb_base_banis/<run>/checkpoints/``.

Monitor with TensorBoard:

.. code-block:: bash

   just tensorboard nisb_base_banis

3 - Inference, decoding, evaluation
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Run the combined ``test`` mode with the trained checkpoint. This invokes the inference, decoding, and evaluation stages back-to-back:

.. code-block:: bash

   python scripts/main.py --config tutorials/neuron_nisb/base_banis.yaml \
       --mode test \
       --checkpoint outputs/nisb_base_banis/<run>/checkpoints/last.ckpt

What happens, in order:

1. **Inference** (``connectomics.inference.stage``). Runs the BANIS-style sliding window: 128³ windows, 50 % overlap, ``snap_to_edge`` boundary handling, distance-transform blending, ``sw_device=cuda``, ``output_device=cpu``. Saves the raw 6-channel affinity as ``test_im_prediction.h5`` under ``outputs/nisb_base_banis/<run>/results_step=<step>/``. Output dtype is ``float16``.
2. **Decoding** (``connectomics.decoding.stage``). Selects the short-range affinities (channels 0–2 in XYZ order), optionally masks them by ``affinity_mask_path``, then runs ``decode_affinity_cc`` via the numba backend at threshold 0.75 with ``edge_offset: 0`` (the BANIS source-voxel convention; a reference sketch of this decode closes this page).
3. **Evaluation** (``connectomics.evaluation.stage``). Computes NERL against ``test/seed101/skeleton.pkl`` and writes the metrics file alongside the segmentation.

If you do not yet have the affinity mask referenced under ``decoding.affinity_mask_path``, drop or override that line — the mask is optional, and the decoder will treat all voxels as valid:

.. code-block:: bash

   python scripts/main.py --config tutorials/neuron_nisb/base_banis.yaml \
       --mode test --checkpoint <checkpoint> \
       decoding.affinity_mask_path=null

4 - Tune the decoder threshold
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The decoder threshold (0.75 in the canonical config) is the single most impactful parameter. ``--mode tune`` runs an Optuna search on the held-out *training* seed (``seed0``) using the ``cupy`` backend for speed:

.. code-block:: bash

   python scripts/main.py --config tutorials/neuron_nisb/base_banis.yaml \
       --mode tune \
       --checkpoint outputs/nisb_base_banis/<run>/checkpoints/last.ckpt

Configuration (under the ``tune:`` block):

- TPE sampler with 4 startup trials, 4 total trials.
- Search space: ``threshold ∈ [0.4, 0.9]``, step 0.1.
- Single objective: maximize ``nerl``.
- Study persisted as ``nisb_base_banis_cc_tuning``; ``load_if_exists: true``, so subsequent runs append trials.
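In standalone Optuna terms, that search corresponds to the sketch below. Only the sampler, search space, trial counts, study name, and objective direction mirror the config; the sqlite storage URL and the ``run_decode_and_nerl`` helper are illustrative assumptions (``--mode tune`` wires up the real decode and NERL stages for you).

.. code-block:: python

   import optuna

   def run_decode_and_nerl(threshold: float) -> float:
       """Hypothetical stand-in: decode train/seed0 at `threshold` and
       return its NERL score. In PyTC, --mode tune supplies this logic."""
       raise NotImplementedError

   def objective(trial: optuna.Trial) -> float:
       # Search space: threshold in [0.4, 0.9], step 0.1.
       threshold = trial.suggest_float("threshold", 0.4, 0.9, step=0.1)
       return run_decode_and_nerl(threshold)

   study = optuna.create_study(
       study_name="nisb_base_banis_cc_tuning",
       storage="sqlite:///nisb_base_banis_cc_tuning.db",  # assumed backend
       direction="maximize",  # single objective: maximize nerl
       sampler=optuna.samplers.TPESampler(n_startup_trials=4),
       load_if_exists=True,  # subsequent runs append trials
   )
   study.optimize(objective, n_trials=4)  # 4 total trials
   print(study.best_params)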
5 - Reference behavior
^^^^^^^^^^^^^^^^^^^^^^^^

A few sanity-check signals during reproduction:

- **Training loss** (``train_loss_total_epoch``) should drop steadily through the first ~50 k steps; the cosine LR makes the late phase a long, slow refinement rather than another sharp drop.
- **Validation loss** is reported at 5 k-step intervals. The default checkpoint monitor is ``val_loss_total`` (mode ``min``), with the top 3 checkpoints saved.
- **Inference** with ``sw_batch_size: 8``, 50 % overlap, and ``output_dtype: float16`` dominates the wall-clock cost; expect minutes per ``seed`` volume on a single A100/H100, hours on a single L40S.
- **Decode threshold** has a clean unimodal curve over ``[0.4, 0.9]`` on validation NERL; the canonical 0.75 is a reasonable starting point, but it is worth re-running ``--mode tune`` per checkpoint.

For the underlying mechanics (affinity learning, watershed-style post-processing), see :doc:`snemi3d` — the same affinity-then-decode pipeline applies.
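As a concrete illustration of the decode half of that pipeline, the sketch below shows what affinity-threshold connected components compute under the BANIS source-voxel convention (``edge_offset: 0``, channels in axis-0/1/2 order). It is a slow NumPy/SciPy reference of the semantics only, not the numba-backed ``decode_affinity_cc``; unlike the production path, it labels every voxel (background included) and ignores the optional affinity mask.

.. code-block:: python

   import numpy as np
   from scipy.sparse import coo_matrix
   from scipy.sparse.csgraph import connected_components

   def affinity_cc_reference(aff: np.ndarray, threshold: float = 0.75) -> np.ndarray:
       """Connected components over thresholded short-range affinities.

       aff: (3, X, Y, Z) float array. In the source-voxel convention with
       edge_offset 0, aff[c] at voxel v scores the edge v -> v + offsets[c].
       """
       offsets = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
       shape = aff.shape[1:]
       idx = np.arange(np.prod(shape)).reshape(shape)
       rows, cols = [], []
       for c, (dx, dy, dz) in enumerate(offsets):
           # Pair each source voxel with its neighbor one step along the offset.
           src = idx[: shape[0] - dx, : shape[1] - dy, : shape[2] - dz]
           dst = idx[dx:, dy:, dz:]
           # An edge is "on" when its affinity clears the threshold.
           on = aff[c, : shape[0] - dx, : shape[1] - dy, : shape[2] - dz] > threshold
           rows.append(src[on])
           cols.append(dst[on])
       rows, cols = np.concatenate(rows), np.concatenate(cols)
       graph = coo_matrix((np.ones(len(rows)), (rows, cols)), shape=(idx.size, idx.size))
       _, labels = connected_components(graph, directed=False)
       return labels.reshape(shape)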