Configuration System
=====================
.. note::

   **PyTorch Connectomics v2.0** uses **Hydra/OmegaConf** as its configuration system.

PyTorch Connectomics uses a flexible, type-safe configuration system built on
`Hydra <https://hydra.cc/>`_ and `OmegaConf <https://omegaconf.readthedocs.io/>`_.
Configuration files are written in YAML and support CLI overrides, composition, and type checking.
Quick Start
-----------
**Basic training:**
.. code-block:: bash
# Train with a config file
python scripts/main.py --config tutorials/minimal.yaml
# Override config from CLI
python scripts/main.py --config tutorials/minimal.yaml \
default.data.dataloader.batch_size=4 \
train.optimization.max_epochs=200
**Python API:**
.. code-block:: python
from connectomics.config import load_config
from omegaconf import OmegaConf
# Load config
cfg = load_config("tutorials/minimal.yaml")
# Access values
print(cfg.model.arch.type) # 'monai_basic_unet3d'
print(cfg.data.dataloader.batch_size) # 1
# Modify values
cfg.data.dataloader.batch_size = 4
# Print entire config
print(OmegaConf.to_yaml(cfg, resolve=True))
Configuration Structure
-----------------------
A typical v2.0 config file has a ``default`` section plus stage-specific
overrides such as ``train`` and ``test``:
.. code-block:: yaml
experiment_name: example
default:
system:
num_gpus: 1
num_workers: 4
seed: 42
model:
arch:
type: monai_basic_unet3d
in_channels: 1
out_channels: 1
input_size: [64, 128, 128]
output_size: [64, 128, 128]
loss:
losses:
- function: DiceLoss
weight: 1.0
data:
dataloader:
batch_size: 2
patch_size: [64, 128, 128]
train:
data:
train:
image: datasets/example/train_image.h5
label: datasets/example/train_label.h5
val:
image: datasets/example/val_image.h5
label: datasets/example/val_label.h5
optimization:
max_epochs: 100
precision: "16-mixed"
optimizer:
name: AdamW
lr: 1e-4
monitor:
checkpoint:
monitor: train_loss_total_epoch
save_top_k: 3
save_last: true
Configuration Sections
----------------------
System Configuration
^^^^^^^^^^^^^^^^^^^^
Controls hardware and reproducibility:
.. code-block:: yaml
system:
num_gpus: 1 # Number of GPUs (0 for CPU)
num_cpus: 4 # Number of CPU workers
seed: 42 # Random seed for reproducibility
deterministic: false # Use deterministic algorithms (slower)
Model Configuration
^^^^^^^^^^^^^^^^^^^
Specifies model architecture and loss functions:
.. code-block:: yaml
model:
arch:
type: monai_basic_unet3d # Model architecture
in_channels: 1 # Input channels
out_channels: 2 # Output channels
monai:
filters: [32, 64, 128, 256] # Filter sizes per level
dropout: 0.1 # Dropout rate
# Loss functions
loss:
deep_supervision: true
losses:
- function: DiceLoss
weight: 1.0
- function: BCEWithLogitsLoss
weight: 1.0
# Optional: architecture-specific nested blocks
mednext:
size: S
**Available architectures:**
- ``monai_basic_unet3d``: Simple and fast 3D U-Net
- ``monai_unet``: U-Net with residual units
- ``monai_unetr``: Transformer-based UNETR
- ``monai_swin_unetr``: Swin Transformer U-Net
- ``mednext``: MedNeXt with predefined sizes (S/B/M/L)
- ``mednext_custom``: MedNeXt with custom parameters
**Available loss functions:**
- ``DiceLoss``: Soft Dice loss
- ``FocalLoss``: Focal loss for class imbalance
- ``TverskyLoss``: Tversky loss
- ``DiceCELoss``: Combined Dice + Cross-Entropy
- ``BCEWithLogitsLoss``: Binary cross-entropy
- ``CrossEntropyLoss``: Multi-class cross-entropy
Data Configuration
^^^^^^^^^^^^^^^^^^
Specifies data paths and loading parameters:
.. code-block:: yaml
data:
# Data paths
train:
image: "path/to/train_image.h5"
label: "path/to/train_label.h5"
val:
image: "path/to/val_image.h5"
label: "path/to/val_label.h5"
test:
image: "path/to/test_image.h5" # Optional
dataloader:
patch_size: [128, 128, 128]
batch_size: 2
persistent_workers: true
pin_memory: true
# Augmentation
augmentation:
profile: aug_standard
Optimizer Configuration
^^^^^^^^^^^^^^^^^^^^^^^
Specifies optimizer type and hyperparameters:
.. code-block:: yaml
optimization:
optimizer:
name: AdamW # Optimizer type
lr: 1e-4 # Learning rate
weight_decay: 1e-4 # Weight decay (L2 regularization)
# Optimizer-specific params
betas: [0.9, 0.999] # For Adam/AdamW
momentum: 0.9 # For SGD
**Supported optimizers:**
- ``Adam``, ``AdamW``, ``SGD``, ``RMSprop``, ``Adagrad``
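
As a rough illustration, the ``optimizer`` block could be resolved into a ``torch.optim``
optimizer along the following lines. This is a sketch only; the library's actual factory
may differ, and ``model`` here is a stand-in module.

.. code-block:: python

    import torch
    from omegaconf import OmegaConf

    opt_cfg = OmegaConf.create(
        {"name": "AdamW", "lr": 1e-4, "weight_decay": 1e-4, "betas": [0.9, 0.999]}
    )
    model = torch.nn.Conv3d(1, 1, kernel_size=3)  # stand-in for the real network

    optimizer_cls = getattr(torch.optim, opt_cfg.name)  # e.g. torch.optim.AdamW
    optimizer = optimizer_cls(
        model.parameters(),
        lr=opt_cfg.lr,
        weight_decay=opt_cfg.weight_decay,
        betas=tuple(opt_cfg.betas),
    )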
Scheduler Configuration
^^^^^^^^^^^^^^^^^^^^^^^
Specifies learning rate scheduling:
.. code-block:: yaml
optimization:
scheduler:
name: CosineAnnealingLR
warmup_epochs: 5
min_lr: 1e-6
# Scheduler-specific params
params:
T_max: 100
**Supported schedulers:**
- ``CosineAnnealingLR``, ``StepLR``, ``ExponentialLR``, ``ReduceLROnPlateau``
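
For orientation, the ``scheduler`` block maps onto a standard ``torch.optim.lr_scheduler``
class; the sketch below shows the ``CosineAnnealingLR`` case, with ``min_lr`` passed as
``eta_min``. Warmup handling is library-specific and omitted here.

.. code-block:: python

    import torch

    optimizer = torch.optim.AdamW(
        torch.nn.Conv3d(1, 1, kernel_size=3).parameters(), lr=1e-4
    )
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
        optimizer,
        T_max=100,     # from scheduler.params.T_max
        eta_min=1e-6,  # from scheduler.min_lr
    )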
Training Configuration
^^^^^^^^^^^^^^^^^^^^^^
Controls training loop parameters:
.. code-block:: yaml
optimization:
max_epochs: 100
precision: "16-mixed" # "32", "16-mixed", "bf16-mixed"
gradient_clip_val: 1.0
accumulate_grad_batches: 1 # Gradient accumulation
val_check_interval: 1.0 # Validation frequency
Command Line Overrides
-----------------------
Override any config value from the command line:
.. code-block:: bash
# Override single values
python scripts/main.py --config tutorials/minimal.yaml \
default.data.dataloader.batch_size=4
# Override multiple values
python scripts/main.py --config tutorials/minimal.yaml \
default.data.dataloader.batch_size=4 \
train.optimization.max_epochs=200 \
train.optimization.optimizer.lr=1e-3
# Override nested values
python scripts/main.py --config tutorials/minimal.yaml \
default.model.monai.filters=[64,128,256,512]
# Add new values
python scripts/main.py --config tutorials/minimal.yaml \
+description="debug run"
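
The same key=value overrides can be reproduced in Python via OmegaConf's dotlist parser.
This is a sketch of the idea; the script's exact override mechanism may differ.

.. code-block:: python

    from connectomics.config import load_config
    from omegaconf import OmegaConf

    cfg = load_config("tutorials/minimal.yaml")
    overrides = OmegaConf.from_dotlist([
        "default.data.dataloader.batch_size=4",
        "train.optimization.max_epochs=200",
    ])
    cfg = OmegaConf.merge(cfg, overrides)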
Multiple Loss Functions
------------------------
Combine multiple loss functions with different weights:
.. code-block:: yaml
model:
loss:
losses:
- function: DiceLoss
weight: 1.0
- function: BCEWithLogitsLoss
weight: 1.0
- function: FocalLoss
weight: 0.5
The total loss is computed as:
.. code-block:: python
total_loss = (1.0 * dice_loss +
1.0 * bce_loss +
0.5 * focal_loss)
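
For reference, a runnable version of the same weighted sum with concrete loss modules
might look like the sketch below; the tensor shapes and loss settings are illustrative,
and the library assembles the equivalent internally from the config.

.. code-block:: python

    import torch
    from monai.losses import DiceLoss, FocalLoss

    pred = torch.randn(2, 1, 32, 64, 64)                      # raw logits
    target = torch.randint(0, 2, (2, 1, 32, 64, 64)).float()  # binary labels

    dice = DiceLoss(sigmoid=True)(pred, target)
    bce = torch.nn.BCEWithLogitsLoss()(pred, target)
    focal = FocalLoss()(pred, target)

    total_loss = 1.0 * dice + 1.0 * bce + 0.5 * focal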
Deep Supervision
----------------
Enable multi-scale loss computation for improved training:
.. code-block:: yaml
model:
arch:
type: mednext
loss:
deep_supervision: true
losses:
- function: DiceLoss
weight: 1.0
Deep supervision automatically:
- Computes losses at multiple scales (5 scales for MedNeXt)
- Resizes ground truth to match each scale
- Averages losses across scales
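
Conceptually, this amounts to the sketch below: compute the loss at each decoder
resolution against a resized ground truth, then average. The library's actual scale
count and weighting may differ.

.. code-block:: python

    import torch
    import torch.nn.functional as F
    from monai.losses import DiceLoss

    loss_fn = DiceLoss(sigmoid=True)
    target = torch.randint(0, 2, (1, 1, 64, 64, 64)).float()

    # Pretend the decoder returns predictions at several resolutions.
    preds = [torch.randn(1, 1, 64 // 2**i, 64 // 2**i, 64 // 2**i) for i in range(3)]

    losses = []
    for p in preds:
        t = F.interpolate(target, size=p.shape[2:], mode="nearest")  # resize GT to this scale
        losses.append(loss_fn(p, t))
    total = torch.stack(losses).mean()  # average across scales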
MedNeXt Configuration
---------------------
**Predefined sizes:**
.. code-block:: yaml
model:
arch:
type: mednext
mednext:
size: S # S, B, M, or L
kernel_size: 3 # 3, 5, or 7
in_channels: 1
out_channels: 2
loss:
deep_supervision: true
**Custom configuration:**
.. code-block:: yaml
model:
arch:
type: mednext_custom
mednext:
base_channels: 32
exp_r: [2, 3, 4, 4, 4, 4, 4, 3, 2]
block_counts: [3, 4, 8, 8, 8, 8, 8, 4, 3]
kernel_size: 7
grn: true
loss:
deep_supervision: true
See ``.claude/MEDNEXT.md`` in the repository for details.
2D Configuration
----------------
For 2D segmentation tasks:
.. code-block:: yaml
data:
train:
do_2d: true
dataloader:
patch_size: [1, 256, 256] # [D, H, W] - D=1 for 2D
Mixed Precision Training
------------------------
Use mixed precision for faster training and reduced memory:
.. code-block:: yaml
optimization:
precision: "16-mixed" # FP16 mixed precision
# Or for BFloat16 (requires Ampere+ GPUs)
optimization:
precision: "bf16-mixed"
Distributed Training
--------------------
Distributed training (DDP) is used automatically when multiple GPUs are requested:
.. code-block:: yaml
system:
num_gpus: 4 # Uses DDP automatically
data:
dataloader:
batch_size: 2 # Per-GPU batch size
Effective batch size = ``num_gpus * batch_size = 4 * 2 = 8``
Gradient Accumulation
---------------------
Simulate larger batch sizes:
.. code-block:: yaml
data:
dataloader:
batch_size: 2
optimization:
accumulate_grad_batches: 4
Effective batch size = ``batch_size * accumulate_grad_batches = 2 * 4 = 8``
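
The precision, multi-GPU, and accumulation options above correspond to standard
PyTorch Lightning ``Trainer`` arguments (an assumed mapping; the library constructs
its trainer internally):

.. code-block:: python

    import pytorch_lightning as pl

    trainer = pl.Trainer(
        precision="16-mixed",       # or "bf16-mixed" on Ampere+ GPUs
        devices=4,                  # system.num_gpus; DDP is selected automatically
        strategy="ddp",
        accumulate_grad_batches=4,  # effective batch = batch_size * devices * 4
        max_epochs=100,
    )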
Checkpointing and Logging
--------------------------
**Model checkpointing:**
.. code-block:: yaml
monitor:
checkpoint:
monitor: "val/loss"
mode: "min" # "min" or "max"
save_top_k: 3 # Keep best 3 checkpoints
save_last: true # Also save last checkpoint
filename: "epoch{epoch:02d}-loss{val/loss:.2f}"
**Early stopping:**
.. code-block:: yaml
monitor:
early_stopping:
enabled: true
monitor: "val/loss"
patience: 10
mode: "min"
min_delta: 0.0
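
These ``monitor`` blocks roughly correspond to Lightning's ``ModelCheckpoint`` and
``EarlyStopping`` callbacks. The mapping below is shown for orientation only and is
assumed rather than taken from the library source.

.. code-block:: python

    from pytorch_lightning.callbacks import ModelCheckpoint, EarlyStopping

    checkpoint_cb = ModelCheckpoint(
        monitor="val/loss",
        mode="min",
        save_top_k=3,
        save_last=True,
    )
    early_stop_cb = EarlyStopping(
        monitor="val/loss",
        mode="min",
        patience=10,
        min_delta=0.0,
    )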
**Logging:**
.. code-block:: yaml
monitor:
logging:
scalar:
loss_every_n_steps: 10
wandb:
use_wandb: false
project: "connectomics"
entity: "your_team"
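
Setting ``use_wandb: true`` plausibly attaches a Weights & Biases logger along these
lines (assumed; ``project`` and ``entity`` mirror the YAML keys above):

.. code-block:: python

    from pytorch_lightning.loggers import WandbLogger

    logger = WandbLogger(project="connectomics", entity="your_team")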
Configuration in Python
-----------------------
**Load and modify configs:**
.. code-block:: python
from connectomics.config import load_config, save_config
from omegaconf import OmegaConf
# Load config
cfg = load_config("tutorials/minimal.yaml")
# Access values
print(cfg.model.arch.type)
print(cfg.data.dataloader.batch_size)
# Modify values
cfg.data.dataloader.batch_size = 4
cfg.optimization.max_epochs = 200
# Merge configs
overrides = OmegaConf.create({
"data": {"dataloader": {"batch_size": 8}},
"optimization": {"optimizer": {"lr": 1e-3}}
})
cfg = OmegaConf.merge(cfg, overrides)
# Save config
save_config(cfg, "modified_config.yaml")
# Print config
print(OmegaConf.to_yaml(cfg, resolve=True))
**Create configs programmatically:**
.. code-block:: python
from omegaconf import OmegaConf
cfg = OmegaConf.create({
"system": {"num_gpus": 1, "seed": 42},
"model": {
"arch": {"type": "monai_unet"},
"in_channels": 1,
"out_channels": 2
},
"data": {
"dataloader": {
"batch_size": 2,
"patch_size": [128, 128, 128]
}
}
})
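
The type checking mentioned at the top of this page can be illustrated with OmegaConf
structured configs; the dataclass below is a toy schema for one sub-section, not the
library's actual schema.

.. code-block:: python

    from dataclasses import dataclass, field
    from typing import List

    from omegaconf import OmegaConf

    @dataclass
    class DataloaderConfig:
        batch_size: int = 2
        patch_size: List[int] = field(default_factory=lambda: [128, 128, 128])

    schema = OmegaConf.structured(DataloaderConfig)
    cfg = OmegaConf.merge(schema, {"batch_size": 4})          # type-checked merge
    # OmegaConf.merge(schema, {"batch_size": "not-an-int"})   # raises a ValidationError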
Inference Configuration
-----------------------
Most of a training config can be reused for inference; the key differences are:
.. code-block:: yaml
# inference_config.yaml
model:
arch:
type: monai_unet
# ... same as training
data:
test:
image: "path/to/test.h5"
dataloader:
patch_size: [128, 128, 128]
batch_size: 4 # Can use larger batch size
inference:
output_path: "predictions/"
sliding_window:
overlap: 0.5
blend_mode: gaussian
test_time_augmentation:
enabled: false
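
The ``sliding_window`` settings map naturally onto MONAI's ``sliding_window_inference``,
which is assumed here to be the underlying mechanism; the model and volume below are
stand-ins with illustrative shapes.

.. code-block:: python

    import torch
    from monai.inferers import sliding_window_inference

    model = torch.nn.Conv3d(1, 1, kernel_size=3, padding=1)  # stand-in network
    volume = torch.randn(1, 1, 128, 256, 256)                # (B, C, D, H, W)

    pred = sliding_window_inference(
        inputs=volume,
        roi_size=(128, 128, 128),  # dataloader.patch_size
        sw_batch_size=4,           # dataloader.batch_size
        predictor=model,
        overlap=0.5,               # sliding_window.overlap
        mode="gaussian",           # sliding_window.blend_mode
    )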
**Run inference:**
.. code-block:: bash
python scripts/main.py \
--config inference_config.yaml \
--mode test \
--checkpoint outputs/best.ckpt
Configuration Examples
----------------------
See the ``tutorials/`` directory for complete examples:
- ``tutorials/minimal.yaml``: minimal MONAI smoke config
- ``tutorials/mito_lucchi++.yaml``: mitochondria segmentation
- ``tutorials/neuron_snemi/neuron_snemi_sdt.yaml``: MedNeXt SNEMI config
Best Practices
--------------
1. **Use version control** for config files
2. **Document** non-obvious parameter choices
3. **Start simple** with basic configs, then customize
4. **Save configs** with experiment outputs for reproducibility
5. **Use meaningful names** for experiments
6. **Validate configs** before long training runs
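
To act on the last point, a minimal pre-flight check might look like the sketch below;
the required keys listed here are illustrative, not an exhaustive schema.

.. code-block:: python

    from connectomics.config import load_config
    from omegaconf import OmegaConf

    cfg = load_config("tutorials/minimal.yaml")

    for key in ("model.arch.type", "data.dataloader.batch_size"):
        assert OmegaConf.select(cfg, key) is not None, f"missing config key: {key}"

    print(OmegaConf.to_yaml(cfg, resolve=True))  # eyeball the fully resolved config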
For more information:

- `Hydra documentation <https://hydra.cc/>`_
- `OmegaConf documentation <https://omegaconf.readthedocs.io/>`_
- ``.claude/CLAUDE.md`` in the repository