Axis and Channel Order
======================

Spatial axis order (``xyz`` vs ``zyx``) is the single piece of metadata most
likely to silently corrupt a connectomics pipeline. Different consumers in this
stack disagree by convention, and affinity channel order is *coupled* to spatial
order through the affinity offsets; it is not an independent knob.

This page is the convention reference. The normative rules live in code
docstrings (linked below); this page exists so a new contributor can build the
right mental model in one read.

Two Orders, Two Camps
---------------------

The codebase touches volumes in two orders:

.. list-table::
   :header-rows: 1

   * - Order
     - Used by
   * - ``xyz``: array indexed ``arr[x, y, z]``
     - BANIS zarr on disk; ``seunglab`` / ``abiss`` / ``cc3d``; CloudVolume /
       Neuroglancer precomputed output; ``scripts/h5_to_precomputed_cloud.py``.
   * - ``zyx``: array indexed ``arr[z, y, x]``
     - PyTorch / MONAI training tensors (``(C, Z, Y, X)``); ``waterz`` C++
       kernel.

Neither is "right." They are conventions of the respective ecosystems. The
pipeline must declare which order a given array is in and convert exactly once
at each consumer boundary.

The Coupling: Channel Order Follows Spatial Order
-------------------------------------------------

``connectomics.data.processing.affinity.seg_to_affinity`` is the authoritative
affinity producer in the framework, and it is **positional**: offset
``(a, b, c)`` shifts along the input array's axes ``0``, ``1``, ``2``,
*regardless of how the caller labels those axes*. The returned tensor has shape
``(C, A0, A1, A2)`` and channel ``i`` is the affinity along spatial axis ``i``.

Concretely:

- If the input ``seg`` is stored ``xyz`` (axis 0 = X, axis 1 = Y, axis 2 = Z),
  then the offsets ``["1-0-0", "0-1-0", "0-0-1"]`` produce channels
  ``[X-aff, Y-aff, Z-aff]`` in ``xyz`` order.
- If the input ``seg`` is stored ``zyx`` (axis 0 = Z, axis 1 = Y, axis 2 = X),
  the same offset strings produce channels ``[Z-aff, Y-aff, X-aff]`` in ``zyx``
  order.

The implication is the rule that should never be violated:

**Channel order is not an independent setting. Given the spatial axis order of
the array and the affinity offset list, the channel semantics are determined.**

When a tool exposes a separate ``channel_order`` knob (waterz,
``h5_to_precomputed_cloud.py``), that knob exists to express the
*destination's* expectation; the source order is implicit and must match the
spatial order the array was actually written with.

Consumers and Their Required Orders
-----------------------------------

.. list-table:: What each consumer requires
   :header-rows: 1

   * - Consumer
     - Spatial
     - Channels (for affinity)
     - Notes
   * - PyTorch / Lightning training
     - ``zyx``
     - ``zyx``
     - Tensors are ``(C, Z, Y, X)``. Targets generated by ``seg_to_affinity``
       inherit the input array's order.
   * - ``waterz`` (``decoding.decoders.waterz``)
     - ``zyx``
     - ``zyx``
     - ``channel_order="xyz"`` triggers an internal transpose so both pieces
       are converted together.
   * - ``abiss``, seung-lab fragment/agglomerate
     - ``xyz``
     - ``xyz``
     - No internal reorder.
   * - ``cc3d`` / ``decode_affinity_cc``
     - ``xyz``
     - ``xyz``
     -
   * - CloudVolume precomputed / Neuroglancer
     - ``xyz``
     - ``xyz``
     - Output info layout is ``xyz``.

The split is therefore: training and ``waterz`` are ``zyx``; everything else is
``xyz``.

Where Order Is Currently Declared
---------------------------------

There is **no single declared source of truth** in the codebase today. The V3
refactor surfaced this, and the unification work is tracked in
``.agents/features/xyz-order/plan.md``. The current moving parts are:

- ``connectomics.inference.artifact.PredictionArtifactMetadata``
  (``layout="CZYX"`` and optional ``channel_order``): stamped on the
  raw-prediction artifact.
- ``connectomics.decoding.decoders.waterz.decode_waterz``
  (``channel_order: str = "xyz"``, ``edge_offset: int = 1``): decoder-local
  transpose plus the BANIS source→destination ``np.roll(+1)`` for
  ``edge_offset=0``.
- ``output_transpose`` in ``connectomics.decoding.stage``: manual axis
  permutation applied to decode output.
- The ``axis_order`` HDF5 attribute (currently only ``"XYZ"`` is accepted), read
  in ``connectomics.decoding.stage`` and ``connectomics.decoding.qc.affinity``.
- ``--src-axes`` and ``--reverse-channels`` flags on
  ``scripts/h5_to_precomputed_cloud.py``.

These overlap and can disagree. Until the unification plan lands, the safest
workflow is to **trace the array's order from disk all the way to the consumer
by hand** and convert at exactly one boundary.

Worked Example: BANIS HDF5 → CloudVolume Precomputed
----------------------------------------------------

This is the case that motivated this page. The same reasoning applies to any
external converter.

**Source.** BANIS stores zarr volumes ``xyz`` on disk and reads them without a
spatial transpose. ``seg_to_affinity`` is then called on an ``xyz`` array, so
the produced affinity has shape ``(C, X, Y, Z)`` and channel order
``ch0=X-aff, ch1=Y-aff, ch2=Z-aff``.

**Sink.** CloudVolume precomputed expects ``xyz`` spatial order
``(X, Y, Z, C)`` and ``xyz`` channels.

**Conversion.** With ``scripts/h5_to_precomputed_cloud.py``:

.. code-block:: bash

   # NO --reverse-channels: source channels are already xyz
   python scripts/h5_to_precomputed_cloud.py \
       input.h5 gs://bucket/path/ \
       --src-axes cxyz

``--src-axes cxyz`` transposes ``(C, X, Y, Z) → (X, Y, Z, C)``, preserving the
``X, Y, Z`` spatial order. ``--reverse-channels`` would flip the
already-correct ``xyz`` channels into ``zyx``, producing an apparently valid but
semantically wrong volume that only fails at decode time.

The same script with a ``zyx``-on-disk source (e.g.
a tensor exported directly from a training step) requires ``--src-axes czyx``
**and** ``--reverse-channels`` to land on the same ``xyz``-channel precomputed
output.

The Cost of Getting It Wrong
----------------------------

The May 2026 BANIS upload (Slurm jobs 2505811 / 2505812) ran for ~2 hours per
job, then required a complete re-upload (jobs 2517273 / 2517274, another
~2 hours each) because a misleading ``"z-y-x format"`` string in an unrelated
parser caused ``--reverse-channels`` to be set when it should not have been. The
data was structurally valid at every checkpoint; the bug only surfaced when a
downstream block-read was compared against the source.

This is the failure mode that motivates the unification work: a quietly
inconsistent axis convention produces ~150 GB of plausible garbage, not a crash.

Rules of Thumb
--------------

When reading or writing a volume:

1. **State the order explicitly** in a comment or filename when the array is
   not flowing through a typed container that already carries it.
2. **Do not introduce a separate channel-order knob** for affinity. If you need
   to express channel order, derive it from ``(spatial_axis_order, offsets)``.
3. **Convert at one boundary, with one helper.** If the same array gets
   transposed in two places, one of them is wrong.
4. **Verify against the source** after any large external upload (read back a
   block, compare to the input under the declared transpose, *without*
   reversing channels). This is what catches the silent-corruption class of
   bugs.

See Also
--------

- ``connectomics.data.processing.affinity.seg_to_affinity``: the authoritative
  docstring for the positional offset convention.
- ``connectomics.decoding.decoders.waterz.decode_waterz``: the
  ``channel_order`` / ``edge_offset`` knobs and their semantics.
- ``scripts/h5_to_precomputed_cloud.py``: the standalone converter with
  ``--src-axes`` and ``--reverse-channels`` flags.
- ``.agents/features/xyz-order/plan.md``: proposal to consolidate axis-order
  handling onto declared metadata and a single ``reorder_affinity`` helper.
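
The coupling between spatial order and channel order described on this page can
be checked with a few lines of NumPy. The sketch below is illustrative only:
``toy_affinity`` is a hypothetical stand-in written for this page, not the real
``seg_to_affinity``, and it takes offsets as tuples rather than ``"1-0-0"``
strings. It assumes only that the producer is positional, as stated above. It
shows that moving a ``zyx`` affinity volume to ``xyz`` needs both a spatial
transpose and a channel reversal, and that the transpose alone yields a
same-shaped but mislabeled result.

.. code-block:: python

   import numpy as np


   def toy_affinity(seg, offsets=((1, 0, 0), (0, 1, 0), (0, 0, 1))):
       """Hypothetical positional affinity: channel i compares each voxel with
       its neighbor one offset step back along the array's own axes."""
       aff = np.zeros((len(offsets),) + seg.shape, dtype=np.uint8)
       for c, off in enumerate(offsets):
           # shifted[i] holds seg[i - off] along each axis (with wrap-around).
           shifted = np.roll(seg, shift=off, axis=(0, 1, 2))
           same = (seg == shifted) & (seg > 0)
           # Clear the border voxels whose "neighbor" came from wrap-around.
           for axis, step in enumerate(off):
               if step > 0:
                   idx = [slice(None)] * 3
                   idx[axis] = slice(0, step)
                   same[tuple(idx)] = False
           aff[c] = same
       return aff


   # Labels change at every step along X and are constant along Y and Z.
   seg_xyz = np.broadcast_to(np.arange(1, 5)[:, None, None], (4, 5, 6)).copy()
   seg_zyx = seg_xyz.transpose(2, 1, 0)  # same data, stored (Z, Y, X)

   aff_xyz = toy_affinity(seg_xyz)  # (C, X, Y, Z), channels [X, Y, Z]
   aff_zyx = toy_affinity(seg_zyx)  # (C, Z, Y, X), channels [Z, Y, X]

   # Correct zyx -> xyz conversion: spatial transpose AND channel reversal.
   converted = aff_zyx.transpose(0, 3, 2, 1)[::-1]
   assert np.array_equal(converted, aff_xyz)

   # Spatial transpose alone has the right shape but the wrong channel labels.
   spatial_only = aff_zyx.transpose(0, 3, 2, 1)
   assert spatial_only.shape == aff_xyz.shape
   assert not np.array_equal(spatial_only, aff_xyz)

Because the labels in ``seg_xyz`` change at every step along X, the X-affinity
channel is all zeros, which makes a mislabeled channel easy to spot.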