Quickstart#
Run a three-step MoDaCor pipeline against the bundled MOUSE example dataset, see how data sources plug into a configuration file, and inspect the pipeline trace that records what changed at every step.
Prerequisites#
- Python 3.11 or newer
- pip, curl (or wget) and a POSIX-like shell
- Approximately 1.3 GB of free disk space for the sample NeXus file
If you are working from the cloned MoDaCor repository, activate the project virtual environment instead of creating a
new one and use pip install -e . to install the package in editable mode.
Step 1 – Prepare a working folder#
Create a clean folder, bootstrap a virtual environment, and install MoDaCor:
mkdir modacor-quickstart
cd modacor-quickstart
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install modacor
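If you want to confirm the installation before continuing, an optional check is to import the package and the classes used later in this guide (a minimal sketch; save it as, say, check_install.py and run it with python):

# optional sanity check that MoDaCor and its I/O layer import cleanly
from modacor import ureg
from modacor.io.io_sources import IoSources
from modacor.runner.pipeline import Pipeline

print("MoDaCor imports OK; unit registry type:", type(ureg).__name__)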
Step 2 – Download example data and metadata#
Grab the MOUSE sample dataset and create a small metadata file describing the detector dark current:
curl -LO https://github.com/BAMresearch/modacor/raw/main/tests/testdata/MOUSE_20250324_1_160_stacked.nxs
cat <<'YAML' > mouse_metadata.yaml
---
detector:
  darkcurrent:
    value: 1.0e-5
    units: counts/second
    uncertainty: 1.0e-6
YAML
The NeXus file exposes counts in entry1/instrument/detector00/data and the exposure time in
entry1/instrument/detector00/frame_exposure_time. The metadata file supplies a scalar dark-current estimate so the
last pipeline step can remove it.
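If you want to verify those dataset paths before wiring up the pipeline, you can peek at the NeXus file directly with h5py (an assumption here: h5py may already be available as a MoDaCor dependency, otherwise install it with pip first):

# optional: inspect the datasets the quickstart pipeline reads, using h5py directly
import h5py

with h5py.File("MOUSE_20250324_1_160_stacked.nxs", "r") as nxs:
    data = nxs["entry1/instrument/detector00/data"]
    exposure = nxs["entry1/instrument/detector00/frame_exposure_time"]
    print("counts:", data.shape, data.dtype, "units:", data.attrs.get("units", "counts"))
    print("exposure time:", exposure[()], "units:", exposure.attrs.get("units", "s"))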
Step 3 – Create the pipeline configuration#
Save the following pipeline definition as mouse_quickstart.yaml:
name: mouse_quickstart
steps:
  1:
    name: add_poisson_uncertainties
    module: PoissonUncertainties
    requires_steps: []
    configuration:
      with_processing_keys:
        - sample
  2:
    name: normalize_by_exposure
    module: Divide
    requires_steps: [1]
    configuration:
      with_processing_keys:
        - sample
      divisor_source: sample::entry1/instrument/detector00/frame_exposure_time
      divisor_units_source: sample::entry1/instrument/detector00/frame_exposure_time@units
  3:
    name: subtract_darkcurrent
    module: Subtract
    requires_steps: [2]
    configuration:
      with_processing_keys:
        - sample
      subtrahend_source: metadata::detector/darkcurrent/value
      subtrahend_units_source: metadata::detector/darkcurrent/units
      subtrahend_uncertainties_sources:
        propagate_to_all: metadata::detector/darkcurrent/uncertainty
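Each prefix::path address above is resolved through a registered I/O source: the prefix before :: matches a source_reference registered in the runner script (Step 4), the remainder is a path inside that source, and the @units suffix used here points the divide step at that dataset's units attribute. As a minimal sketch of how the same addresses can be resolved by hand, reusing the registrations from Step 4:

# sketch: resolve the prefix::path addresses used in the configuration
from pathlib import Path

from modacor.io.hdf.hdf_source import HDFSource
from modacor.io.io_sources import IoSources
from modacor.io.yaml.yaml_source import YAMLSource

sources = IoSources()
sources.register_source(
    YAMLSource(source_reference="metadata", resource_location=Path("mouse_metadata.yaml"))
)
sources.register_source(
    HDFSource(source_reference="sample", resource_location=Path("MOUSE_20250324_1_160_stacked.nxs"))
)

# "metadata::..." is served by the YAMLSource, "sample::..." by the HDFSource
print(sources.get_data("metadata::detector/darkcurrent/value"))
print(sources.get_data("sample::entry1/instrument/detector00/frame_exposure_time"))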
Step 4 – Create a runner script#
Place the script below in run_mouse_pipeline.py. It registers the data sources, prepares a ProcessingData object,
runs the pipeline, and prints both numeric results and a compact pipeline trace.
from __future__ import annotations

from pathlib import Path
from time import perf_counter

from modacor import ureg
from modacor.dataclasses.basedata import BaseData
from modacor.dataclasses.databundle import DataBundle
from modacor.dataclasses.processing_data import ProcessingData
from modacor.debug.pipeline_tracer import PipelineTracer, PlainUnicodeRenderer
from modacor.io.hdf.hdf_source import HDFSource
from modacor.io.io_sources import IoSources
from modacor.io.yaml.yaml_source import YAMLSource
from modacor.runner.pipeline import Pipeline


def _decode_unit(unit_value) -> str:
    if isinstance(unit_value, bytes):
        return unit_value.decode()
    return str(unit_value)


def build_processing_data(sources: IoSources) -> ProcessingData:
    processing = ProcessingData()
    processing["sample"] = DataBundle()

    signal = sources.get_data("sample::entry1/instrument/detector00/data")
    signal_unit = _decode_unit(
        sources.get_data_attributes("sample::entry1/instrument/detector00/data").get("units", "counts")
    )
    processing["sample"]["signal"] = BaseData(
        signal=signal,
        units=ureg.Unit(signal_unit),
        rank_of_data=2,  # last two dimensions carry detector pixels
    )
    return processing


def main() -> None:
    pipeline = Pipeline.from_yaml_file(Path("mouse_quickstart.yaml"))

    sources = IoSources()
    sources.register_source(
        YAMLSource(source_reference="metadata", resource_location=Path("mouse_metadata.yaml"))
    )
    sources.register_source(
        HDFSource(source_reference="sample", resource_location=Path("MOUSE_20250324_1_160_stacked.nxs"))
    )

    processing_data = build_processing_data(sources)
    tracer = PipelineTracer(watch={"sample": ["signal"]})

    pipeline.prepare()
    while pipeline.is_active():
        for node in pipeline.get_ready():
            node.processing_data = processing_data
            node.io_sources = sources
            start = perf_counter()
            node.execute(processing_data)
            tracer.after_step(node, processing_data, duration_s=perf_counter() - start)
            pipeline.done(node)

    sample_signal = processing_data["sample"]["signal"]
    mean_intensity = float(sample_signal.signal.mean())
    print(f"Mean intensity after corrections: {mean_intensity:.6g} {sample_signal.units}")

    print("\nPipeline trace (last few events):\n")
    print(tracer.last_report(renderer=PlainUnicodeRenderer()))

    print("\nMermaid flowchart definition:\n")
    print(pipeline.to_mermaid())


if __name__ == "__main__":
    main()
Step 5 – Run the pipeline#
Execute the script:
python run_mouse_pipeline.py
You should see the corrected mean intensity, a compact trace summarising what changed in each step (unit conversions, shape, NaN counts, etc.), and a Mermaid flowchart definition that can be pasted into https://mermaid.live for a visual graph.
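If you would rather keep the flowchart definition than copy it from the terminal, one option is to extend main() to also write it to a file (the file name below is only an illustration):

    # optional addition at the end of main(): save the Mermaid definition for later viewing
    Path("mouse_quickstart.mmd").write_text(pipeline.to_mermaid())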
Step 6 – Where to go next#
- Swap out mouse_metadata.yaml for the metadata produced by your instrument and adjust with_processing_keys for additional DataBundle entries (for example background or calibration); a sketch follows this list.
- Add pipeline.attach_tracer_event(node, tracer, include_rendered_trace=True) inside the execution loop if you want to export the trace alongside the configuration.
- Explore the Pipeline operations and Extending MoDaCor sections for branching workflows, module development, and integration best practices.
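As a rough sketch of the first point (the background file name and the "background" source reference are purely illustrative), the runner script could register a second source and add a second bundle, with background also listed under with_processing_keys in the configuration:

# sketch: add a hypothetical background bundle alongside the sample bundle
sources.register_source(
    HDFSource(source_reference="background", resource_location=Path("MOUSE_background.nxs"))
)

processing["background"] = DataBundle()
processing["background"]["signal"] = BaseData(
    signal=sources.get_data("background::entry1/instrument/detector00/data"),
    units=ureg.Unit("counts"),
    rank_of_data=2,  # last two dimensions carry detector pixels
)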