# CLI And Runner API Use MoDaCor either from the command line (`modacor run`) or from Python via a shared execution helper (`run_pipeline_job`). Both paths use the same pipeline scheduler backend. ## Prerequisites - Install MoDaCor in an environment with Python 3.12+. - If using a source checkout, install in editable mode: ```bash pip install -e . ``` ## Command-line interface The package now installs a `modacor` command with a `run` subcommand. ### Minimal run ```bash modacor run --pipeline processing_pipelines/MOUSE_solids.yaml ``` This is useful for pipelines that already load data sources/sinks internally (for example using `AppendSource` and `AppendSink` steps). ### Run with external source registration For pipelines that expect sources to be provided externally: ```bash modacor run \ --pipeline processing_pipelines/MOUSE_solids.yaml \ --hdf-source sample=/data/MOUSE_sample.nxs \ --hdf-source background=/data/MOUSE_background.nxs \ --yaml-source defaults=/data/defaults.yaml ``` Supported source/sink registration flags: - `--hdf-source REF=PATH` (repeatable) - `--yaml-source REF=PATH` (repeatable) - `--csv-sink REF=PATH` (repeatable) ### Tracing and step control ```bash modacor run \ --pipeline processing_pipelines/MOUSE_solids.yaml \ --trace \ --trace-watch sample:signal,Q \ --trace-watch background:signal \ --trace-report-lines 50 \ --stop-after GX ``` - `--trace` enables trace capture and event attachment. - `--trace-watch` is repeatable and uses `bundle:key[,key...]`. - `--stop-after STEP_ID` stops the run after that step is executed. ### Export selected results to HDF5 Use repeatable `--write-path` selectors: ```bash modacor run \ --pipeline processing_pipelines/MOUSE_solids.yaml \ --write-hdf output/results.h5 \ --run-name run1 \ --write-path /sample/signal/signal \ --write-path /sample/Q/signal \ --write-path /background/signal/signal ``` If you want to store the entire current `ProcessingData` (all `BaseData` entries) without listing paths: ```bash modacor run \ --pipeline processing_pipelines/MOUSE_solids.yaml \ --write-hdf output/results_full.h5 \ --write-all-processing-data ``` Semantics: - `--write-hdf` sets the output file. - each `--write-path` adds one `ProcessingData` path to `data_paths`. - `--write-all-processing-data` auto-selects all `BaseData` entries from `ProcessingData`. - `--run-name` maps to the HDF sink run subpath (default: `default`). - the HDF output stores reproducibility metadata under `processing/pipeline//`: `spec` (JSON) and `yaml` (pipeline YAML). - trace output is stored under `processing/tracer//` as raw `events` JSON and indexed `steps/` + `index/`. ## Shared Python runner API Use this when driving MoDaCor in notebooks or scripts while keeping behavior consistent with the CLI: ```python from pathlib import Path from modacor.io.hdf.hdf_source import HDFSource from modacor.io.io_sources import IoSources from modacor.runner import run_pipeline_job sources = IoSources() sources.register_source( HDFSource(source_reference="sample", resource_location=Path("/data/sample.nxs")) ) result = run_pipeline_job( Path("processing_pipelines/MOUSE_solids.yaml"), sources=sources, trace=True, trace_watch={"sample": ["signal"]}, ) print(result.executed_steps) print(result.processing_data.keys()) ``` `run_pipeline_job(...)` returns a `RunResult` container with: - `processing_data` - `pipeline` - `tracer` (or `None`) - `step_durations` - `executed_steps` - `stopped_after_step`