Getting Started with `bam-masterdata`

This tutorial will guide you through your first interaction with the bam-masterdata package, helping you understand its core concepts and basic functionality.

What is `bam-masterdata`?

`bam-masterdata` is a Python package designed to help administrators and users manage Masterdata/schema definitions. It provides a set of Python classes and utilities for working with the different types of entities in the openBIS Research Data Management (RDM) system. It also contains the Masterdata definitions used at the Bundesanstalt für Materialforschung und -prüfung (BAM) in the context of Materials Science and Engineering research.

[Image placeholder: Architecture overview diagram showing the relationship between BAM Masterdata, openBIS, and the BAM Data Store. The diagram should illustrate data flow and the role of masterdata schemas in the system.]

`bam-masterdata` provides tools to:

Export the Masterdata from your openBIS instance.
Update the Masterdata in your openBIS instance.
Export/Import from different formats: Excel, Python, RDF/XML, JSON.
Check consistency of the Masterdata with respect to a ground truth.
Automatically parse metainformation in your openBIS instance.

Prerequisites

Basic Python and openBIS knowledge.
A system with Python 3.10 or higher.
Knowledge of virtual environments, CLI usage, IDEs such as VSCode, and GitHub.
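The Python version requirement listed above can be checked from the interpreter itself. The following is a minimal sketch, not part of `bam-masterdata`; the helper name `python_is_supported` is a hypothetical name chosen for illustration.

```python
import sys

# Minimum version taken from the prerequisites listed above.
MIN_PYTHON = (3, 10)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True if the (major, minor) version meets the minimum requirement."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if python_is_supported() else "too old"
    print(f"Python {current}: {status}")
```

Running the script with the interpreter you intend to use for the tutorial tells you whether it meets the requirement before you create the environment.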

Warning

Note that all steps in this documentation were carried out on Ubuntu 22.04. The terminal commands need to be adapted if you work on Windows.

Installation and Setup

Create an empty test directory

We will test the basic functionalities of bam-masterdata in an empty directory. Open your terminal and type:

mkdir test_bm
cd test_bm/
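If the shell commands above do not work on your platform, the directory can also be created from Python. This is a small sketch using only the standard library; the function name `create_test_dir` is hypothetical and not part of `bam-masterdata`.

```python
from pathlib import Path


def create_test_dir(parent: Path, name: str = "test_bm") -> Path:
    """Create the tutorial directory under `parent` and return its path."""
    target = parent / name
    # exist_ok=True makes the call safe to repeat if the directory already exists.
    target.mkdir(parents=True, exist_ok=True)
    return target


if __name__ == "__main__":
    print(create_test_dir(Path.cwd()))
```

Note that this only creates the directory; you still need to change into it in your terminal before continuing.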

Create a Virtual Environment

We strongly recommend using a virtual environment to avoid conflicts with other packages.

Using venv:

python3 -m venv .venv
source .venv/bin/activate

Using conda:

conda create --name .venv python=3.10  # or any version 3.10 <= python <= 3.12
conda activate .venv
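To confirm that the environment is actually active before installing anything, you can ask the interpreter. The sketch below relies on two standard behaviours: inside a `venv`, `sys.prefix` differs from `sys.base_prefix`, and conda sets the `CONDA_DEFAULT_ENV` variable on activation. The function name and its parameters are hypothetical, chosen for illustration.

```python
import os
import sys


def in_virtual_environment(prefix=sys.prefix, base_prefix=sys.base_prefix, environ=os.environ) -> bool:
    """Return True inside a venv or an activated, non-base conda environment."""
    # venv: sys.prefix points at the environment, sys.base_prefix at the base install.
    in_venv = prefix != base_prefix
    # conda: CONDA_DEFAULT_ENV holds the active environment name ("base" by default).
    conda_env = environ.get("CONDA_DEFAULT_ENV", "")
    in_conda = conda_env not in ("", "base")
    return in_venv or in_conda


if __name__ == "__main__":
    print("Isolated environment active:", in_virtual_environment())
```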

Install the Package

`bam-masterdata` is available on PyPI and can be installed via pip:

pip install --upgrade pip
pip install bam-masterdata

Faster Installation

For faster installation, you can use uv:

pip install uv
uv pip install bam-masterdata

Verify Installation

To verify that the installation was successful, create a Python script containing:

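from importlib.metadata import version

# Single quotes inside the f-string keep this valid on Python 3.10 and 3.11,
# where reusing the outer quote character is a syntax error.
print(f"BAM Masterdata version: {version('bam_masterdata')}")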

Then run it in your terminal:

python <path-to-Python-script>

This should return the version of the installed package.
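If the package is missing from the active environment, `importlib.metadata.version` raises `PackageNotFoundError`. A slightly more defensive variant of the check is sketched here; the helper name `installed_version` is hypothetical and not part of `bam-masterdata`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("bam-masterdata")
    if found is None:
        print("bam-masterdata is not installed in this environment")
    else:
        print(f"BAM Masterdata version: {found}")
```

A `None` result usually means the script ran with a different interpreter than the one in your virtual environment.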

Your First `bam-masterdata` Experience

Understanding Entity Types

The BAM Masterdata system organizes information into different entity types:

Object Types: Physical or conceptual objects (samples, instruments, people)
Collection Types: Groups of related objects
Dataset Types: Data files and their metadata
Vocabulary Types: Controlled vocabularies for standardized values

[Image placeholder: Entity relationship diagram showing the four main entity types and their relationships. Should include sample instances of each type.]

Deprecating Collection Types and Dataset Types

As of September 2025, the development of new Collection and Dataset types is on hold. We will use only the abstract concepts: a Collection Type is a class to which objects and their relationships are added, and a Dataset Type is a class to which raw data files are attached.

Overview of the Object Types

The central ingredients for defining the data models associated with a research activity are the Object Types. These are classes that inherit from an abstract class called ObjectType and have two types of attributes:

defs: The definitions of the Object Type. These attributes do not change when the object is filled with data.
properties: The list of properties assigned to an object. These attributes are filled in when data is assigned to the object.
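The split between fixed definitions and per-object values can be illustrated with plain Python. The sketch below is a conceptual analogy only: `SketchDefs`, `SketchObjectType`, `SketchInstrument`, and the `assign` method are hypothetical names invented here and are not the classes or API of `bam-masterdata`.

```python
from dataclasses import dataclass
from typing import Any, ClassVar


@dataclass(frozen=True)
class SketchDefs:
    """Fixed definitions shared by every instance of a type (hypothetical)."""
    code: str
    description: str


class SketchObjectType:
    """Hypothetical stand-in for an object type; not the package's ObjectType."""
    defs: ClassVar[SketchDefs]
    property_names: ClassVar[tuple[str, ...]] = ()

    def __init__(self) -> None:
        # Property values belong to the instance and start out empty.
        self.values: dict[str, Any] = {}

    def assign(self, name: str, value: Any) -> None:
        """Fill one of the declared properties with data."""
        if name not in self.property_names:
            raise KeyError(f"{name!r} is not a property of {self.defs.code}")
        self.values[name] = value


class SketchInstrument(SketchObjectType):
    defs = SketchDefs(code="SKETCH_INSTRUMENT", description="An illustrative instrument")
    property_names = ("name", "serial_number")


first = SketchInstrument()
first.assign("name", "Microscope A")
second = SketchInstrument()
second.assign("name", "Microscope B")
```

Here `first` and `second` hold different property values, while both share the same immutable `defs` object.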

All accessible object types are defined as Python classes in

| DataType | Python type | Example assignment |
| --- | --- | --- |
| `BOOLEAN` | `bool` | `myobj.flag = True` |
| `CONTROLLEDVOCABULARY` | `str` (enum term code) | `myobj.status = "ACTIVE"` (must match allowed vocabulary term) |
| `DATE` | `datetime.date` | `myobj.start_date = datetime.date(2025, 9, 29)` |
| `HYPERLINK` | `str` | `myobj.url = "https://example.com"` |
| `INTEGER` | `int` | `myobj.count = 42` |
| `MULTILINE_VARCHAR` | `str` | `myobj.notes = "Line 1\nLine 2\nLine 3"` |
| `OBJECT` | (openBIS object reference) | `myobj.parent = another_object_instance` (depends on schema) |
| `REAL` | `float` | `myobj.temperature = 21.7` |
| `TIMESTAMP` | `datetime.datetime` | `myobj.created_at = datetime.datetime.now()` |
| `VARCHAR` | `str` | `myobj.name = "Test sample"` |
| `XML` | `str` (XML string) | `myobj.config = "<root><tag>value</tag></root>"` |
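The mapping in the table can be written down as a lookup and used to pre-check a value before assigning it. This is a minimal sketch with deliberately strict rules chosen for illustration; the names `PYTHON_TYPES` and `matches_data_type` are hypothetical, and the validation that `bam-masterdata` itself performs may be more lenient or differ in detail.

```python
import datetime

# Mapping transcribed from the table above. OBJECT is omitted because it refers
# to another object instance rather than to a plain Python type.
PYTHON_TYPES = {
    "BOOLEAN": bool,
    "CONTROLLEDVOCABULARY": str,
    "DATE": datetime.date,
    "HYPERLINK": str,
    "INTEGER": int,
    "MULTILINE_VARCHAR": str,
    "REAL": float,
    "TIMESTAMP": datetime.datetime,
    "VARCHAR": str,
    "XML": str,
}


def matches_data_type(value, data_type: str) -> bool:
    """Return True if `value` has the Python type listed for `data_type`."""
    expected = PYTHON_TYPES[data_type]
    # bool is a subclass of int, so True/False only count as BOOLEAN.
    if isinstance(value, bool):
        return expected is bool
    # datetime.datetime is a subclass of datetime.date, so reject it for DATE.
    if expected is datetime.date and isinstance(value, datetime.datetime):
        return False
    return isinstance(value, expected)
```

The two special cases exist because a plain `isinstance` check would otherwise accept `True` as an `INTEGER` and a full timestamp as a `DATE`.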
