OASISS, Micro-Parameters, and the Real Work of Safer Autonomy

Operational Design Domains (ODDs) describe the big picture of how an automated system is intended to operate, but real models fail on the small stuff. Micro-parameters, such as lighting angle, clothing reflectivity, lane-marking wear, or windshield glare, often determine whether perception and planning remain stable or drift. OASISS (ODD-based AI Safety in Autonomous Systems) provides a framework for this reality by evaluating whether training and testing datasets are sufficiently comprehensive for the Target Operational Domain (TOD). It focuses on completeness and representativeness, and asks developers to justify any gaps through an acceptability argument.

Credit where it’s due

OASISS was introduced by Jerein Jeyachandran, Siddartha Khastgir, Xingyu Zhao (WMG, University of Warwick), Eric Barbier (Wayve), and Paul Jennings (WMG). Their work proposes a model-agnostic way to judge whether your training and test scenarios suit the TOD you claim to support.

Why OASISS matters now

Standards alignment
OASISS explicitly maps to dataset-related properties in ISO/PAS 8800 and utilizes ISO 34503 concepts, the standards used by regulators and safety assessors. Refer to the ISO/PAS 8800 and ISO 34503 overviews for context.
TOD vs ODD clarity
It distinguishes between where the system was designed to work (ODD) and where it will actually be deployed (TOD). The gap between the two matters.
Quantitative checks, not assumptions
Completeness checks that required elements are present, including out-of-ODD conditions that the system should safely reject. Representativeness compares the frequency of conditions to the TOD, using standard binning and divergence metrics for continuous and categorical variables.
Micro-parameters are treated as core elements
Micro-parameters are not typically part of the ODD, yet they influence perception and behavior. OASISS treats them as first-class items to specify in the scenario or in a parametrization file, and then score.

How this connects to MATT3R’s pipeline

In our scenario-reconstruction post, we outlined how real-world journeys are turned into simulation-ready assets that power two products: Capsul3 FOUNDATION for training data and Capsul3 SCENARIO for validation. The post walks through the global occupancy grid, OpenDRIVE/OpenSCENARIO export, and quality checks.

Capsul3 FOUNDATION delivers synchronized multi-modal clips and tags that highlight unusual or safety-critical moments for training.
Capsul3 SCENARIO reconstructs the same moments into standards-compliant scenarios for simulation, verification, and validation (V&V).

OASISS provides us with a language to demonstrate how sufficient those assets are for a stated TOD. It also aligns with behavior taxonomies emerging from BSI Flex 1891, connecting scenario behaviors and ODD.

Deep dive: “AI training” vs “post-training” pipelines and where gaps are

The following outlines how OASISS can be applied in both phases.

A) AI training pipeline

Data discovery and pre-filtering

From FOUNDATION: select clips by ODD tags and micro-parameters. Examples: twilight with high backlight, reflective clothing at crossings, and worn lane markings after rain.
OASISS hooks:
- Completeness: verify that each element and its value set exists for in-TOD and out-of-ODD coverage. Binary scoring at scenario, value, and micro-parameter levels makes missing categories obvious.
- Representativeness: before training, compare dataset distributions against TOD statistics to ensure the sampler does not over-index on easy weather conditions and times of day.
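
The completeness hook above can be sketched as a binary presence check. This is a minimal illustration, not the OASISS scoring procedure itself: the `REQUIRED` spec, the tag names, and the aggregate `covered` fraction are all assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical requirement spec: element -> values that must appear at least once.
REQUIRED: Dict[str, Set[str]] = {
    "time_of_day": {"day", "twilight", "night"},
    "rainfall": {"none", "light", "heavy"},
    "clothing_reflectivity": {"low", "high"},  # a micro-parameter
}


def completeness(
    clips: List[Dict[str, str]], required: Dict[str, Set[str]]
) -> Tuple[Dict[Tuple[str, str], int], float]:
    """Return a binary score per (element, value) and the fraction covered."""
    scores: Dict[Tuple[str, str], int] = {}
    for element, values in required.items():
        # Collect every value of this element that appears in any clip.
        seen = {clip[element] for clip in clips if element in clip}
        for value in sorted(values):
            scores[(element, value)] = 1 if value in seen else 0
    # Assumption: a simple mean of the binary scores as a summary figure.
    covered = sum(scores.values()) / len(scores)
    return scores, covered


# Hypothetical clip tags, e.g. as exported from a tagged dataset.
clips = [
    {"time_of_day": "day", "rainfall": "none", "clothing_reflectivity": "low"},
    {"time_of_day": "twilight", "rainfall": "light", "clothing_reflectivity": "high"},
    {"time_of_day": "day", "rainfall": "heavy"},
]

scores, covered = completeness(clips, REQUIRED)
missing = [key for key, hit in scores.items() if hit == 0]
print("missing:", missing)
print("covered fraction:", covered)
```

In this toy set, the only required value with no clip is night-time, which is exactly the kind of missing category a binary table makes visible.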

Micro-parameter modeling

Treat micro-parameters as latent axes in the sampler. Introduce a rule-based parametrization file that instantiates them for training batches, exactly as OASISS suggests.
Sensitivity analysis informs weights for micro-axes. For example, clothing reflectivity may rank higher than curb paint saturation for nighttime pedestrian detection.
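
As a rough sketch of what a rule-based parametrization might look like, the snippet below keeps the rules in a plain dictionary (standing in for a file) and instantiates micro-parameters for a scenario. The structure, the parameter names, and every weight are hypothetical placeholders; in practice the weights would come from TOD statistics and sensitivity analysis.

```python
import random
from typing import Dict

# Hypothetical parametrization content; this could equally be loaded from JSON or YAML.
PARAMETRIZATION = {
    "clothing_reflectivity": {
        "default": {"low": 0.5, "high": 0.5},
        # Each rule: (condition on scenario tags, value weights used when it matches).
        "rules": [({"time_of_day": "night"}, {"low": 0.7, "high": 0.3})],
    },
    "lane_marking_wear": {
        "default": {"fresh": 0.6, "worn": 0.4},
        "rules": [({"rainfall": "heavy"}, {"fresh": 0.3, "worn": 0.7})],
    },
}


def instantiate(scenario: Dict[str, str], params: dict, rng: random.Random) -> Dict[str, str]:
    """Return a copy of the scenario with one sampled value per micro-parameter."""
    out = dict(scenario)  # leave the caller's scenario untouched
    for name, spec in params.items():
        weights = spec["default"]
        for condition, override in spec["rules"]:
            # The first rule whose condition matches all listed tags wins.
            if all(scenario.get(k) == v for k, v in condition.items()):
                weights = override
                break
        values = list(weights)
        out[name] = rng.choices(values, weights=[weights[v] for v in values], k=1)[0]
    return out


rng = random.Random(0)  # fixed seed so a training batch can be reproduced
print(instantiate({"time_of_day": "night", "rainfall": "heavy"}, PARAMETRIZATION, rng))
```

Keeping the rules as data rather than code means the same file can be reviewed, versioned, and scored alongside the scenarios it parametrizes.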

Synthetic data integration

Our GAIA-2 post explains how synthetic data can target rare conditions with multi-view temporal consistency and action-conditioning. Use synthetic data to fill completeness gaps and to rebalance representativeness where TOD data is scarce.
Wayve’s GAIA-2 work details controllable generation across weather, lighting, road semantics, and agent behavior. These are the levers needed to populate micro-parameter bins without waiting for long-tail collection.

Training and curation loop

Keep a live completeness table for each training shard. If a shard lacks “urban roundabout, light rain, twilight, medium occlusion,” generate or mine FOUNDATION for it before the epoch starts.
Track distribution drift between sampler outputs and TOD. Use Jensen-Shannon divergence to keep the gap bounded.
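
A minimal sketch of the drift check, using only the standard library. The category names, the example frequencies, and the `MAX_JSD` threshold are assumptions for illustration; an acceptable bound would need to be justified for the TOD in question.

```python
import math
from typing import Dict


def js_divergence(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Jensen-Shannon divergence with base-2 logs, so the result lies in [0, 1]."""
    keys = set(p) | set(q)

    def normalize(d: Dict[str, float]) -> Dict[str, float]:
        total = sum(d.values())
        return {k: d.get(k, 0.0) / total for k in keys}

    p_n, q_n = normalize(p), normalize(q)
    # Mixture distribution: the midpoint of the two inputs.
    m = {k: 0.5 * (p_n[k] + q_n[k]) for k in keys}

    def kl(a: Dict[str, float], b: Dict[str, float]) -> float:
        # Terms with a[k] == 0 contribute nothing; b[k] > 0 whenever a[k] > 0.
        return sum(a[k] * math.log2(a[k] / b[k]) for k in keys if a[k] > 0)

    return 0.5 * kl(p_n, m) + 0.5 * kl(q_n, m)


# Hypothetical TOD frequencies and raw sampler counts for one axis.
tod = {"day": 0.5, "twilight": 0.2, "night": 0.3}
sampler_counts = {"day": 800, "twilight": 100, "night": 100}

MAX_JSD = 0.05  # assumed threshold, not a value taken from OASISS
jsd = js_divergence(tod, sampler_counts)
print(f"JSD = {jsd:.4f}", "-> rebalance" if jsd > MAX_JSD else "-> ok")
```

Because the function normalizes its inputs, raw counts and probabilities can be mixed freely. Jensen-Shannon is symmetric and stays finite when a category is missing from one side, which makes it convenient for a running check.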

Documentation for the safety case

For the safety file, show ISO/PAS 8800 property coverage and the rationale for any exceptions.

B) Post-training pipeline (validation, monitoring, and continuous improvement)

Scenario-level validation with SCENARIO

Build test suites that follow TOD distributions rather than a flat list. That is the heart of OASISS representativeness. Use TOD sources, such as weather records, HD maps, incidents, and real-world trials, to set frequencies.
For continuous variables, adopt OASISS binning aligned to standard categories, then compute divergence to the TOD.
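
The binning step might look like the sketch below for rainfall intensity. The bin edges and labels are hypothetical placeholders rather than values taken from any standard, and total variation distance is used here only as a simple stand-in for whichever divergence metric is adopted.

```python
from bisect import bisect_right
from collections import Counter
from typing import Dict, List

# Hypothetical bin edges in mm/h; substitute the boundaries your taxonomy defines.
EDGES = [0.1, 2.5, 7.6]
LABELS = ["none", "light", "moderate", "heavy"]


def bin_label(intensity_mm_per_h: float) -> str:
    """Map a continuous measurement to its category (lower edge inclusive)."""
    return LABELS[bisect_right(EDGES, intensity_mm_per_h)]


def binned_frequencies(samples: List[float]) -> Dict[str, float]:
    """Relative frequency of each category, including categories with no samples."""
    counts = Counter(bin_label(x) for x in samples)
    return {label: counts[label] / len(samples) for label in LABELS}


def total_variation(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Half the sum of absolute per-bin differences; 0 means identical."""
    return 0.5 * sum(abs(p[k] - q[k]) for k in LABELS)


# Hypothetical rainfall readings from a test suite, and assumed TOD frequencies.
suite_samples = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.2, 3.0, 9.0]
tod_freq = {"none": 0.5, "light": 0.25, "moderate": 0.15, "heavy": 0.1}

suite_freq = binned_frequencies(suite_samples)
print(suite_freq)
print("distance to TOD:", total_variation(suite_freq, tod_freq))
```

Binning the suite and the TOD source with the same edges is what makes the two distributions comparable at all.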

Joint distributions and interactions

Failures often occur when conditions interact. Validate on joint distributions like “night + moderate rain + heavy oncoming glare + occluded crosswalk.” OASISS highlights joint distribution checks and the role of expert drivers in identifying influential combinations.
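
To illustrate why single-axis coverage can hide joint gaps, here is a small sketch that compares marginal and joint coverage over three axes. The axes, their values, and the example suite are assumptions for the illustration. A real check would also weight combinations by TOD frequency and by expert judgment of which ones matter.

```python
from collections import Counter
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical axes and values; a real suite would have more of both.
AXES: Dict[str, List[str]] = {
    "time_of_day": ["day", "night"],
    "rainfall": ["none", "moderate"],
    "glare": ["low", "heavy"],
}


def marginal_gaps(scenarios: List[Dict[str, str]], axes: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """(axis, value) pairs that no scenario exercises."""
    return [
        (name, value)
        for name, values in axes.items()
        for value in values
        if not any(s[name] == value for s in scenarios)
    ]


def joint_gaps(scenarios: List[Dict[str, str]], axes: Dict[str, List[str]]) -> List[Tuple[str, ...]]:
    """Full value combinations that no scenario exercises."""
    names = list(axes)
    seen = Counter(tuple(s[n] for n in names) for s in scenarios)
    return [combo for combo in product(*(axes[n] for n in names)) if seen[combo] == 0]


# A suite that varies one axis at a time from a benign baseline.
suite = [
    {"time_of_day": "day", "rainfall": "none", "glare": "low"},
    {"time_of_day": "night", "rainfall": "none", "glare": "low"},
    {"time_of_day": "day", "rainfall": "moderate", "glare": "low"},
    {"time_of_day": "day", "rainfall": "none", "glare": "heavy"},
]

print("marginal gaps:", marginal_gaps(suite, AXES))
print("joint gaps:", joint_gaps(suite, AXES))
```

The suite covers every value on every axis, yet never combines night, rain, and heavy glare. The number of combinations grows multiplicatively with each axis, so in practice the joint check is usually restricted to combinations that TOD data or expert review flags as influential.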

Acceptability arguments

Some micro-axes will not move your model. Log the analysis and declare acceptable gaps. Reviewers need the justification, not just a score.

Synthetic stress and action-conditioning checks

Use GAIA-2 style action-conditioned scenes to probe behavior under controlled hazards, then measure deltas in downstream KPIs. This tightens the link from synthetic to real operational risk.

Field monitoring and re-evaluation

As deployed data shifts, regenerate TOD stats and re-run OASISS scores. The framework is designed to adapt when new recorded data arrives.

Where the gaps usually are

ODD says “urban,” TOD lives in “commuter dusk.” Training often oversamples daylight and fair weather. Representativeness flags this quickly.
Missing micro-parameters. Scenarios specify “traffic light” but not the signal color at approach, or specify “pedestrian” but not the pedestrian’s clothing reflectivity or degree of occlusion. OASISS micro-parameter scoring pushes these into scope.
No joint conditions. Validation passes single-axis tests but fails on “night + rain + glare + occlusion.” Joint distribution checks are required.
Synthetic without anchors. Synthetic fills coverage but needs FOUNDATION anchors and SCENARIO replays for calibration and task KPIs. Refer to the GAIA-2 blog for evaluation metrics and common pitfalls.

What this means for Capsul3 FOUNDATION and SCENARIO

FOUNDATION: Provides OASISS-aligned, tagged multimodal clips with explicit micro-parameter coverage and TOD divergence reports, so completeness is auditable and sampling stays balanced.
SCENARIO: Delivers standards-ready OpenDRIVE/OpenSCENARIO suites with completeness and representativeness summaries, TOD-aligned frequencies, and joint-condition coverage for V&V.

Closing thought

OASISS gives the industry a defensible way to assert that “this dataset is sufficient for this domain.” It connects cleanly to ISO/PAS 8800 and ISO 34503, and it turns micro-parameters into items that can be specified, scored, and argued in a safety case.

Capsul3 FOUNDATION and Capsul3 SCENARIO operationalize OASISS: tagged data you can measure and scenario suites you can run, producing evidence you can share with partners, regulators, and customers.

References and credits
