API Reference

Main Functions

RealLabelNormalization.normalize_labels - Function
normalize_labels(labels; method=:minmax, range=(-1, 1), mode=:global, clip_quantiles=(0.01, 0.99))

Normalize labels using min-max or z-score scaling, applied globally, per column, or per row. NaN values are ignored when computing statistics and preserved in the output.

Arguments

  • labels: Vector or matrix where the last dimension is the number of samples
  • method::Symbol: Normalization method
    • :minmax: Min-max normalization (default)
    • :zscore: Z-score normalization (mean=0, std=1)
  • range::Tuple{Real,Real}: Target range for min-max normalization (default: (-1, 1))
    • (-1, 1): Scaled min-max to [-1,1] (default)
    • (0, 1): Standard min-max to [0,1]
    • Custom ranges: e.g., (-2, 2)
  • mode::Symbol: Normalization scope
    • :global: Normalize across all values (default)
    • :columnwise: Normalize each column independently
    • :rowwise: Normalize each row independently
  • clip_quantiles::Union{Nothing,Tuple{Real,Real}}: Lower and upper quantiles (each between 0 and 1) used to clip outliers before normalization
    • (0.01, 0.99): Clip to 1st-99th percentiles (default)
    • (0.05, 0.95): Clip to 5th-95th percentiles (more aggressive)
    • nothing: No clipping
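
To make the interaction between clip_quantiles and range concrete, here is a minimal sketch for a plain vector. The helper clip_and_minmax is hypothetical and is not part of the package; the handling of a constant vector is an assumption, and the real implementation may differ.

```julia
using Statistics  # standard library, provides quantile

# Hypothetical helper for illustration only; not RealLabelNormalization's code.
function clip_and_minmax(x::AbstractVector{<:Real};
                         range=(-1, 1), clip_quantiles=(0.01, 0.99))
    y = float.(x)
    if clip_quantiles !== nothing
        # Clamp values to the requested lower/upper quantiles
        lo = quantile(y, clip_quantiles[1])
        hi = quantile(y, clip_quantiles[2])
        y = clamp.(y, lo, hi)
    end
    mn, mx = extrema(y)
    a, b = range
    # Assumption: a constant vector maps to the midpoint of the target range
    mx == mn && return fill((a + b) / 2, length(y))
    # Linear map from [mn, mx] onto [a, b]
    return a .+ (y .- mn) .* (b - a) ./ (mx - mn)
end

labels = [1.0, 5.0, 3.0, 8.0, 2.0, 100.0]
scaled = clip_and_minmax(labels; range=(0, 1), clip_quantiles=(0.05, 0.95))
```

Because clipping happens first, the outlier 100.0 is pulled down to the upper quantile before the minimum and maximum are taken, so the remaining values are not squeezed toward the lower end of the range.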

NaN Handling

  • NaN values are ignored when computing statistics (min, max, mean, std, quantiles)
  • NaN values are preserved in the output (remain as NaN)
  • If all values in a column are NaN, a warning is issued and that column is returned as NaN
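
The NaN rules above can be sketched for the z-score case as follows. nan_zscore is a hypothetical helper written for illustration; it assumes at least two distinct non-NaN values and omits the warning that the package is documented to issue.

```julia
using Statistics  # standard library, provides mean and std

# Hypothetical helper for illustration only; not RealLabelNormalization's code.
function nan_zscore(x::AbstractVector{<:AbstractFloat})
    valid = filter(!isnan, x)                 # statistics use non-NaN values only
    isempty(valid) && return fill(NaN, length(x))  # all-NaN input stays NaN
    μ, σ = mean(valid), std(valid)
    # NaN entries propagate through the arithmetic and remain NaN
    return (x .- μ) ./ σ
end

z = nan_zscore([1.0, NaN, 3.0, 5.0])  # [-1.0, NaN, 0.0, 1.0]
```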

Returns

  • Normalized labels with same shape as input

Examples

using RealLabelNormalization

# Vector labels (single target)
labels = [1.0, 5.0, 3.0, 8.0, 2.0, 100.0]  # 100.0 is outlier

# Min-max to [-1,1] with outlier clipping (default)
normalized = normalize_labels(labels)

# Min-max to [0,1] 
normalized = normalize_labels(labels; range=(0, 1))

# Z-score normalization with outlier clipping
normalized = normalize_labels(labels; method=:zscore)

# Matrix labels (multi-target)
labels_matrix = [1.0 10.0; 5.0 20.0; 3.0 15.0; 8.0 25.0; 1000.0 5.0]  # Outlier in col 1

# Global normalization with clipping
normalized = normalize_labels(labels_matrix; mode=:global)

# Column-wise normalization with clipping 
normalized = normalize_labels(labels_matrix; mode=:columnwise)

# Row-wise normalization with clipping
normalized = normalize_labels(labels_matrix; mode=:rowwise)

RealLabelNormalization.compute_normalization_stats - Function
compute_normalization_stats(labels; method=:minmax, mode=:global, range=(-1, 1), clip_quantiles=(0.01, 0.99))

Compute normalization statistics from training data for later application to validation/test sets.

Arguments

  • labels: Vector or matrix where the last dimension is the number of samples
  • method::Symbol: Normalization method
    • :minmax: Min-max normalization (default)
    • :zscore: Z-score normalization (mean=0, std=1)
  • range::Tuple{Real,Real}: Target range for min-max normalization (default: (-1, 1))
    • (-1, 1): Scaled min-max to [-1,1] (default)
    • (0, 1): Standard min-max to [0,1]
    • Custom ranges: e.g., (-2, 2)
  • mode::Symbol: Normalization scope
    • :global: Normalize across all values (default)
    • :columnwise: Normalize each column independently
    • :rowwise: Normalize each row independently
  • clip_quantiles::Union{Nothing,Tuple{Real,Real}}: Lower and upper quantiles (each between 0 and 1) used to clip outliers before normalization
    • (0.01, 0.99): Clip to 1st-99th percentiles (default)
    • (0.05, 0.95): Clip to 5th-95th percentiles (more aggressive)
    • nothing: No clipping

Returns

  • Named tuple with normalization parameters that can be used with apply_normalization

Example

using RealLabelNormalization

train_labels = [1.0 10.0; 5.0 20.0; 3.0 15.0; 8.0 25.0]  # illustrative values
val_labels = [2.0 12.0; 6.0 22.0]
test_labels = [4.0 18.0; 7.0 11.0]

# Compute stats from training data with outlier clipping
train_stats = compute_normalization_stats(train_labels; method=:zscore, mode=:columnwise, clip_quantiles=(0.05, 0.95))

# Apply to validation/test data (uses same clipping bounds)
val_normalized = apply_normalization(val_labels, train_stats)
test_normalized = apply_normalization(test_labels, train_stats)

RealLabelNormalization.apply_normalization - Function
apply_normalization(labels, stats)

Apply pre-computed normalization statistics to new data (validation/test sets).

Ensures consistent normalization across train/validation/test splits using only training statistics. This includes applying the same clipping bounds if they were used during training.
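
The idea of reusing training statistics can be sketched with plain Julia. fit_minmax and apply_minmax are hypothetical stand-ins for the two package functions, restricted to global min-max without clipping; the named tuple fields min_val, max_val, and range are assumptions and need not match the fields the package stores.

```julia
# Hypothetical stand-ins for illustration; field names are assumptions.
fit_minmax(train; range=(-1, 1)) =
    (min_val = minimum(train), max_val = maximum(train), range = range)

function apply_minmax(x, stats)
    a, b = stats.range
    # Same linear map as on the training set, using training min/max only
    return a .+ (x .- stats.min_val) .* (b - a) ./ (stats.max_val - stats.min_val)
end

train = [0.0, 5.0, 10.0]
val = [2.5, 12.0]                     # 12.0 lies above the training maximum
stats = fit_minmax(train)
val_scaled = apply_minmax(val, stats) # ≈ [-0.5, 1.4]
```

In this unclipped sketch a validation value beyond the training maximum lands outside the target range. That is the expected cost of never recomputing statistics on held-out data.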

RealLabelNormalization.denormalize_labels - Function
denormalize_labels(normalized_labels, stats)

Convert normalized labels back to original scale using stored statistics.

Useful for interpreting model predictions in original units.
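
For min-max scaling, going back to original units is the inverse of the linear map. The sketch below uses hypothetical helpers and a hypothetical stats layout (min_val, max_val, range), not the package's internals, to show the round trip.

```julia
# Hypothetical helpers for illustration; the stats layout is an assumption.
function to_range(x, stats)
    a, b = stats.range
    return a .+ (x .- stats.min_val) .* (b - a) ./ (stats.max_val - stats.min_val)
end

function from_range(z, stats)
    a, b = stats.range
    # Inverse of to_range: map [a, b] back onto [min_val, max_val]
    return stats.min_val .+ (z .- a) .* (stats.max_val - stats.min_val) ./ (b - a)
end

stats = (min_val = 2.0, max_val = 12.0, range = (-1, 1))
original = [2.0, 4.5, 7.0, 12.0]
restored = from_range(to_range(original, stats), stats)
```

The round trip recovers the inputs only up to floating-point error, and only for values that were not clipped: clamping maps every value beyond a bound to that bound, so the pre-clipping value cannot be reconstructed.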


Internal Implementation Details

The package is organized into several internal modules for different aspects of label normalization:

  • Clipping: Handles outlier detection and clipping based on quantiles
  • Methods: Implements different normalization algorithms (min-max, z-score)
  • Statistics: Computes and stores normalization statistics
  • Core: Main API functions that orchestrate the normalization process

For details on the internal implementation, please refer to the source code in the package repository.