API Reference
Main Functions
RealLabelNormalization.normalize_labels - Function
normalize_labels(labels; method=:minmax, range=(-1, 1), mode=:global, clip_quantiles=(0.01, 0.99))
Normalize labels with various normalization methods and modes. Handles NaN values by ignoring them in statistical computations and preserving them in the output.
Arguments
- labels: Vector or matrix where the last dimension is the number of samples
- method::Symbol: Normalization method
  - :minmax: Min-max normalization (default)
  - :zscore: Z-score normalization (mean=0, std=1)
- range::Tuple{Real,Real}: Target range for min-max normalization (default: (-1, 1))
  - (-1, 1): Scaled min-max to [-1,1] (default)
  - (0, 1): Standard min-max to [0,1]
  - Custom ranges: e.g., (-2, 2)
- mode::Symbol: Normalization scope
  - :global: Normalize across all values (default)
  - :columnwise: Normalize each column independently
  - :rowwise: Normalize each row independently
- clip_quantiles::Union{Nothing,Tuple{Real,Real}}: Lower and upper quantiles (each between 0 and 1) used for outlier clipping before normalization
  - (0.01, 0.99): Clip to 1st-99th percentiles (default)
  - (0.05, 0.95): Clip to 5th-95th percentiles (more aggressive)
  - nothing: No clipping
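To make the interaction between clip_quantiles and range concrete, here is a minimal sketch of quantile clipping followed by min-max scaling. It uses only Julia's Statistics standard library. The helper name sketch_clip_minmax and its keyword target_range are hypothetical, and this is an illustration under those assumptions, not the package's actual implementation.

```julia
using Statistics

# Hypothetical helper (not part of RealLabelNormalization):
# clip values to quantile bounds, then rescale linearly to a target range.
function sketch_clip_minmax(x::AbstractVector{<:Real};
                            target_range=(-1.0, 1.0),
                            clip_quantiles=(0.01, 0.99))
    lo = quantile(x, clip_quantiles[1])   # lower clipping bound
    hi = quantile(x, clip_quantiles[2])   # upper clipping bound
    clipped = clamp.(x, lo, hi)           # outliers are pulled to the bounds
    mn, mx = extrema(clipped)
    a, b = target_range
    # Constant input: map everything to the midpoint of the target range.
    mx == mn && return fill((a + b) / 2, length(x))
    return a .+ (clipped .- mn) .* (b - a) ./ (mx - mn)
end

labels = [1.0, 5.0, 3.0, 8.0, 2.0, 100.0]   # 100.0 is an outlier
scaled = sketch_clip_minmax(labels)
```

In this sketch the outlier is clamped to the upper quantile before scaling, so it maps to the top of the target range instead of compressing the remaining values toward the bottom.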
NaN Handling
- NaN values are ignored when computing statistics (min, max, mean, std, quantiles)
- NaN values are preserved in the output (remain as NaN)
- If all values in a column are NaN, appropriate warnings are issued and NaN is returned
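The NaN behaviour described above can be sketched with the Statistics standard library alone. The helper sketch_nan_zscore is a hypothetical name used for illustration, and the package's own handling (including its warnings) may differ in detail.

```julia
using Statistics

# Hypothetical helper (not the package's implementation):
# z-score computed from the non-NaN entries only; NaN positions stay NaN.
function sketch_nan_zscore(x::AbstractVector{<:AbstractFloat})
    valid = filter(!isnan, x)                      # entries used for statistics
    isempty(valid) && return fill(NaN, length(x))  # all-NaN input: nothing to compute
    mu, sigma = mean(valid), std(valid)
    return (x .- mu) ./ sigma                      # NaN propagates through the arithmetic
end

z = sketch_nan_zscore([1.0, NaN, 3.0, 5.0])
```

Here the mean (3.0) and standard deviation (2.0) come from the three valid entries, and the second element of z remains NaN.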
Returns
- Normalized labels with same shape as input
Examples
# Vector labels (single target)
labels = [1.0, 5.0, 3.0, 8.0, 2.0, 100.0] # 100.0 is outlier
# Min-max to [-1,1] with outlier clipping (default)
normalized = normalize_labels(labels)
# Min-max to [0,1]
normalized = normalize_labels(labels; range=(0, 1))
# Z-score normalization with outlier clipping
normalized = normalize_labels(labels; method=:zscore)
# Matrix labels (multi-target)
labels_matrix = [1.0 10.0; 5.0 20.0; 3.0 15.0; 8.0 25.0; 1000.0 5.0] # Outlier in col 1
# Global normalization with clipping
normalized = normalize_labels(labels_matrix; mode=:global)
# Column-wise normalization with clipping
normalized = normalize_labels(labels_matrix; mode=:columnwise)
# Row-wise normalization with clipping
normalized = normalize_labels(labels_matrix; mode=:rowwise)
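The three modes used in the matrix examples differ only in which slice of the data the statistics are taken over. The following sketch shows that idea for plain min-max scaling to [0, 1], without clipping or NaN handling. The helper sketch_minmax01 is hypothetical, and the mapping of :columnwise to dims=1 and :rowwise to dims=2 is an assumption made for illustration.

```julia
# Hypothetical helper (not part of RealLabelNormalization):
# min-max scale to [0, 1] over a chosen scope.
function sketch_minmax01(X::AbstractMatrix{<:Real}; mode::Symbol=:global)
    dims = mode === :global     ? (1, 2) :   # one min/max for the whole matrix
           mode === :columnwise ? 1      :   # one min/max per column
           mode === :rowwise    ? 2      :   # one min/max per row
           throw(ArgumentError("unknown mode: $mode"))
    mn = minimum(X; dims=dims)
    mx = maximum(X; dims=dims)
    # A constant slice gives mx == mn and therefore NaN in this sketch.
    return (X .- mn) ./ (mx .- mn)
end

X = [1.0 10.0; 5.0 20.0; 3.0 15.0; 8.0 25.0]
G = sketch_minmax01(X; mode=:global)
C = sketch_minmax01(X; mode=:columnwise)
R = sketch_minmax01(X; mode=:rowwise)
```

With :columnwise each column spans the full [0, 1] interval, while with :global only the overall minimum and maximum of the matrix reach the endpoints.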
RealLabelNormalization.compute_normalization_stats - Function
compute_normalization_stats(labels; method=:minmax, mode=:global, range=(-1, 1), clip_quantiles=(0.01, 0.99))
Compute normalization statistics from training data for later application to validation/test sets.
Inputs
- labels: Vector or matrix where the last dimension is the number of samples
- method::Symbol: Normalization method
  - :minmax: Min-max normalization (default)
  - :zscore: Z-score normalization (mean=0, std=1)
- range::Tuple{Real,Real}: Target range for min-max normalization (default: (-1, 1))
  - (-1, 1): Scaled min-max to [-1,1] (default)
  - (0, 1): Standard min-max to [0,1]
  - Custom ranges: e.g., (-2, 2)
- mode::Symbol: Normalization scope
  - :global: Normalize across all values (default)
  - :columnwise: Normalize each column independently
  - :rowwise: Normalize each row independently
- clip_quantiles::Union{Nothing,Tuple{Real,Real}}: Lower and upper quantiles (each between 0 and 1) used for outlier clipping before normalization
  - (0.01, 0.99): Clip to 1st-99th percentiles (default)
  - (0.05, 0.95): Clip to 5th-95th percentiles (more aggressive)
  - nothing: No clipping
Returns
- Named tuple with normalization parameters that can be used with apply_normalization
Example
# Compute stats from training data with outlier clipping
train_stats = compute_normalization_stats(train_labels; method=:zscore, mode=:columnwise, clip_quantiles=(0.05, 0.95))
# Apply to validation/test data (uses same clipping bounds)
val_normalized = apply_normalization(val_labels, train_stats)
test_normalized = apply_normalization(test_labels, train_stats)
RealLabelNormalization.apply_normalization - Function
apply_normalization(labels, stats)
Apply pre-computed normalization statistics to new data (validation/test sets).
Ensures consistent normalization across train/validation/test splits using only training statistics. This includes applying the same clipping bounds if they were used during training.
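The fit-on-train, apply-everywhere pattern can be sketched without the package as follows. The helpers sketch_fit and sketch_apply and the named-tuple fields lo, hi, mu, and sigma are hypothetical. The layout of the statistics returned by compute_normalization_stats is not documented here and may differ.

```julia
using Statistics

# Hypothetical stand-in for the "compute stats" step: clipping bounds and
# z-score parameters are derived from the training data only.
function sketch_fit(train::AbstractVector{<:Real}; clip_quantiles=(0.05, 0.95))
    lo = quantile(train, clip_quantiles[1])
    hi = quantile(train, clip_quantiles[2])
    clipped = clamp.(train, lo, hi)
    return (lo = lo, hi = hi, mu = mean(clipped), sigma = std(clipped))
end

# Hypothetical stand-in for the "apply" step: new data is clipped and scaled
# with the stored training statistics, never with its own.
sketch_apply(x, s) = (clamp.(x, s.lo, s.hi) .- s.mu) ./ s.sigma

train = [2.0, 4.0, 6.0, 8.0]
test  = [5.0, 1000.0]            # 1000.0 lies far outside the training range
s = sketch_fit(train)
train_z = sketch_apply(train, s)
test_z  = sketch_apply(test, s)
```

Because the test value 1000.0 is clamped to the training upper bound before scaling, it cannot exceed the largest normalized value the training bounds allow.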
RealLabelNormalization.denormalize_labels - Function
denormalize_labels(normalized_labels, stats)
Convert normalized labels back to original scale using stored statistics.
Useful for interpreting model predictions in original units.
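As an illustration of what mapping back to the original scale involves, here is a sketch of a min-max transform and its inverse. The helpers sketch_normalize and sketch_denormalize and the fields mn, mx, a, and b are hypothetical and do not reflect the package's stored statistics format.

```julia
# Hypothetical helpers (not part of RealLabelNormalization):
# forward min-max map from [mn, mx] to [a, b], and its inverse.
sketch_normalize(x, s)   = s.a  .+ (x .- s.mn) .* (s.b - s.a)   ./ (s.mx - s.mn)
sketch_denormalize(y, s) = s.mn .+ (y .- s.a)  .* (s.mx - s.mn) ./ (s.b - s.a)

x = [1.0, 5.0, 3.0, 8.0]
s = (mn = minimum(x), mx = maximum(x), a = -1.0, b = 1.0)

y = sketch_normalize(x, s)         # values in [-1, 1]
x_back = sketch_denormalize(y, s)  # back in original units
```

The round trip is exact up to floating-point error only for values that were not clipped. Clamping is not invertible, so a value clipped before normalization maps back to the clipping bound rather than to its original value.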
Function Index
RealLabelNormalization._apply_training_clip_bounds
RealLabelNormalization._clip_outliers
RealLabelNormalization.apply_normalization
RealLabelNormalization.compute_normalization_stats
RealLabelNormalization.denormalize_labels
RealLabelNormalization.normalize_labels
Internal Implementation Details
The package is organized into several internal modules for different aspects of label normalization:
- Clipping: Handles outlier detection and clipping based on quantiles
- Methods: Implements different normalization algorithms (min-max, z-score)
- Statistics: Computes and stores normalization statistics
- Core: Main API functions that orchestrate the normalization process
For details on the internal implementation, please refer to the source code in the package repository.