Reliability Analysis and Metrics
GAT provides comprehensive reliability analysis tools for evaluating power system performance under uncertainty, including Monte Carlo simulation, LOLE/EUE metrics, multi-area coordination, and integration with ADMS operations.
Core Concepts
Loss of Load Expectation (LOLE)
Hours per year during which the system load cannot be fully served due to generation or transmission inadequacy. Calculated via Monte Carlo sampling of random outage scenarios with load variations.
Expected Unserved Energy (EUE)
MWh per year of customer demand that cannot be met. Accounts for both the duration and severity of shortfalls.
Deliverability Score
Composite 0-100 metric combining LOLE, voltage violations, and thermal overloads with configurable weights. Provides a single reliability indicator for system health assessment.
Score Ranges:
- 90-100: Excellent (< 0.5 hrs LOLE/year)
- 75-90: Good (0.5-2 hrs LOLE/year)
- 60-75: Fair (2-5 hrs LOLE/year)
- 40-60: Poor (5-10 hrs LOLE/year)
- < 40: Critical (> 10 hrs LOLE/year)
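The bands above share their endpoints (for example, 90 appears in both "Good" and "Excellent"). A minimal sketch of the mapping, assuming a boundary value belongs to the higher band; the function name `score_status` is hypothetical and not part of GAT:

```python
def score_status(score: float) -> str:
    """Map a 0-100 deliverability score to its status band.

    Assumption: a score exactly on a boundary (90, 75, 60, 40)
    falls into the higher band.
    """
    if not 0.0 <= score <= 100.0:
        raise ValueError("score must be within 0-100")
    if score >= 90.0:
        return "Excellent"
    if score >= 75.0:
        return "Good"
    if score >= 60.0:
        return "Fair"
    if score >= 40.0:
        return "Poor"
    return "Critical"
```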
Multi-Area Reliability (CANOS)
Coordinated Automatic Network Operating System framework for multi-area grids with:
- Zone-to-Zone LOLE: Reliability contribution from inter-area transmission
- Corridor Utilization: Flow ratios during contingencies
- Area Coordination: Synchronized outage windows to avoid cascades
Core Algorithms
Monte Carlo Simulation
GAT uses random sampling of outage scenarios, including both generator and transmission branch outages:
For N scenarios (default 500):
1. Sample generator outage probabilities (Weibull distribution)
2. Sample branch (transmission line) outages (per N-1/N-2 contingencies)
3. Sample load variation (scaling factor 0.8-1.2, i.e. ±20% around baseline)
4. Compute available generation considering network topology:
- Generators may be isolated if connecting branches are offline
- Use BFS to determine which generators can reach which loads
5. Track shortfalls: max(0, Load - Deliverable_Gen)
6. Compute LOLE = (hours_with_shortfall / N) × 8766
7. Compute EUE = sum(shortfall_mw × duration) / N
Configuration:
- scenarios: Number of Monte Carlo samples (default 500)
- outage_rate: Annual failure rate for generators
- mttr: Mean time to repair (hours)
- demand_variation: Load scaling factor (default 0.8-1.2)
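The sampling loop can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not GAT's implementation: each generator fails independently with a fixed probability `outage_prob`, network topology is ignored (every online generator counts as deliverable), and each scenario is weighted as one representative hour scaled to 8766 hours per year. The function name and signature are hypothetical.

```python
import random

HOURS_PER_YEAR = 8766  # 365.25 days x 24 hours


def estimate_lole_eue(gen_capacities_mw, outage_prob, base_load_mw,
                      scenarios=500, demand_variation=(0.8, 1.2), seed=0):
    """Estimate (LOLE in hours/year, EUE in MWh/year) by random sampling."""
    rng = random.Random(seed)
    shortfall_count = 0
    shortfall_sum_mw = 0.0
    for _ in range(scenarios):
        # A generator is available when its draw is at or above outage_prob.
        available = sum(c for c in gen_capacities_mw
                        if rng.random() >= outage_prob)
        # Scale the baseline load by a uniform demand factor.
        load = base_load_mw * rng.uniform(*demand_variation)
        shortfall = max(0.0, load - available)
        if shortfall > 0.0:
            shortfall_count += 1
            shortfall_sum_mw += shortfall
    lole = shortfall_count / scenarios * HOURS_PER_YEAR
    eue = shortfall_sum_mw / scenarios * HOURS_PER_YEAR
    return lole, eue


if __name__ == "__main__":
    lole, eue = estimate_lole_eue([60.0, 60.0, 40.0], 0.05, 100.0)
    print(f"LOLE = {lole:.1f} h/yr, EUE = {eue:.1f} MWh/yr")
```

Fixing the seed makes runs reproducible, which helps when comparing configurations against each other.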
Branch Outage Impact (v0.3)
When transmission branches are offline, generators may be unable to reach certain loads even if they have available capacity. GAT computes deliverable generation by performing a breadth-first search through the network graph, only traversing online branches:
# Example: 118-bus system with one line outage
# Generator A: 100 MW, connected to Bus 1
# Critical line: Bus 1 → Bus 5 (offline in this scenario)
# Load B: 80 MW, connected to Bus 50 (reachable through Bus 5)
#
# Result: Generator A cannot contribute to Load B
# Available capacity for Load B = 0 MW (not 100 MW)
# Shortfall = 80 MW (even though 100 MW generation exists)
This topology-aware approach correctly captures transmission-limited reliability.
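The reachability check can be sketched as a breadth-first search started from the load bus. The data layout here (branches as bus pairs, a set of offline branch indices, generator capacity keyed by bus) is an assumption chosen for illustration and does not reflect GAT's internal types.

```python
from collections import deque


def deliverable_generation(branches, offline, gen_mw_by_bus, load_bus):
    """Sum generator capacity that can reach load_bus over online branches.

    branches: list of (bus_a, bus_b) pairs
    offline: set of indices into branches that are out of service
    gen_mw_by_bus: mapping of bus -> generator capacity in MW
    """
    # Build an undirected adjacency list from online branches only.
    adjacency = {}
    for idx, (a, b) in enumerate(branches):
        if idx in offline:
            continue
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    # Breadth-first search outward from the load bus.
    seen = {load_bus}
    queue = deque([load_bus])
    while queue:
        bus = queue.popleft()
        for nxt in adjacency.get(bus, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)

    return sum(mw for bus, mw in gen_mw_by_bus.items() if bus in seen)


if __name__ == "__main__":
    # Reduced version of the example: Bus 1 -> Bus 5 -> Bus 50.
    branches = [(1, 5), (5, 50)]
    available = deliverable_generation(branches, {0}, {1: 100.0}, 50)
    print(max(0.0, 80.0 - available))  # shortfall with the critical line out
```

Searching from the load side means a single traversal answers the question for every generator at once.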
Deliverability Score Computation
score = 100 × [1 - w_lole × (LOLE/LOLE_max)
- w_voltage × (violations/max_violations)
- w_thermal × (overloads/max_overloads)]
Parameters:
- LOLE_max: Threshold LOLE (hours/year, default 5.0)
- voltage_weight: Relative importance (default 0.4)
- thermal_weight: Relative importance (default 0.6)
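A direct transcription of the score formula, as a sketch. The defaults for `lole_max`, `w_voltage`, and `w_thermal` come from the parameter list; `w_lole`, `max_violations`, and `max_overloads` have no documented defaults here, so they are required arguments. Clamping the result to 0-100 is an assumption made so the score stays in its stated range.

```python
def deliverability_score(lole, violations, overloads, *, w_lole,
                         max_violations, max_overloads,
                         lole_max=5.0, w_voltage=0.4, w_thermal=0.6):
    """Composite 0-100 score from LOLE, voltage violations, and overloads."""
    penalty = (w_lole * (lole / lole_max)
               + w_voltage * (violations / max_violations)
               + w_thermal * (overloads / max_overloads))
    # Assumption: clamp so that large penalties cannot push the score below 0.
    return max(0.0, min(100.0, 100.0 * (1.0 - penalty)))


if __name__ == "__main__":
    print(deliverability_score(2.5, 0, 0, w_lole=0.5,
                               max_violations=10, max_overloads=10))
```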
N-k Contingency Analysis with LODF/PTDF (v0.4.0)
For N-k analysis with k ≥ 2, the combinatorial explosion makes exhaustive power flow infeasible. GAT uses Line Outage Distribution Factors (LODFs) and Power Transfer Distribution Factors (PTDFs) for efficient pre-screening.
Key Concepts
- PTDF (Power Transfer Distribution Factor): Sensitivity of branch flow to bus injection. PTDF[ℓ,n] = ∂f_ℓ/∂P_n (change in flow on branch ℓ per MW injected at bus n)
- LODF (Line Outage Distribution Factor): Redistribution of flow when a branch trips. LODF[ℓ,m] = flow increase on branch ℓ when branch m is outaged, as a fraction of the pre-outage flow on branch m.
Screening Algorithm
For N-k analysis:
1. Pre-compute PTDF and LODF matrices (one-time O(n³) cost)
2. For each contingency combination, estimate post-contingency flows using LODFs
3. Flag combinations where estimated flows exceed 90% of limits
4. Run full DC power flow only on flagged cases (~1-5% of total)
5. Rank flagged contingencies by Expected Unserved Energy (EUE)
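Steps 2 and 3 can be sketched for k = 2 as follows. This uses one common DC formulation for double outages, in which the interaction between the two outaged branches is resolved with a 2×2 solve before their flows are redistributed; whether GAT uses this exact formulation is not stated here. The `lodf[l][m]` indexing, the function name, and the choice to flag near-singular pairs (which typically indicate islanding) are all assumptions.

```python
from itertools import combinations


def screen_n2(flows, limits, lodf, threshold=0.9):
    """Return branch pairs whose estimated post-outage loading on any
    surviving branch exceeds threshold * limit.

    flows: pre-contingency branch flows (MW)
    limits: branch limits (MW)
    lodf: square matrix, lodf[l][m] = share of branch m's flow picked
          up by branch l when m is outaged
    """
    flagged = []
    n = len(flows)
    for m1, m2 in combinations(range(n), 2):
        # Solve [[1, -a], [-b, 1]] x = [f_m1, f_m2] for the effective flows.
        a, b = lodf[m1][m2], lodf[m2][m1]
        det = 1.0 - a * b
        if abs(det) < 1e-9:
            flagged.append((m1, m2))  # assumption: treat as needing full study
            continue
        eff1 = (flows[m1] + a * flows[m2]) / det
        eff2 = (flows[m2] + b * flows[m1]) / det
        for l in range(n):
            if l in (m1, m2):
                continue
            post = flows[l] + lodf[l][m1] * eff1 + lodf[l][m2] * eff2
            if abs(post) > threshold * limits[l]:
                flagged.append((m1, m2))
                break
    return flagged
```

Only the pairs returned here would go on to a full DC power flow, which is where the savings over exhaustive evaluation come from.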
Probabilistic Ranking with EUE
Contingencies are ranked by Expected Unserved Energy (EUE), which combines:
- Outage probability: Product of individual component failure rates
- Load not served: MW curtailed during the contingency
- Duration: Expected time to restore (MTTR)
EUE = P(outage) × Load_curtailed × MTTR
This prioritizes contingencies that are both likely and impactful.
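As a small worked sketch of this ranking, assuming each contingency is described by a hypothetical dictionary with `failure_rates`, `load_curtailed_mw`, and `mttr_hours` keys (the key names are illustrative, not GAT's schema):

```python
from math import prod


def contingency_eue(failure_rates, load_curtailed_mw, mttr_hours):
    """EUE = P(outage) x load curtailed x MTTR, with P(outage) taken as
    the product of the individual component failure rates."""
    return prod(failure_rates) * load_curtailed_mw * mttr_hours


def rank_by_eue(contingencies):
    """Sort contingencies from highest to lowest EUE."""
    return sorted(
        contingencies,
        key=lambda c: contingency_eue(c["failure_rates"],
                                      c["load_curtailed_mw"],
                                      c["mttr_hours"]),
        reverse=True,
    )


if __name__ == "__main__":
    cases = [
        {"name": "double outage", "failure_rates": [0.01, 0.02],
         "load_curtailed_mw": 100.0, "mttr_hours": 10.0},
        {"name": "single outage", "failure_rates": [0.05],
         "load_curtailed_mw": 20.0, "mttr_hours": 4.0},
    ]
    for case in rank_by_eue(cases):
        print(case["name"])
```

In this example the single outage ranks first: it curtails less load, but its much higher probability outweighs the larger impact of the double outage.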
Performance
- PTDF/LODF computation: O(n³) where n = number of buses (one-time)
- Screening: O(C(m,k)) where m = branches, k = contingency order
- Full evaluation: Only ~1-5% of screened cases require full power flow
For IEEE 118-bus network with 186 branches:
- N-1: 186 contingencies → all evaluated
- N-2: 17,205 combinations → ~500-800 flagged for full evaluation
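The contingency counts follow directly from the binomial coefficient C(m, k). A quick check, with the helper name `contingency_count` chosen only for this example:

```python
from math import comb


def contingency_count(branches: int, k: int) -> int:
    """Number of distinct N-k branch outage combinations."""
    return comb(branches, k)


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, contingency_count(186, k))
```

For 186 branches this gives 186 for N-1 and 17,205 for N-2, and it grows to 1,055,240 for N-3, which is why screening matters more as k increases.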
Usage Examples
Basic Reliability Calculation
# Compute LOLE/EUE for a network
Output includes:
- lole (hours/year)
- eue (MWh/year)
- scenarios_analyzed (count)
- scenarios_with_shortfall (count)
- average_shortfall (MW)
Multi-Area Reliability Analysis
# Evaluate zone-to-zone LOLE and corridor utilization
Output per area:
- area_id
- area_lole (hours/year for that zone)
- zone_to_zone_lole (contribution from other areas)
- Corridor utilization (0-100%)
Deliverability Score Assessment
# Compute composite reliability score
Output:
- score (0-100)
- status (Excellent/Good/Fair/Poor/Critical)
- lole (hours/year)
- Component breakdown
Sensitivity Analysis
# Analyze LOLE vs. capacity margin
for; do
done
Integration with ADMS Operations
FLISR Impact on Reliability
FLISR operations reduce LOLE by restoring load during outages:
Usage:
# Track FLISR effectiveness
VVO with Reliability Constraints
Volt-Var Optimization respects minimum deliverability scores:
# VVO that maintains 80+ reliability score
The optimizer reduces losses while keeping the score above the threshold. If the score drops below 80, it shifts weight toward reliability (0.1 loss weight) and away from loss minimization.
Maintenance Scheduling with Multi-Area Coordination
Schedule outages to minimize peak LOLE impact:
# Plan maintenance windows with coordination
Ensures:
- No two neighboring areas are scheduled on the same day
- Peak LOLE during worst maintenance window ≤ threshold
- ≤ 15% EUE reduction from coordinated scheduling
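The neighbor constraint alone can be sketched as a greedy assignment: each area takes the earliest day that none of its already-scheduled neighbors uses. This sketch covers only that one constraint, not the LOLE threshold or EUE targets, and the function name and inputs are hypothetical rather than GAT's scheduler.

```python
def schedule_maintenance(areas, neighbors):
    """Assign each area the earliest day not used by a neighboring area.

    areas: iterable of area ids, in scheduling priority order
    neighbors: mapping of area id -> iterable of adjacent area ids
    Returns a mapping of area id -> day index (0-based).
    """
    day_of = {}
    for area in areas:
        # Days already claimed by neighbors that have been scheduled.
        taken = {day_of[n] for n in neighbors.get(area, ()) if n in day_of}
        day = 0
        while day in taken:
            day += 1
        day_of[area] = day
    return day_of


if __name__ == "__main__":
    neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
    print(schedule_maintenance(["A", "B", "C"], neighbors))
```

Greedy assignment does not guarantee the fewest days overall, but it always produces a schedule in which no two neighbors share a day.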
Test Data
The crate includes comprehensive test cases validating reliability calculations:
- test_nerc_lole_benchmark_range: Validates against NERC standards (LOLE should be 0-8766 hours/year)
- test_capacity_margin_effect: Higher capacity → lower LOLE
- test_deliverability_score_range: Score always 0-100
- test_multiarea_zone_to_zone_lole: Zone-to-zone contributions >= 0
- test_corridor_utilization_tracking: Flow ratios stay 0-100%
- test_branch_outage_impact: Branch outages correctly reduce available generation
Run with:
Implementation Details
Outage Scenario Generation
Generated using:
- Generator failures: Weibull(shape=1.2, scale=0.02) annual rate
- Branch failures: Included for N-1/N-2 contingencies via realistic topology modeling
- Demand variations: Uniform [0.8, 1.2] × baseline
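The two random draws can be sketched with Python's standard library. Note that `random.weibullvariate(alpha, beta)` takes the scale first and the shape second. How GAT converts a drawn rate into an in-service or out-of-service state is not specified here, so this sketch stops at the raw samples; the function name is hypothetical.

```python
import random


def sample_scenario(n_generators, rng, shape=1.2, scale=0.02,
                    demand_range=(0.8, 1.2)):
    """Draw per-generator annual outage rates and one demand factor."""
    # weibullvariate(alpha, beta): alpha is the scale, beta is the shape.
    outage_rates = [rng.weibullvariate(scale, shape)
                    for _ in range(n_generators)]
    demand_factor = rng.uniform(*demand_range)
    return outage_rates, demand_factor


if __name__ == "__main__":
    rates, demand = sample_scenario(3, random.Random(42))
    print(rates, demand)
```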
Topology-Aware Generation Calculation
For each scenario, available generation is computed as the sum of capacity from generators that can reach at least one load through available (online) branches. This uses a breadth-first search to traverse the network graph, respecting branch outage status.
Multi-Area Coordination
The MultiAreaSystem maintains:
- Areas: Independent sub-networks with separate LOLE
- Corridors: Transmission ties with flow limits (MVA)
- Coordination constraints: No two neighbors can be down simultaneously
Zone-to-zone LOLE = LOLE contribution when one area fails and must rely on others.
Performance Considerations
- Memory: O(N scenarios × buses × branches)
- Time: O(N × AC_OPF_iterations) per evaluation
- Parallelism: Rayon work-stealing over scenarios
For 859,800 PFDelta instances (IEEE 14/30/57/118-bus cases):
- 500 scenarios × 118 buses ≈ 59k power flows
- ~500ms per case on 16-core system = ~8 hours full suite
- Use --max-cases N to sample a subset
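The runtime figure can be checked with simple arithmetic. This sketch assumes the ~500 ms is per-core time and that work spreads evenly across all 16 cores; under that reading the estimate lands just under 8 hours. The helper name is illustrative only.

```python
def suite_hours(instances, seconds_per_case, cores):
    """Wall-clock hours, assuming perfect parallel scaling across cores."""
    return instances * seconds_per_case / cores / 3600


if __name__ == "__main__":
    print(f"{suite_hours(859_800, 0.5, 16):.2f} hours")
```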
References
- CIM Standard: IEC 61970-301 (CIM 3.0)
- NERC Standards: PJM MISO interconnection frequency standards
- CANOS: "Coordinated Automatic Network Operating System", TPWRS 2007
- Crate: crates/gat-algo/src/reliability_monte_carlo.rs
- Integration: crates/gat-adms/src/reliability_integration.rs