Building a Resilient Electric Grid

Resilient and reliable operation of cyber-physical systems (CPS) of societal importance, such as Smart Electric Grids, is one of the top national priorities. Recent blackouts and Hurricane Sandy in 2012 demonstrated the grid's vulnerability and gave reason to examine existing defense mechanisms more closely. Studies by the North American Electric Reliability Corporation (NERC) state that relay or automatic control misoperations can account for nearly all major system events. Moreover, the effects of failures in protection system components, protection settings, software tools, and human decisions on the physical components of the power system are not captured. In the absence of a system-wide integrated fault model, faults are identified by directly observing the associated anomaly or a set of anomalies as part of a pattern. However, this technique fails when a large number of alarms occur within a short time period. It has been noted that in transmission systems this quickly overwhelms utility operators with alarms.

The current research gap lies in developing efficient models and tools for performing fault diagnostics and predicting the progression of failure cascades. With the growing importance of efficient power generation and stable power supply, power transmission and distribution systems are changing rapidly. Integrating renewable resources and distributed generation into the distribution network has the potential to make the system more efficient and resilient, but only if the control and protection challenges can be met. The key enabler for solving this problem is the introduction of components such as phasor measurement units (PMUs) that provide local snapshots of the system state to a central control authority. While these new sources provide richer data sets, fusing all the available data is becoming even more challenging.

Our approach to solving this challenge is to use a mix of data-driven and model-based techniques. Primarily, we use a discrete event model that captures the causal and temporal relationships between failure modes (causes) and discrepancies (effects) in a system, thereby modeling the failure cascades while taking into account propagation constraints imposed by operating modes, protection elements, and timing delays. This formalism is called the Temporal Causal Diagram (TCD). It is based on prior work on Timed Failure Propagation Graphs (TFPG), and it can model the effects of faults and protection mechanisms as well as incorporate fine-grain, physics-based diagnostics into an integrated, system-level diagnostics scheme. The uniqueness of the approach is that it does not require complex real-time computations over high-fidelity models; instead, it reasons with efficient graph algorithms based on the anomalies observed in the system. When fine-grain results are needed and computing resources and time are available, the diagnostic hypotheses can be refined with the help of the physics-based diagnostics.
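To make the idea of reasoning over timed failure propagation concrete, here is a minimal sketch in Python. It is not the TCD implementation from the papers below. The node names, delay values, and the `onset_window` helper are hypothetical, and the sketch assumes OR semantics (a discrepancy activates as soon as any incoming propagation arrives) while ignoring operating modes and protection behavior. Given a set of timestamped alarms, it checks whether a candidate failure mode can explain all of them and, if so, returns the range of fault onset times consistent with the edge delays.

```python
import heapq

# Hypothetical toy graph: (source, target) -> (min_delay, max_delay) in seconds.
EDGES = {
    ("FM_line_fault", "D_overcurrent"): (0.0, 0.1),
    ("D_overcurrent", "D_breaker_trip"): (0.05, 0.5),
    ("FM_relay_stuck", "D_breaker_trip"): (0.0, 0.2),
    ("D_breaker_trip", "D_line_overload"): (1.0, 5.0),
}

def shortest(source, index):
    """Dijkstra over edge delays; index 0 uses min delays, index 1 uses max delays."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for (src, dst), delays in EDGES.items():
            if src != node:
                continue
            nd = d + delays[index]
            if nd < dist.get(dst, float("inf")):
                dist[dst] = nd
                heapq.heappush(heap, (nd, dst))
    return dist

def onset_window(failure_mode, alarms):
    """Return (earliest, latest) fault onset times consistent with every alarm,
    or None if the failure mode cannot explain the observations."""
    lo, hi = shortest(failure_mode, 0), shortest(failure_mode, 1)
    start, end = float("-inf"), float("inf")
    for discrepancy, t in alarms.items():
        if discrepancy not in lo:
            return None  # alarm is not reachable from this failure mode
        start = max(start, t - hi[discrepancy])
        end = min(end, t - lo[discrepancy])
    return (start, end) if start <= end else None
```

For example, with alarms `{"D_overcurrent": 10.05, "D_breaker_trip": 10.3}`, the hypothesis `FM_line_fault` yields an onset window of roughly 9.95 s to 10.05 s, while `FM_relay_stuck` is rejected because it cannot produce the overcurrent alarm. The reasoning involves only shortest-path computations on the graph, with no simulation of the underlying physics.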

This approach differs from existing practice, where fault analysis and mitigation rely on logic-based schemes that use hard thresholds and local information, often ignoring system-level effects introduced by the distributed control algorithms. This often leads to scenarios where a local mitigation in a subsystem, especially when protection devices malfunction, results in a larger fault cascade that leads to a blackout. Our approach to fault management, if successful, will improve the effectiveness of isolating failures in large-scale systems such as Smart Electric Grids by identifying impending failure propagations and determining the time to critical failure, which can increase system reliability and reduce the losses incurred due to power failures.
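The notion of time to critical failure can be illustrated with a small sketch. The graph, node names, and delays below are hypothetical and are not taken from the cited work. The sketch assumes an acyclic propagation graph, OR semantics at every node, and no mitigation action that interrupts the cascade. Starting from a discrepancy that is already active, it bounds how soon a designated critical node could activate.

```python
from functools import lru_cache

# Hypothetical propagation edges: node -> list of (successor, min_delay_s, max_delay_s).
GRAPH = {
    "D_breaker_misoperation": [("D_line_overload", 1.0, 5.0)],
    "D_line_overload": [("D_second_line_trip", 10.0, 60.0), ("D_voltage_sag", 2.0, 8.0)],
    "D_voltage_sag": [("D_blackout", 30.0, 120.0)],
    "D_second_line_trip": [("D_blackout", 5.0, 20.0)],
    "D_blackout": [],
}

@lru_cache(maxsize=None)
def time_to(node, critical="D_blackout"):
    """Return (earliest, latest) seconds until `critical` activates,
    or None if it is unreachable from `node`."""
    if node == critical:
        return (0.0, 0.0)
    bounds = []
    for successor, tmin, tmax in GRAPH[node]:
        rest = time_to(successor, critical)
        if rest is not None:
            bounds.append((tmin + rest[0], tmax + rest[1]))
    if not bounds:
        return None
    # Under OR semantics the first arriving propagation wins, so both bounds
    # are minima over the outgoing branches.
    return (min(b[0] for b in bounds), min(b[1] for b in bounds))
```

Calling `time_to("D_breaker_misoperation")` on this toy graph returns `(16.0, 85.0)`: the blackout node could activate as early as 16 s and no later than 85 s after the misoperation is observed, which is the window available for a mitigation action.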

See the following papers for details: [1], [2], [3], [4], [5], [6].


  1. S. Hasan, A. Dubey, G. Karsai, and X. Koutsoukos, A game-theoretic approach for power systems defense against dynamic cyber-attacks, International Journal of Electrical Power & Energy Systems, vol. 115, 2020.
  2. A. Chhokra, A. Dubey, N. Mahadevan, S. Hasan, and G. Karsai, Diagnosis in Cyber-Physical Systems with Fault Protection Assemblies, in Diagnosability, Security and Safety of Hybrid Dynamic and Cyber-Physical Systems, M. Sayed-Mouchaweh, Ed. Cham: Springer International Publishing, 2018, pp. 201–225.
  3. A. Chhokra, A. Dubey, N. Mahadevan, G. Karsai, D. Balasubramanian, and S. Hasan, Hierarchical Reasoning about Faults in Cyber-Physical Energy Systems using Temporal Causal Diagrams, International Journal of Prognostics and Health Management, vol. 9, no. 1, Feb. 2018.
  4. A. Chhokra, A. Kulkarni, S. Hasan, A. Dubey, N. Mahadevan, and G. Karsai, A Systematic Approach of Identifying Optimal Load Control Actions for Arresting Cascading Failures in Power Systems, in Proceedings of the 2nd Workshop on Cyber-Physical Security and Resilience in Smart Grids, SPSR-SG@CPSWeek 2017, Pittsburgh, PA, USA, April 21, 2017, pp. 41–46.
  5. S. Hasan et al., A simulation testbed for cascade analysis, in IEEE Power & Energy Society Innovative Smart Grid Technologies Conference, ISGT 2017, Washington, DC, USA, April 23-26, 2017, pp. 1–5.
  6. N. Mahadevan, A. Dubey, A. Chhokra, H. Guo, and G. Karsai, Using temporal causal models to isolate failures in power system protection devices, IEEE Instrum. Meas. Mag., vol. 18, no. 4, pp. 28–39, 2015.