7 minute read 13 May 2020

Understand model risk management for AI and machine learning

Authors
Gagan Agarwala

EY Americas Financial Services Advisory Principal

Leads the Enterprise Control Transformation practice. Soccer and cricket fan.

Alejandro Latorre

EY Americas Financial Services Advisory Principal

Principal in the FSO Risk Management Advisory practice of Ernst & Young LLP

Susan Raffel

EY Americas Financial Services Advisory Partner

Partner in the Advisory Services practice of Ernst & Young LLP


The risks of AI/ML models can be difficult to identify. Enhancing MRM can help firms leverage the power of AI/ML to solve complex problems.

Sound risk management of artificial intelligence (AI) and machine learning (ML) models enhances stakeholder trust by fostering responsible innovation. Responsible innovation requires an effective governance framework at inception and throughout the AI/ML model life cycle to achieve proper coverage of risks.

Effective model risk management (MRM) is part of a broader four-step process to accelerate the adoption of AI/ML by creating stakeholder trust and accountability through proper governance and risk management. These steps include:

  • Developing an enterprise-wide AI/ML model definition to identify AI/ML risks
  • Enhancing existing risk management and control frameworks to address AI/ML-specific risks
  • Implementing an operating model for responsible AI/ML adoption
  • Investing in capabilities that support AI/ML adoption and risk management

Effective MRM can further enhance trust in AI/ML by embedding supervisory expectations throughout the AI/ML life cycle to better anticipate risks and reduce harm to customers and other stakeholders. This entails holding model owners and developers accountable for deploying models that are conceptually sound, thoroughly tested, well-controlled and appropriate for their intended use.

Regulatory perspective

US banking regulatory agencies are closely monitoring developments related to AI/ML. In their messaging and supervisory posture, US regulators are seeking to balance the benefits associated with innovation against the downside risks. The balancing act they are striking is most evident in recent guidance regarding the use of alternative data in consumer credit.

Financial services firms can in many respects leverage existing MRM processes, such as risk assessment, validation and ongoing monitoring, to address AI/ML-specific risks and align with supervisory expectations because the risks of AI/ML models are similar to those of more traditional modeling techniques. Nevertheless, four aspects of AI/ML will likely require additional investment to align with current expectations. These include the growth in diverse use cases (e.g., document intelligence, advertising/marketing), reliance on high-dimensional data and feature engineering, model opacity, and dynamic training.

These aspects of AI/ML will require greater investment in data governance and infrastructure and key elements of model life cycle risk management, including model definition, development and validation, change management and ongoing monitoring. These aspects will also require tighter linkage among the MRM framework, data governance and other risk management frameworks such as privacy, information security and third-party risk management.

  • Unique features of AI/ML models

    The problems to be learned are complex

    • The problems solved by ML models are typically more complex and nonlinear than those addressed by traditional techniques

    Model specification is not formulated explicitly

    • ML models are generally fitting data in high-dimensional spaces (connected or disconnected), and representing such data in a manner that is understandable to humans is intractable or impossible

    Model training process

    • Training data is typically high-dimensional, semi-structured or unstructured, and voluminous
    • Complex training methods, such as online training, may be required
    • Extrapolation is hard to detect and avoid
    • Limited training data results in low-density regions in training space
    • Feature space is not uniform; more training data is needed for heterogeneous regions
    • Hard to define extrapolation vs. interpolation in the input data space
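Because extrapolation is hard to define precisely in a high-dimensional input space, one practical heuristic is to compare a new point's distance to its nearest training examples against the same statistic computed in-sample. The sketch below is illustrative only (plain NumPy, synthetic data, an arbitrary choice of k); a real deployment would use approximate nearest-neighbor indexes and domain-specific thresholds.

```python
import numpy as np

def knn_distance(X_ref, X_query, k, skip_self=False):
    # Pairwise Euclidean distances, then the mean over the k nearest references.
    d = np.linalg.norm(X_query[:, None, :] - X_ref[None, :, :], axis=2)
    d.sort(axis=1)
    cols = slice(1, k + 1) if skip_self else slice(0, k)
    return d[:, cols].mean(axis=1)

def extrapolation_scores(X_train, X_new, k=5):
    # Scale by the in-sample median so a score near 1 means "looks like training data".
    baseline = np.median(knn_distance(X_train, X_train, k, skip_self=True))
    return knn_distance(X_train, X_new, k) / baseline

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 3))            # dense training region
X_new = np.vstack([np.zeros((1, 3)),           # interior point
                   np.full((1, 3), 6.0)])      # far outside the training support
scores = extrapolation_scores(X_train, X_new)  # second score dwarfs the first
```

A score far above 1 flags a low-density region where the model's output should be treated with caution.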

    Transformation logic: linkage between input and output

    • Large number of variables used may create interpolation and/or extrapolation issues
    • Input may impact output through a nonlinear and/or non-monotonic relationship
    • Inputs may jointly impact the output, i.e., interaction terms exist
    • Inputs may be correlated (may lead to implicit bias, language terms can be correlated to gender, certain habits are correlated to geographic location)
    • Use of nontraditional sources of data may increase
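The correlated-inputs point can be made concrete with a toy simulation: even when a protected attribute is excluded from the inputs, a correlated proxy feature can transmit its effect into the fitted model. This is a hypothetical NumPy sketch with synthetic data, not a real credit example; all variable names and coefficients are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Protected attribute: never given to the model.
gender = rng.integers(0, 2, n).astype(float)
# Proxy feature (e.g., language terms or location) correlated with it.
proxy = gender + rng.normal(scale=0.3, size=n)
# Historically biased outcome that depends on the protected attribute.
y = 1.0 * gender + rng.normal(scale=0.1, size=n)

# Ordinary least squares of y on the proxy alone (intercept + slope).
X = np.column_stack([np.ones(n), proxy])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
slope = beta[1]  # materially positive: the bias leaks in through the proxy
```

Dropping the protected attribute from the feature set is therefore not sufficient to rule out implicit bias; the proxy relationships themselves need review.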

    Emerging challenges

    • Adversarial attacks that perturb data in ways imperceptible to humans but recognizable by machines
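The adversarial-perturbation idea can be illustrated on a toy linear classifier: stepping each input feature a small amount against the sign of the model's weights (the fast-gradient-sign heuristic) can flip the predicted class even though no single feature changes by more than a tiny budget. The weights and input below are invented for illustration and stand in for a trained model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A toy linear classifier standing in for a trained model.
w = np.array([2.0, -3.0, 1.5, 0.5])
b = 0.1
x = np.array([0.2, 0.1, -0.1, 0.3])   # classified as the positive class

# Fast-gradient-sign-style step: move each feature by at most eps
# in the direction that lowers the model's score.
eps = 0.1
x_adv = x - eps * np.sign(w)

p_clean = sigmoid(w @ x + b)          # above 0.5
p_adv = sigmoid(w @ x_adv + b)        # below 0.5: the prediction flips
```

The perturbation has max-norm 0.1 per feature, yet the classification changes, which is why adversarial robustness testing belongs in AI/ML validation.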

The risks of AI/ML models

Like traditional models, such as logistic regression, AI/ML models, such as deep learning (DL), can expose a firm to risk because they can lead to adverse consequences and poor decisions if a model has errors in its design or construction, performs poorly or is used inappropriately. While the risks of AI/ML models are qualitatively similar to those of traditional models, the reliance on high-dimensional data, dynamic retraining, the opacity of the transformation logic and feature engineering can lead to unexpected results and make risks more difficult to identify and assess.

As with traditional models, poor performance can arise from implementation errors, including those related to calibration and poor data quality. In the case of AI/ML, the complexity of the model makes it more difficult to assess whether the results of the model can be generalized beyond the data used for training. The results may not be generally applicable if the model underfits or overfits the data in relation to a set of performance criteria. 


Underfitting means that the model does not capture the data “well” in sample relative to the performance criteria. Overfitting means that the model fits the training data “too well” relative to a set of performance criteria and exhibits poor prediction performance when tested out of sample. As discussed below, poor data availability or quality can undermine model fit and lead to sampling bias and lack of fairness.
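The distinction shows up clearly in a small polynomial-regression experiment: against a noisy nonlinear signal, a degree-1 fit is too rigid to capture the pattern (underfitting), while a very high-degree fit drives in-sample error down by chasing noise, and its out-of-sample performance typically suffers (overfitting). This is an illustrative NumPy sketch with synthetic data; the degrees and noise level are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.sort(rng.uniform(-1, 1, 30))
y = np.sin(3 * x) + rng.normal(scale=0.15, size=30)  # noisy nonlinear signal
x_test = np.linspace(-0.95, 0.95, 200)
y_test = np.sin(3 * x_test)

def fit_mse(degree):
    # Least-squares polynomial fit; report in-sample and out-of-sample MSE.
    c = np.polyfit(x, y, degree)
    train = np.mean((np.polyval(c, x) - y) ** 2)
    test = np.mean((np.polyval(c, x_test) - y_test) ** 2)
    return train, test

tr_under, te_under = fit_mse(1)   # underfit: high error in and out of sample
tr_good, te_good = fit_mse(5)     # captures the signal
tr_over, te_over = fit_mse(15)    # low train error; generalization typically worse
```

Comparing in-sample and out-of-sample error against the performance criteria is the basic check that a model's results generalize beyond its training data.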

Also like traditional models, AI/ML models can be used inappropriately, giving rise to unintended consequences. The model result should be relevant and informative in understanding whether the desired business outcome is achieved. Risk can arise because the goal as defined by the algorithm is not clearly aligned to the real-world business problem statement.

The intended use of the model also may not align with real-world applications due to issues noted later regarding data availability, quality and representativeness. As a result, the informativeness of the output to the business decision may be overstated. Alternatively, the business goal that the algorithm quantifies may be aligned to the business problem, but it may not account for all relevant considerations, which can lead to unintended consequences, such as a lack of fairness.

  • AI/ML vs. traditional statistical models

    Model methodology

    AI/ML

    • Probability theory, structured framework and engineering experience
    • High-dimensional, non-linear
    • Large number of data attributes
    • Unstructured data

    Traditional statistical

    • Clear stochastic or statistical theory
    • Linear or transformed to non-linear
    • Limited explanatory variables/factors
    • Quantitative method and business judgment           

    Data

    AI/ML

    • High-dimensional data
    • Data preparation can be tedious and costly

    Traditional statistical

    • Low-dimensional or structured data
    • Measurable data quality standard

    Model calibration/training

    AI/ML

    • Dependent on multiple factors
    • Several methods needed to address underfitting and overfitting

    Traditional statistical

    • Standardized calibration procedures
    • Closed-form or semi-closed form formulas

    Implementation

    AI/ML

    • Open source and vendor algorithms and libraries
    • Difficult model replication
    • Higher demand on infrastructure

    Traditional statistical

    • In-house and vendor solutions
    • Tractable replication
    • Lower demand on infrastructure

    Performance assessment

    AI/ML

    • Model output is hard to attribute
    • Stability assessment can be difficult or infeasible
    • Theoretical performance tests can be difficult to apply or non-existent

    Traditional statistical

    • Well-established statistical measures
    • Explanatory variables/factors and attribution of results are easily analyzed

    Ongoing monitoring

    AI/ML

    • KPIs more difficult to determine
    • Models frequently retrained to handle population shifts

    Traditional statistical

    • Well-established KPIs
    • Models infrequently retrained
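For ongoing monitoring, one widely used KPI for detecting the population shifts noted above is the Population Stability Index (PSI), which compares a baseline score distribution from model development with the current production distribution. The sketch below uses NumPy with synthetic normal scores; the 0.1/0.25 thresholds are a common industry rule of thumb, not a regulatory standard.

```python
import numpy as np

def psi(expected, actual, bins=10):
    # Bin by baseline quantiles, then sum (a - e) * ln(a / e) over bins.
    edges = np.quantile(expected, np.linspace(0.0, 1.0, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf   # capture out-of-range scores
    e = np.histogram(expected, edges)[0] / len(expected)
    a = np.histogram(actual, edges)[0] / len(actual)
    e = np.clip(e, 1e-6, None)              # avoid log(0) on empty bins
    a = np.clip(a, 1e-6, None)
    return float(np.sum((a - e) * np.log(a / e)))

rng = np.random.default_rng(3)
baseline = rng.normal(0.0, 1.0, 10_000)     # development-time scores
stable = rng.normal(0.0, 1.0, 10_000)       # production, same population
shifted = rng.normal(0.8, 1.0, 10_000)      # production after a drift
# Rule of thumb: PSI < 0.1 stable; 0.1-0.25 watch; > 0.25 significant shift
```

A rising PSI is one trigger for the more frequent retraining that AI/ML models require, and for escalation under the MRM framework.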

Governance, policies and controls

The oversight of AI/ML models should be consistent with the processes used for traditional models. Oversight by the board and senior management remains important. They should be aware of use cases being employed and understand the effectiveness of governance and controls used in the AI/ML model life cycle.

Roles and responsibilities for model developers, users and validators, and other control functions should be clearly articulated to achieve ownership and accountability for risks. Internal audit will also need to remain engaged to give assurance that the MRM framework and related controls are effective for AI/ML models.

Nevertheless, several enhancements to policies and procedures should consider the dynamic and integrated risks associated with AI/ML. MRM policies should explicitly reference how other risk and control requirements (e.g., information security) apply where appropriate. That will give AI/ML model developers clarity on all requirements needed to get models approved and help control functions understand how their responsibilities are allocated. Procedures associated with enhanced capabilities and their relationships to other policies should also be well documented.

Summary

Artificial intelligence and machine learning models offer unique advantages compared to traditional statistical models, but they also present unique challenges related to risk management. Incorporating sound model risk management and embedding regulatory considerations into the design of AI/ML is critical to building trust.
