Considerations for AI Use in the Drug Product Life Cycle - FDA Guidance

The use of Artificial Intelligence (AI) in the drug development lifecycle is rapidly increasing, offering the potential to speed up the process and enhance patient care. However, the unique challenges that AI presents, such as potential biases and lack of transparency, necessitate a structured approach to ensure the reliability of AI-driven results. The FDA has proposed a risk-based credibility assessment framework to address these concerns, providing a pathway for sponsors to demonstrate the trustworthiness of AI models used in regulatory decision-making. This framework, outlined in their draft guidance, is a multi-step process designed to evaluate the credibility of AI models used to produce information or data intended to support regulatory decisions about drugs and biological products. Here’s a breakdown of the key steps:

Step 1. Define the Question of Interest:

The first step involves clearly stating the specific question, decision, or concern that the AI model is intended to address. This step ensures that the AI model has a well-defined purpose. For example, in clinical development, a question of interest could be: “Which participants can be considered low risk and do not need inpatient monitoring after dosing?”. In manufacturing, the question might be: “Do vials of Drug B meet established fill volume specifications?”

Step 2. Define the Context of Use (COU):

The COU defines the specific role and scope of the AI model in addressing the question of interest. It should detail what will be modeled and how model outputs will be used. The description of the COU should also state whether other information, such as clinical or animal studies, will be used alongside the model output. For instance, will the AI model be the sole determinant of a decision, or is it just one input among others? For example, an AI model might be used to predict a participant’s risk for an adverse drug reaction, which would then be used to determine whether the participant requires inpatient monitoring. In manufacturing, an AI model may be used to analyze images of vials, but the final decision on whether a batch is released might include independent volume verification tests.

Step 3. Assess the AI Model Risk:

This step involves assessing the risk associated with the AI model, which is a combination of model influence and decision consequence. Model influence refers to the contribution of the evidence derived from the AI model relative to other contributing evidence used to inform the question of interest. How much weight is the model given in the overall decision? Decision consequence describes the significance of an adverse outcome resulting from an incorrect decision concerning the question of interest. What is the potential impact if the AI model makes an error? The model risk is not about the risk intrinsic to the AI model, but the risk that the output may lead to an incorrect decision that has an adverse outcome. For example, a model with a high influence (it is the sole determinant) and a high decision consequence (a patient could have a life-threatening reaction if the model is incorrect) has a high model risk. A model with a low influence (it is just one input), but a high decision consequence (the impact on product quality if the AI model provides an incorrect measurement) has a medium model risk.
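The two-dimensional risk assessment described above can be sketched as a simple lookup. This is purely illustrative: the guidance describes model risk qualitatively, and the `model_risk` function and its three-level scale are assumptions of this sketch, not an FDA-defined formula.

```python
def model_risk(influence: str, consequence: str) -> str:
    """Combine model influence and decision consequence into a model risk level.

    Both inputs are expected to be "low", "medium", or "high".
    Hypothetical encoding of the guidance's qualitative examples.
    """
    levels = {"low": 0, "medium": 1, "high": 2}
    i, c = levels[influence], levels[consequence]
    # Risk grows with both influence and consequence:
    # e.g. high influence + high consequence -> high risk,
    #      low influence + high consequence  -> medium risk.
    matrix = [
        ["low",    "low",    "medium"],  # low influence
        ["low",    "medium", "high"],    # medium influence
        ["medium", "high",   "high"],    # high influence
    ]
    return matrix[i][c]
```

The two examples in the text map directly onto cells of this matrix: a sole-determinant model with life-threatening consequences lands in the high-risk corner, while a low-influence model with high consequence lands at medium.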

Step 4. Develop a Plan to Establish AI Model Credibility:

Based on the risk assessment, sponsors must develop a plan to establish the credibility of the AI model. This plan, called a credibility assessment plan, should detail how the model was developed and how its performance will be evaluated. The plan should include:

* Description of the model: the model's architecture, inputs, outputs, features, parameters, and the rationale for choosing that particular model.
* Description of the data used to develop the model: the data used for training and tuning the model, including how it was collected, processed, and annotated, and how it is fit for the context of use. The data should be both relevant (including key elements and representative data) and reliable (accurate, complete, and traceable).
* Description of model training: how the model was trained, including learning methodologies, performance metrics, and techniques to prevent overfitting. It should also specify whether a pre-trained model was used and whether the model was calibrated.
* Description of model evaluation: how the model was tested, including the test data used, the evaluation methods, and performance metrics, as well as the uncertainty and limitations of the model. Test data must be independent of training data and applicable to the context of use.
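The four required descriptions could be tracked as a simple data structure during plan drafting. The `CredibilityAssessmentPlan` class and its field names below are hypothetical; the guidance prescribes the content of the plan, not any particular schema.

```python
from dataclasses import dataclass


@dataclass
class CredibilityAssessmentPlan:
    """Skeleton of the four descriptions the draft guidance asks for.

    Field names are illustrative assumptions, not an FDA schema.
    """
    model_description: str       # architecture, inputs/outputs, features, rationale
    data_description: str        # training/tuning data: collection, processing, fitness for COU
    training_description: str    # learning methodology, metrics, overfitting controls
    evaluation_description: str  # test-data independence, methods, uncertainty, limitations

    def is_complete(self) -> bool:
        # Minimal completeness check: every section must be filled in.
        return all([self.model_description, self.data_description,
                    self.training_description, self.evaluation_description])
```

A sponsor's internal tooling might use a check like `is_complete()` as a gate before the plan is discussed with the FDA, though the substance of each section, not its mere presence, is what the credibility assessment turns on.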

Step 5. Execute the Plan:

This step involves putting the credibility assessment plan into action. It’s often beneficial to discuss this plan with the FDA before execution.

Step 6. Document Results and Deviations:

This step involves documenting the results of the credibility assessment plan and any deviations from the plan in a report. The credibility assessment report provides the information necessary to establish the credibility of the AI model for its context of use.

Step 7. Determine the Adequacy of the AI Model:

Based on the results, it is determined whether the AI model is adequate for its intended context of use. If the model's credibility is not sufficiently established for the model risk, adjustments must be made. These can include incorporating additional evidence, increasing the rigor of the assessment, establishing risk controls, changing the modeling approach, or rejecting the model's context of use.

Special Considerations:

The guidance also emphasizes the importance of life cycle maintenance for AI models, particularly in manufacturing, where models may be continuously updated or affected by changes in their inputs. This calls for continuous monitoring and risk-based oversight throughout the model's life cycle. In addition, the guidance strongly recommends early engagement with the FDA to ensure the credibility assessment process aligns with regulatory expectations. Multiple engagement options are available, including formal meetings and programs tailored to AI uses in various fields.

Life Cycle Maintenance:

Life cycle maintenance ensures AI models remain fit for their Context of Use (COU) throughout a drug's life cycle by managing changes to the model, whether incidental or intentional. Since AI models can self-evolve or be sensitive to input changes, ongoing monitoring of performance metrics is critical. A risk-based approach helps assess the impact of changes, with adjustments like retraining or retesting as needed. In pharmaceutical manufacturing, changes must align with the quality system's change management process. Sponsors should provide detailed maintenance plans in marketing applications and may use ICH Q12 guidance tools for managing post-approval changes effectively.
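A minimal sketch of what the ongoing monitoring described above might look like, assuming a single rolling performance metric. The `needs_retraining` function, its baseline, and its tolerance are illustrative assumptions, not values or procedures from the guidance.

```python
def needs_retraining(recent_scores: list[float], baseline: float,
                     tolerance: float = 0.05) -> bool:
    """Flag the model for review or retraining when its rolling
    performance drifts below the validated baseline by more than
    the tolerance. All numbers here are hypothetical."""
    if not recent_scores:
        return False  # nothing to evaluate yet
    rolling_mean = sum(recent_scores) / len(recent_scores)
    return rolling_mean < baseline - tolerance
```

In a manufacturing context, a trigger like this would not act on its own: under the quality system's change management process, the flag would initiate a risk-based impact assessment, with retraining or retesting following as appropriate.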

Early Engagement:

The FDA encourages early engagement with sponsors to set expectations for credibility assessments of AI models based on their risk and Context of Use (COU). Early discussions can help address potential challenges and align on regulatory requirements. Sponsors can use various engagement options, including INitial Targeted Engagement for Regulatory Advice on CBER/CDER ProducTs (INTERACT) meetings or Pre-Investigational New Drug Application (Pre-IND) consultations. For AI models linked to specific development programs under an IND or pre-IND, sponsors should reference the relevant application number and notify the review team when requesting formal meetings, such as Initial Advisory Meetings, to facilitate collaboration and regulatory alignment.

Conclusion

The FDA’s risk-based credibility assessment framework is a crucial step towards ensuring the safe and effective use of AI in drug development. By following these steps, sponsors can establish the credibility of their AI models and promote trust in this rapidly evolving technology. This structured approach should ultimately lead to better regulatory decisions and more effective treatments for patients.