Professor earns NSF CAREER Award to defend AI models from attackers
https://www.rit.edu/news/professor-earns-nsf-career-award-defend-ai-models-attackers
Publish Date: 2026-05-18 10:15:00
Source Domain: www.rit.edu
-
Artificial Intelligence Threats: AI is increasingly used in critical systems, which makes machine learning models targets for attackers who try to embed vulnerabilities or exploits in them.
-
Zhao’s Research Mission: Assistant Professor Weijie Zhao aims to ensure that machine learning models contain no hidden vulnerabilities, a need that becomes more pressing as the use of AI grows.
-
Challenge of Interpretability: The inherent complexity and “black box” nature of machine learning make it difficult to understand how decisions are made, raising concerns about false information and manipulated outputs.
-
CAREER Award and Project Focus: Zhao received a National Science Foundation CAREER Award to develop methods for securing machine learning models; his five-year project is titled “Defending Machine Learning Models from Adversarial Threats via Unified Interpretability and Attribution.”
-
Transparency and Safety Goals: The project aims to enhance safety, resilience, and accountability in machine learning systems by making them more understandable and trustworthy.
-
Techniques and Strategies: Zhao’s research involves identifying harmful outputs caused by adversarial inputs, designing strategies to correct them without fully retraining the model, auditing training data, and creating tools for remediating vulnerabilities.
-
Future Implications: Zhao hopes the defense framework developed at RIT will be used to build resilient, transparent, and trustworthy machine learning tools.
-
NSF CAREER Award Recognition: The NSF CAREER program supports junior faculty who integrate educational activities with impactful research, and Zhao’s award reflects those goals.