LLM SecurityAgentic AIAdversarial ML

Agentic LLM Security

As large language models are deployed as autonomous agents that invoke tools, browse the web, and coordinate with other agents, they introduce a new class of attack surfaces spanning prompt injection, tool abuse, memory poisoning, and multi-agent collusion. This project develops threat models, detection mechanisms, and runtime defenses that make agentic LLM systems robust, auditable, and trustworthy in high-stakes deployments.

Research Objectives

Characterize attack surfaces unique to tool-using, memory-augmented LLM agents
Build runtime monitors that detect prompt injection and anomalous tool invocations
Design policy-enforcing guardrails for multi-agent communication and task delegation
Develop benchmarks and red-teaming frameworks for evaluating agentic LLM robustness

Methods & Techniques

Adversarial prompt generation and automated red-teaming pipelines
Information-flow tracking across agent memory, tools, and external content
Constrained decoding and policy-based action filtering for safe tool use
Reinforcement learning from safety feedback (RLSF) for defensive agent training

Interested in this research?

Get in touch to discuss collaboration or graduate opportunities.

Contact Dr. Hossain