Academic Research

Research & Publications


Advancing the frontiers of AI safety, neuroscience-inspired computing, and interpretable machine learning through rigorous academic research and collaboration.
47 Publications
1842 Citations
18 h-index
$3.2M Research Grants
12 PhD Students
35 Collaborators
JOURNAL
2024

Adversarial Robustness in Large Language Models: A Comprehensive Survey

Roberts, G., Smith, J., Chen, L.
Journal of AI Safety Research
This paper presents a comprehensive survey of adversarial robustness techniques in large language models, examining both attack vectors and defense mechanisms.
AI Safety
LLMs
Adversarial ML
Survey
42 citations
PDF
Code
DOI
CONFERENCE
2024

Neuroscience-Inspired Architectures for Interpretable AI

Roberts, G., Johnson, M.
NeurIPS 2024
We propose novel neural architectures inspired by biological neural circuits that provide inherent interpretability while maintaining competitive performance.
Neuroscience
Interpretable AI
Neural Architecture
28 citations
PDF
DOI
CONFERENCE
2023

Formal Verification of Neural Network Safety Properties

Roberts, G., Anderson, K., Liu, W.
ICML 2023
A novel framework for formally verifying safety properties in deep neural networks using abstract interpretation and SMT solvers.
Formal Methods
AI Safety
Verification
67 citations
PDF
Code
DOI
PREPRINT
2024

Emergent Communication in Multi-Agent Reinforcement Learning

Roberts, G., Park, S., Zhang, Y.
arXiv preprint
Investigation of emergent communication protocols in multi-agent systems trained with reinforcement learning, revealing surprising linguistic structures.
Multi-Agent RL
Emergent Communication
Language

Collaborate With Me

I'm always interested in collaborating on research projects at the intersection of AI safety, neuroscience, and interpretable machine learning.
© 2025 /gareth/ All rights reserved