Research Domains

🧠 Large Language Models

Architecture innovation, efficient training at scale, long-context modeling, and multi-modal integration for next-generation LLMs.

🛡️ AI Safety & Alignment

Constitutional AI, RLHF, interpretability, red-teaming, and developing robust frameworks to keep AI systems aligned with human intent.

👁️ Computer Vision

Object detection, scene understanding, generative imaging, video comprehension, and multi-modal visual reasoning systems.

🎯 Reinforcement Learning

Multi-agent systems, reward modeling, sim-to-real transfer, and training agents for complex real-world decision making.

💬 Natural Language Processing

Multilingual understanding, semantic parsing, dialogue systems, and information extraction across 100+ languages.

🤖 Autonomous Agents

Planning, tool use, world models, and building AI agents capable of sustained reasoning and action in open-ended environments.

Selected Papers
Feb 2026
LLM

Sparse Attention Architectures for Long-Context Language Modeling

A novel sparse attention mechanism enabling 128K token context windows with 3× lower memory overhead. Achieves new SOTA on SCROLLS, BookSum, and our internal LongEval benchmark.

Jan 2026
Safety

Constitutional Alignment via Reward-Weighted Self-Reflection

A framework where language models evaluate and improve their own outputs through constitutionally grounded reward signals. Reduces harmful outputs by 94% on HarmBench while maintaining helpfulness scores.

Dec 2025
Vision

Multi-Scale Feature Pyramids for Zero-Shot Object Detection

Hierarchical feature pyramid network that generalizes to unseen categories without fine-tuning. Outperforms GLIP and OWL-ViT on LVIS and Objects365 zero-shot benchmarks.

Oct 2025
RL

Cooperative Multi-Agent Learning with Emergent Communication

Demonstrating that RL agents develop structured communication protocols when incentivized to cooperate, with implications for scalable multi-agent AI systems.

Aug 2025
LLM

Efficient Knowledge Distillation for On-Device Language Models

A novel distillation pipeline that compresses 70B parameter models to 3B with less than 5% quality degradation, enabling powerful on-device inference.

Jun 2025
Safety

Red-Teaming at Scale: Automated Adversarial Evaluation of LLMs

An automated red-teaming framework that generates diverse adversarial probes and systematically evaluates model robustness across safety-critical domains.

Mar 2025
Vision

Temporal Consistency in Video Generation via Latent Diffusion

Achieving state-of-the-art temporal coherence in AI-generated video through a novel latent diffusion architecture with temporal attention layers.

Open Source Contributions
throlson/safeguard

SafeGuard Framework

Production-ready constitutional alignment toolkit with plug-and-play safety layers for any LLM deployment.

⭐ 4.2k   🔀 890
throlson/sparseformer

SparseFormer

Efficient sparse attention implementation for PyTorch enabling 128K+ token context windows with minimal memory overhead.

⭐ 3.1k   🔀 620
throlson/redteam-auto

RedTeam Auto

Automated adversarial evaluation suite for LLMs with 1000+ curated attack vectors across safety-critical domains.

⭐ 2.8k   🔀 540
Collaborate on Research

We actively seek academic and industry partnerships. Let's advance AI together.

Propose a Collaboration