Projects — Eclipse Cyber Nexus

vLLM, ROCm, Docker, AMD GPU

Deployed a 4-node inference cluster serving 12 concurrent LLMs with sub-200ms first-token latency on AMD Instinct GPUs using ROCm 6.0.

Ansible, Docker Compose, Systemd

Built a self-service provisioning platform enabling 80+ engineers to spin up standardized dev environments in under 90 seconds.

GitHub Actions, PyTorch, ROCm

Designed a CI/CD pipeline with automated GPU workload testing, model benchmarking, and performance regression detection.
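
As an illustration of what a performance regression check in such a pipeline might look like, here is a minimal sketch. It assumes latency samples in milliseconds from a baseline run and a candidate run, and it compares their medians against a tolerance. The function name `detect_regression` and the 10% default are hypothetical and not taken from the project itself.

```python
from statistics import median


def detect_regression(baseline_ms, candidate_ms, tolerance=0.10):
    """Return True if the candidate's median latency exceeds the
    baseline's median by more than `tolerance` (a fraction, e.g. 0.10).

    Hypothetical helper: the real pipeline's metric and threshold
    are not described in the source.
    """
    base = median(baseline_ms)
    cand = median(candidate_ms)
    return cand > base * (1 + tolerance)


if __name__ == "__main__":
    baseline = [100, 102, 98]
    slower = [120, 125, 118]
    # A CI job could exit non-zero here to fail the build.
    print("regression detected:", detect_regression(baseline, slower))
```

Comparing medians rather than means keeps a single slow outlier from failing a build. A real benchmark gate would likely also need repeated runs and a warm-up phase.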

Ollama, Nginx, Rate Limiting

Architected a self-hosted API gateway routing requests across multiple local models with authentication, rate limiting, and usage analytics.
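
To make the rate-limiting idea concrete, the sketch below shows a token bucket, one common way to cap request rates per client. It is an illustration only: the gateway described above lists Nginx, which would normally handle this with its own `limit_req` directives, and the class name `TokenBucket` and its parameters are hypothetical.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity` requests, refilled at `rate` tokens/second.

    Hypothetical illustration, not the gateway's actual implementation.
    """

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self.clock = clock             # injectable so the limiter can be tested
        self.last = clock()

    def allow(self, cost=1.0):
        """Return True and consume `cost` tokens if the request may proceed."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=1, capacity=2)
    print([bucket.allow() for _ in range(3)])
```

In a gateway, one bucket would typically be kept per API key, so a single noisy client cannot exhaust capacity shared with others.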

Prometheus, Grafana, Alertmanager

Deployed a full observability stack with custom exporters for GPU metrics, model latency, and queue depth across a distributed fleet.

Terraform, Bash, Systemd

Automated backup, failover, and recovery workflows for a multi-site infrastructure with RTO under 15 minutes.