Python
Ollama
Android
WireGuard
HMAC
Private LLM serving stack from server to mobile:
- GPU inference — Ollama-based model serving with automated scheduling
- Authenticated API — HMAC authentication over WireGuard VPN
- Mobile client — Android app with certificate pinning and QR-code provisioning
- Benchmarking — Automated HumanEval, MMLU, GSM8K evaluation across models
- GPU comparison — Speed benchmarking across GPU generations
- Model discovery — Automated ranking and selection of new models