AI Systems Architect. I build inference infrastructure, agentic pipelines, and production ML stacks — not prototypes that crumble in prod.
Track record:
- Employee #1 at Refact.ai (ex-OpenAI founder). Scraped 80M code repos, co-built dataset for Refact-1.6B-fim (SOTA HumanEval 2022). Built enterprise LLM inference backend, RAG over AST+vectors, SWE-bench-compatible agent loop.
- Interim Head of AI (0→1): shipped full air-gapped ML stack in <6 months, covering training, inference, synthetic data pipelines, and BERTs/LoRAs fine-tuned on 70 labels and evaluated by production F1.
- Currently: sub-400ms voice-to-voice cascade (STT+VAD+LLM+TTS) on consumer GPU. Zero-dependency MCP server in pure Python.
I operate as Nautiloid Protocol LLC. Clean B2B engagement, no hiring overhead, no equity games.
Looking for: AI infrastructure contracts where output is judged by systems shipped, not meetings attended.
I am a senior AI engineer and systems architect, typically coming in as the first engineering hire to build infrastructure from scratch. I specialize in sovereign, local AI systems rather than just wrapping APIs. My recent work includes building data ingestion pipelines for 80 million repositories to train code models and engineering local voice-to-voice assistants that achieve sub-400ms latency on consumer hardware. I operate via a US entity under an MSA and do not require visa sponsorship.
Technologies: Python, Rust, vLLM, llama.cpp, ONNX, FastAPI, RAG, LoRA fine-tuning, MCP, gRPC
Résumé/Work: https://work.valerii.cc
Email: valerii[@]nautiloid.dev