arcasidian.xyz
Sovereign AI
infrastructure,
built in Rust.
We build open-source AI infrastructure with a singular focus: systems that run fast and stay running. No garbage collector. No Python overhead. No fragile ML stacks.
I
LLM inference
Low-latency inference servers for GGUF models. p99 latency under 200 ms. Cold start under 1.5 s.
II
RAG pipelines
Domain-specific retrieval with LaTeX preservation, pgvector indexing, and automated dataset generation.
III
Small language models
QLoRA fine-tuning and GGUF quantization for 1B–7B models. Training pipelines built for single-GPU setups.
Porto Alegre, Brazil · Est. 2026
Get in touch
Working on something
that needs to run fast?
Available for LLM infrastructure consulting, inference server development, and RAG pipeline architecture. Tell us what you're building.
Received. Talk soon.
We'll be in touch within a day or two. In the meantime, the code is on GitHub.