On-Device AI Auto-Categorization
Two on-device models — Qwen3 1.7B for chat and a 768-dim Nomic embedding model — categorize transactions, suggest tags, and learn from your corrections. Your statements never leave the phone.
Why on-device AI is the only AI that protects your statements
Many cloud “AI” budgeting apps stream every merchant string to a remote LLM, which can mean a third party such as OpenAI sees your supermarket habits. Budgie loads both models once, then keeps every inference local: same accuracy, zero data exfiltration.
The two-stage pipeline runs an embedding lookup first for instant nearest-neighbor categorization from your own history, then falls back to Qwen3 1.7B for novel transactions that have no close match in the embedding index. Every accepted or edited suggestion updates the 768-dim index immediately, so accuracy improves the more you use it.
Two-stage categorization flow
Embedding lookup — 768-dim Nomic model finds the nearest historical transaction instantly via sqlite-vec SIMD search
LLM fallback — Qwen3 1.7B Q4 handles novel transactions the embedding index has not seen, proposing category and tags from context
Correction loop — every accepted or edited suggestion updates the embedding index immediately so the next similar transaction lands closer without re-training
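The three steps above can be sketched in a few lines of Python. This is an illustrative sketch only, not Budgie's actual code: the `Categorizer` class, the 0.85 similarity threshold, the letter-count `toy_embed` function, and the stubbed LLM fallback are all assumptions, and a brute-force scan stands in for the sqlite-vec search.

```python
import math
from dataclasses import dataclass, field
from typing import Callable


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def toy_embed(text: str) -> list[float]:
    """Hypothetical stand-in for the 768-dim model: a 26-dim letter count."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


@dataclass
class Categorizer:
    embed: Callable[[str], list[float]]
    llm_fallback: Callable[[str], str]
    threshold: float = 0.85  # assumed cutoff for a "strong" neighbor
    index: list[tuple[list[float], str]] = field(default_factory=list)

    def suggest(self, merchant: str) -> tuple[str, str]:
        """Return (category, source), where source is 'embedding' or 'llm'."""
        vec = self.embed(merchant)
        best_score, best_category = 0.0, None
        for stored_vec, category in self.index:  # stage 1: nearest neighbor
            score = cosine(vec, stored_vec)
            if score > best_score:
                best_score, best_category = score, category
        if best_category is not None and best_score >= self.threshold:
            return best_category, "embedding"
        return self.llm_fallback(merchant), "llm"  # stage 2: generative fallback

    def record(self, merchant: str, category: str) -> None:
        """Correction loop: store the confirmed label, no re-training needed."""
        self.index.append((self.embed(merchant), category))


categorizer = Categorizer(embed=toy_embed, llm_fallback=lambda m: "Uncategorized")
categorizer.record("TESCO STORES 1234", "Groceries")
first = categorizer.suggest("TESCO STORES 5678")
second = categorizer.suggest("NETFLIX.COM")
```

In this sketch a repeat merchant resolves from the index, while an unfamiliar one falls through to the fallback. Calling `record` with the confirmed category is what makes the next similar transaction resolve in stage one.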
What you get
Qwen3 1.7B Q4 model runs entirely on your phone after a one-time download
768-dim Nomic embedding model + sqlite-vec for SIMD-accelerated similarity search
Two complementary signals: vector lookup over your history plus a generative LLM suggestion
Every correction updates the embedding index instantly — accuracy improves as you use it
Statements never leave the device — no OpenAI, no remote inference, ever
How it works
On first run, Budgie downloads Qwen3 1.7B Q4 and a 768-dim Nomic embedding model from the Hugging Face hub. Both are stored in your app sandbox. For each new transaction, the embedding model runs first — if a strong nearest neighbor exists in your history, the result is instant. If not, Qwen3 1.7B generates a suggestion. Your response (accept, edit, or reject) feeds back into the embedding index without any network call.
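As a rough illustration of the feedback step, the sketch below maps each response to a change in a local, in-memory index. The names `Response` and `apply_feedback` are hypothetical, and the handling of a rejection is an assumption: the text does not say what a rejection records, so this sketch stores nothing for it.

```python
from enum import Enum
from typing import Optional


class Response(Enum):
    ACCEPT = "accept"
    EDIT = "edit"
    REJECT = "reject"


def apply_feedback(
    index: list[tuple[list[float], str]],
    vector: list[float],
    suggested: str,
    response: Response,
    edited: Optional[str] = None,
) -> Optional[str]:
    """Update the local index from a user response; returns the stored label."""
    if response is Response.ACCEPT:
        label = suggested
    elif response is Response.EDIT:
        if not edited:
            raise ValueError("an edit needs the corrected category")
        label = edited
    else:
        # Assumption: a rejection adds no labeled example to the index.
        return None
    index.append((list(vector), label))  # purely local, no network call
    return label


index: list[tuple[list[float], str]] = []
accepted = apply_feedback(index, [0.1, 0.2], "Groceries", Response.ACCEPT)
corrected = apply_feedback(index, [0.3, 0.4], "Groceries", Response.EDIT, edited="Dining")
rejected = apply_feedback(index, [0.5, 0.6], "Travel", Response.REJECT)
```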
Frequently Asked Questions
Does the AI work offline?
How big is the model download?
Can I correct the AI's suggestions?
Does Budgie use OpenAI or any cloud LLM?
Ready to take Budgie for a spin?
Join the waitlist — be first to try the offline-first expense tracker.