A 100% private AI voice assistant that lives on your computer (works offline). Talk naturally as if Jarvis is a third person in the room — say its name anywhere in your sentence and get conversational, context-aware responses. It remembers everything, always knows the current location and time, can search the web, read your screen, control Chrome, track nutrition, and much more with support for unlimited MCPs and tools without context rot. Sensitive info is automatically redacted before anything is saved to disk.
🔒 100% local processing. No subscriptions. No data harvesting. Automatic redaction of sensitive info. Free offline dictation included.
🔒 Your data stays yours - 100% local AI processing. No cloud, no subscriptions, no data harvesting. Automatic redaction of sensitive info. This is non-negotiable.
🗣️ A third person in the room - Unlike voice assistants that only respond to rigid commands, Jarvis understands conversations. It maintains a short temporary rolling context of what's being discussed, so when you ask "Jarvis, what do you think?" it knows exactly what you're talking about. Have it chime into discussions with friends, help debug code while you talk through problems, or weigh in on decisions.
🧠 Never forgets - Unlimited memory across conversations. Adapts tone naturally to the topic. Learns your preferences over time.
🎙️ Free dictation - Hold a hotkey, speak, release — your words appear in any app as text. Like WisprFlow, but free, offline, and private. No subscription, no cloud transcription.
🔌 Extensible - MCP integration connects Jarvis to thousands of tools: smart home, GitHub, Slack, databases, and more. Smart tool selection means adding more tools won't slow things down.
📊 Transparent progress - We track what works (and what doesn't) with automated evals. See current accuracy →
🚧 Known limitations: Jarvis is under active development. Primary development happens on macOS. Windows/Linux support may lag behind. We're building in the open, issues and contributions welcome!
- Voice-only for now—no text chat interface yet (#35)
- No mobile apps (#17)
- "Stop" commands during speech sometimes get filtered as echo (#24)
- Dictation is not available on macOS 26+ (Tahoe) due to a pynput incompatibility (#172)
See it in action (example conversations)
Chiming into conversations (the magic moment):
👤 Alice: I wonder what the weather will be like tomorrow
👤 Bob: Yeah, we should check before planning the picnic
👤 Alice: Jarvis, what do you think?
📝 Heard: "What do you think Jarvis?"
🧠 Intent (wake word): directed → "what do you think about the weather for the picnic"
✨ Working on it: what do you think about the weather for the picnic
🧰 Tool: getWeather…
💬 Generating response...
🤖 Jarvis
Tomorrow looks great for a picnic! Sunny with highs around 22°C...
Jarvis understood the entire conversation and gave a contextual answer — no need to repeat the question.
Browser automation via MCP:
📝 Heard: "Open YouTube Jarvis."
🧠 Intent (wake word): directed → "open YouTube"
✨ Working on it: open YouTube
💬 Generating response...
🤖 Jarvis
I have opened YouTube for you.
Health-aware advice:
📝 Heard: "Should I order pizza tonight considering my health goals Jarvis?"
🧠 Intent (wake word): directed → "should I order pizza tonight considering my health goals"
✨ Working on it: should I order pizza tonight considering my health goals
🧰 Tool: fetchMeals…
💬 Generating response...
🤖 Jarvis
Looking at your week — you mentioned wanting to lose 5kg by June...
Today you've had about 860 calories so far. Given your 1,800 calorie target,
you've got room for pizza! But maybe consider thin crust to save calories.
Personalised news search:
📝 Heard: "What are some news from today that might interest me Jarvis?"
🧠 Intent (wake word): directed → "what are some news from today that might interest me"
✨ Working on it: what are some news from today that might interest me
🧰 Tool: searchMemory…
🧰 Tool: webSearch…
💬 Generating response...
🤖 Jarvis
Here's a quick snapshot of today's headlines...
Natural wake word placement (say "Jarvis" anywhere):
📝 Heard: "Give me a random topic to discuss Jarvis."
🧠 Intent (wake word): directed → "give me a random topic to discuss"
✨ Working on it: give me a random topic to discuss
💬 Generating response...
🤖 Jarvis
How about the Fermi Paradox? Given the vast number of stars...
Echo detection (Jarvis ignores its own speech):
🤖 Jarvis
I have opened YouTube for you.
👂 Listening for follow-up (3s)...
📝 Heard: "I have opened YouTube for you."
🔇 Heard (echo): "i have opened youtube for you."
💤 Returning to wake word mode
| Platform | Requirement |
|---|---|
| All | Ollama |
| Windows | Visual C++ Redistributable (most PCs already have this) |
Get the latest from GitHub Releases:
| Platform | Download | Run |
|---|---|---|
| Windows | Jarvis-Windows-x64.zip |
Extract → Run Jarvis.exe |
| macOS | Jarvis-macOS-arm64.zip |
Extract → Move to Applications → Right-click → Open |
| Linux | Jarvis-Linux-x64.tar.gz |
tar -xzf → Run ./Jarvis/Jarvis |
Jarvis starts listening automatically — just say "Jarvis" and talk!
- Conversational Awareness - Understands ongoing discussions. Ask "Jarvis, what do you think?" and it knows what you're talking about. Works naturally in multi-person conversations.
- Unlimited Memory - Never forgets. Searches across all your conversation history. Memory Viewer GUI included.
- Adaptive Tone - Automatically surgical for code, pragmatic for business, encouraging for wellbeing — no manual mode switching
- Smart Tool Selection - Embedding-based relevance filtering picks only the tools needed per query — add unlimited MCP tools without performance degradation
- Built-in Tools - Screenshot OCR, web search (with auto-fetch), weather, file access, nutrition tracking, location awareness
- Natural Voice - Say "Jarvis" anywhere in your sentence, interrupt with "stop", follow up without repeating the wake word
- Dictation Mode - Free, offline alternative to WisprFlow — hold a hotkey, speak, release to paste text into any app
- MCP Integration - Connect to thousands of external tools (Home Assistant, GitHub, Slack, etc.)
| Hardware | VRAM | Model |
|---|---|---|
| Most users | 8GB+ | gemma4:e2b (default) |
| Better quality | 16GB+ | gemma4:e4b |
| High-end | 24GB+ | gpt-oss:20b |
Note: VRAM requirements include the intent judge model (
gemma4:e2b) which is always loaded alongside the chat model for voice intent classification. The default model shares this, so no extra VRAM is needed.
The setup wizard will guide you through model selection and installation on first launch.
Most users won't need to change anything. Open ⚙️ Settings from the tray menu to configure Jarvis through a graphical interface — no JSON editing required. Settings are saved to ~/.config/jarvis/config.json.
Speech Recognition (Whisper)
- Multilingual (default, 99 languages):
"whisper_model": "medium" - English Only (slightly better English accuracy):
"whisper_model": "medium.en"
| Model | English | Multilingual | Download | VRAM | Speed |
|---|---|---|---|---|---|
| Tiny | tiny.en |
tiny |
~75 MB | ~1 GB | ~10x |
| Base | base.en |
base |
~140 MB | ~1 GB | ~7x |
| Small | small.en |
small |
~465 MB | ~2 GB | ~4x |
| Medium | medium.en |
medium |
~1.5 GB | ~5 GB | ~2x |
| Large V3 Turbo | - | large-v3-turbo |
~1.5 GB | ~6 GB | ~8x |
Speed is relative to the original large model. Source
If you have an NVIDIA GPU, Jarvis can use CUDA for much faster speech recognition. The Windows installer offers an optional CUDA download during setup. For development:
pip install nvidia-cublas-cu12 nvidia-cudnn-cu12CUDA is detected automatically — no configuration needed.
Voice Interface (Advanced)
LLM Intent Judge - Jarvis uses gemma4:e2b for intelligent voice intent classification (echo detection, query extraction, stop commands). This model is automatically installed alongside your chosen chat model during setup. The intent judge cannot be disabled but gracefully falls back to simpler text matching if Ollama is unavailable.
Hold a hotkey to record speech, release to paste the transcription into any app. Works everywhere — your editor, browser, chat, terminal. Completely local, completely free.
| Platform | Default hotkey |
|---|---|
| Windows | Ctrl + Win |
| macOS | Ctrl + Option |
| Linux | Ctrl + Alt |
- 🔒 100% offline — your speech never leaves your machine (unlike cloud dictation services)
- 🧠 Shared Whisper model — uses the same speech recognition as voice input, no extra memory
- ⚡ Zero latency startup — no server round-trip, transcription starts the moment you release
- 📋 Universal paste — works in any app that accepts
Ctrl+V/Cmd+V - 🔇 Non-intrusive — main voice listener pauses automatically during dictation
- ✋ Hands-free mode — double-tap the hotkey to keep recording without holding; press again or hit Escape to stop
- 🧹 Filler word removal — optional LLM-powered cleanup removes "um", "uh", "like", "you know" while preserving meaning
- 📖 Custom dictionary — define
"wrong -> right"replacements for jargon, names, and technical terms - 📜 History window — browse, copy, or delete past dictations from the system tray
- 🎛️ Easy setup — configure dictation during the setup wizard or anytime in Settings (hotkey dropdown, filler removal toggle, custom dictionary editor)
Customise the hotkey in Settings or config.json:
{
"dictation_hotkey": "ctrl+alt",
"dictation_filler_removal": true,
"dictation_custom_dictionary": [
"jarvis -> Jarvis",
"pytorch -> PyTorch"
]
}Note: macOS requires Accessibility permissions for the global hotkey. Linux requires X11 (limited Wayland support).
Text-to-Speech
Piper TTS (default) - Neural TTS that auto-downloads on first use (~60MB):
- Works out of the box - no setup required
- High-quality British English male voice (en_GB-alan-medium)
- Fast local synthesis with exact duration tracking
To use different Piper voices, download from HuggingFace and set:
{
"tts_piper_model_path": "~/.local/share/jarvis/models/piper/en_GB-alan-medium.onnx"
}Chatterbox - AI voice with emotion control (requires running from source):
{ "tts_engine": "chatterbox" }Voice cloning with Chatterbox - add a 3-10 second .wav sample:
{
"tts_engine": "chatterbox",
"tts_chatterbox_audio_prompt": "/path/to/voice.wav"
}Location Detection
Jarvis can provide location-aware responses (weather, local time, etc.) using a local GeoLite2 database — no cloud geolocation services are used.
IP detection chain (in order of preference):
- Manual IP — configure
location_ip_addressin settings - UPnP — queries your local router (no traffic leaves LAN)
- Socket heuristic — determines which interface routes externally (no data sent)
- OpenDNS DNS query — single
myip.opendns.comlookup to208.67.222.222(only external query)
If your ISP uses carrier-grade NAT (CGNAT), Jarvis automatically resolves your true public IP via the same OpenDNS DNS query. This can be disabled:
{
"location_cgnat_resolve_public_ip": false
}Setup: Register for a free MaxMind GeoLite2 account, download the City database (MMDB format), and save it to ~/.local/share/jarvis/geoip/GeoLite2-City.mmdb. The setup wizard will guide you through this.
MCP Tool Integration
Connect Jarvis to external tools via MCP servers:
{
"mcps": {
"github": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-github"],
"env": { "GITHUB_TOKEN": "your-token" }
}
}
}Popular integrations:
- Home Assistant - Voice control for smart home
- Google Workspace - Gmail, Calendar, Drive, Docs
- GitHub - Issues, PRs, workflows
- Notion - Knowledge management
- Slack/Discord - Team communication
- Databases - MySQL, PostgreSQL, MongoDB
- Composio - 500+ apps in one integration
See full MCP setup guide below.
Home Assistant - Smart home voice control
- Add MCP Server integration in Home Assistant (Settings → Devices & services)
- Expose entities you want to control (Settings → Voice assistants → Exposed entities)
- Create Long-lived Access Token (Profile → Security → Create token)
- Install proxy:
uv tool install git+https://github.com/sparfenyuk/mcp-proxy - Add to config:
{
"mcps": {
"home_assistant": {
"command": "mcp-proxy",
"args": ["http://localhost:8123/mcp_server/sse"],
"env": { "API_ACCESS_TOKEN": "YOUR_TOKEN" }
}
}
}"Jarvis, turn on the living room lights" / "set bedroom to 72°" / "run good night scene"
Google Workspace - Gmail, Calendar, Drive, Docs, Sheets
{
"mcps": {
"google_workspace": {
"command": "npx",
"args": ["-y", "google-workspace-mcp"],
"env": {
"GOOGLE_CLIENT_ID": "your-client-id",
"GOOGLE_CLIENT_SECRET": "your-client-secret"
}
}
}
}GitHub - Repos, issues, PRs, workflows
{
"mcps": {
"github": {
"command": "npx",
"args": ["-y", "@modelcontextprotocol/server-github"],
"env": { "GITHUB_TOKEN": "your-token" }
}
}
}Notion, Slack, Discord, Databases
Notion:
{ "mcps": { "notion": { "command": "npx", "args": ["-y", "@makenotion/mcp-server-notion"], "env": { "NOTION_API_KEY": "your-token" } } } }Slack:
{ "mcps": { "slack": { "command": "npx", "args": ["-y", "slack-mcp-server"], "env": { "SLACK_BOT_TOKEN": "xoxb-...", "SLACK_USER_TOKEN": "xoxp-..." } } } }Discord:
{ "mcps": { "discord": { "command": "npx", "args": ["-y", "discord-mcp-server"], "env": { "DISCORD_BOT_TOKEN": "your-token" } } } }Databases: bytebase/dbhub (SQL), mongodb-mcp-server (MongoDB)
Composio - 500+ apps in one integration
{
"mcps": {
"composio": {
"command": "npx",
"args": ["-y", "@composiohq/rube"],
"env": { "COMPOSIO_API_KEY": "your-key" }
}
}
}Get API key at composio.dev
Common issues
Jarvis doesn't hear me - Check microphone permissions, speak clearly after "Jarvis"
Responses are slow - Ensure you have enough VRAM (8GB+ for default model; see System Requirements for other models)
Windows: App won't start - Extract full zip first, check Windows Defender
macOS: "App can't be opened" - Right-click → Open, or System Settings → Privacy & Security → Allow
Linux: No tray icon - sudo apt install libayatana-appindicator3-1
Running from source
git clone https://github.com/isair/jarvis.git
cd jarvis
# macOS
bash scripts/run_macos.sh
# Windows (with Micromamba)
pwsh -ExecutionPolicy Bypass -File scripts\run_windows.ps1
# Linux
bash scripts/run_linux.shRunning from source enables Chatterbox TTS (AI voice with emotion/cloning). Piper TTS works in both bundled and source modes.
Privacy hardening (stay 100% offline)
{
"web_search_enabled": false,
"mcps": {},
"location_auto_detect": false,
"location_cgnat_resolve_public_ip": false,
"location_enabled": false
}Verify: sudo lsof -i -n -P | grep jarvis (should only show 127.0.0.1 to Ollama)
- 100% offline - No cloud services required
- Auto-redaction - Emails, tokens, passwords automatically removed
- Local storage - Everything in
~/.local/share/jarvis
- Personal use: Free forever
- Commercial use: Contact us












