LocalMind

A private AI research agent running Gemma entirely in your browser via WebGPU. Tool calling, persistent memory, and web search — all on-device.

Only your search queries touch the network (and only when you choose to). All reasoning stays on your device.

Models are cached after first download — future visits load instantly.

Requirements: Chrome 113+, Edge 113+, or Firefox 130+ with WebGPU.

v2.0.0 · Powered by Transformers.js + Google Gemma (Apache 2.0).

Available Models

  • Ternary Bonsai 1.7B (~470 MB, default) — text + agent (tool calling). Smallest download with tool calling. 1.58-bit ternary weights, Qwen3 backbone, Apache-2.0.
  • Ternary Bonsai 4B (~1.1 GB) — same capabilities, better quality.
  • Ternary Bonsai 8B (~2.2 GB) — best Bonsai quality, 65K context.
  • Gemma 3 1B (~760 MB) — text-only, no tool calling. Fallback option.
  • Gemma 4 E2B (~1.5 GB) — multimodal (image + audio) + agent.
  • Gemma 4 E4B (~4.9 GB) — multimodal + agent, best quality.

Agent Tools (Ternary Bonsai + Gemma 4)

  • calculate — math, percentages, conversions
  • get_current_time — date/time with timezone
  • store_memory — save facts to persistent memory
  • search_memory — recall from stored memories
  • web_search — search the web (requires API key)
  • fetch_page — read a web page’s content
  • set_reminder — browser notification after N minutes
  • list_memories — show what’s stored in memory
  • delete_memory — forget specific memories
  • segment_image — segment objects in attached images (SAM)

Image Segmentation (SAM)

Gemma 4 can call SAM (Segment Anything Model) to segment objects in attached images. Choose your SAM model in Settings — loaded on first use.

  • SlimSAM 50 (~10 MB) — fastest, good enough for most tasks
  • SlimSAM 77 (~14 MB) — default, better accuracy
  • SAM ViT-Base (~350 MB) — full quality, slower download
  • SAM 3 (latest) — newest architecture

Things to try

  • “Segment the main object in this image”
  • “Outline the person on the left”
  • “Isolate the background”
  • “How many distinct objects are in this image?”

Translation works directly — Gemma 4 speaks 140+ languages natively, no tool needed.

Gemma 3 1B works as a simple chatbot — no agent tools. Prefer Ternary Bonsai 1.7B if you want tool calling at a similar size.

Document Upload

  • Text — .txt, .md, .json, .csv
  • PDF — extracted via PDF.js, auto-summarized
  • DOCX — extracted via mammoth.js, auto-summarized
  • Folder — open a local folder to ingest all .md/.txt/.pdf files at once; re-open to sync only changed files (incremental)

Documents are chunked, embedded, and stored as searchable knowledge. A summary is generated on upload.
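A rough sketch of the chunking step (the 500-character size and 50-character overlap are illustrative assumptions, not LocalMind's documented parameters):

```javascript
// Split a document into overlapping character chunks for embedding.
// Overlap keeps sentences that straddle a boundary searchable from
// either neighboring chunk.
function chunkText(text, size = 500, overlap = 50) {
  const chunks = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```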

Multimodal (Gemma 4)

  • 📎 Attach — images, audio, MP4 video, or documents
  • 📷 Camera / 🎤 Mic / Paste / Drag & drop

Conversations

  • New Chat — archives to History + starts fresh
  • Clear — deletes without saving
  • History — sidebar, click to resume any past chat
  • Share — generate an encrypted or plain link to share any conversation; recipient opens the URL to load it

Memory browser

  • Category pills — filter by fact / preference / finding / document / conversation with live counts
  • Source grouping — document chunks grouped by file; bulk “Delete all” per source
  • Audit — flags stale (>60 days), near-duplicates (cosine sim ≥0.92), and outliers (low avg similarity to category); bulk or per-item delete; auto-reruns after each deletion
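The near-duplicate check above boils down to cosine similarity over embedding vectors. A generic sketch (not LocalMind's internals), using the ≥0.92 threshold from the audit:

```javascript
// Cosine similarity between two equal-length embedding vectors.
function cosineSimilarity(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]; // numerator: dot product
    na += a[i] * a[i];  // squared norm of a
    nb += b[i] * b[i];  // squared norm of b
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// The audit flags a memory pair as a near-duplicate at similarity >= 0.92.
const isNearDuplicate = (a, b) => cosineSimilarity(a, b) >= 0.92;
```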

Output & Export

  • Save as MD — download any response as Markdown (or write directly to open folder if one is active)
  • Code download — hover code blocks for download button
  • Export / Import — in Memory panel, full data as JSON
  • Auto-backup — toggle in Settings to download on New Chat

Batch Prompts

  • Enter one prompt per line in the Batch panel — they run sequentially through the full agent loop
  • {{previous}} — explicit placeholder substituted with the previous response text
  • Auto-inject — checkbox (on by default) appends the previous response as context even without a placeholder; disabled for any prompt that already contains {{previous}}
  • Stop — halts after the current generation finishes; progress shown live
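The substitution rules above can be sketched as a single function (the name `resolveBatchPrompt` and the context framing are illustrative, not LocalMind internals):

```javascript
// Resolve one batch prompt against the previous response, per the rules
// above: an explicit {{previous}} placeholder is substituted; otherwise,
// when auto-inject is on, the previous response is appended as context.
function resolveBatchPrompt(prompt, previous, autoInject = true) {
  if (prompt.includes("{{previous}}")) {
    // Auto-inject is disabled for prompts that carry the placeholder.
    return prompt.replaceAll("{{previous}}", previous ?? "");
  }
  if (autoInject && previous) {
    return `${prompt}\n\nContext (previous response):\n${previous}`;
  }
  return prompt;
}
```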

Other

  • Web Search — Settings → choose a provider (Tavily, Brave, or a self-hosted SearXNG instance) + API key → 🌐 button
  • Thinking Mode — see chain-of-thought (collapses when done)
  • Multi-step planning (experimental, Gemma 4 only) — Settings → tick the toggle. Each message is planned into 2–5 steps, each step executed (with tools), then synthesised. Plan + per-step outputs render as collapsible blocks below the answer. Slower (3×+ model calls) but handles research-style queries better.
  • Branch from here — right-click (or long-press) a user message → "Branch from here". Archives the current conversation, then forks a new one containing messages up to that point. Continue the new branch from that question.
  • Custom tools (agent-capable models) — Settings → Custom tools. Paste a tool definition as JSON (name, description, parameters, endpoint). On a tool call the model's args are POSTed to your endpoint as a JSON body; the response is fed back to the model. CORS must allow this origin, and tool names must match [a-zA-Z_][a-zA-Z0-9_]* without colliding with a built-in.
  • MCP servers (agent-capable models) — Settings → MCP servers. Paste a Streamable HTTP MCP endpoint URL (plus optional bearer). LocalMind opens a JSON-RPC 2.0 session, discovers tools via tools/list, and registers each with an mcp_ prefix so the agent loop can use them alongside built-ins. Connections are re-established on page load.
  • Math & diagrams — inline $\int x^2 dx$ and display $$\sum_{i=1}^n i$$ math render via KaTeX; ```mermaid blocks render as SVG via lazy-loaded Mermaid.
  • Artifact preview — ```html / ```svg / ```artifact code blocks get a live sandboxed iframe below the code (sandbox="allow-scripts", no same-origin). Safe to run model-generated UI inline.
  • Voice to text — 🗣 button left of the input records mic audio, decodes to 16 kHz mono PCM on-device, and runs Whisper-base via WebGPU to transcribe into the input. ~80 MB first-use download; nothing leaves the device.
  • Python code tool (agent-capable models) — model can call run_python to execute Python in a sandboxed Pyodide worker. numpy / pandas / matplotlib auto-install on import. ~10 MB first-use download.
  • Cache management — view/clear cached models in Settings
  • Custom models — Settings → paste a Hugging Face ONNX repo id (causal LMs only; the repo must keep its ONNX files under onnx/, and multimodal custom models are not yet supported). The validator probes the HF API, picks the best available quantisation, estimates real load size, and hard-blocks anything that exceeds the device’s WebGPU buffer limit or the 6 GB ceiling.
  • Response badges — On-device / Agent / Web-enriched
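For the Custom tools entry above, a definition might look like this. The field names (name, description, parameters, endpoint) follow the description; the `get_weather` tool and its endpoint URL are hypothetical:

```json
{
  "name": "get_weather",
  "description": "Current weather for a city",
  "parameters": {
    "type": "object",
    "properties": { "city": { "type": "string" } },
    "required": ["city"]
  },
  "endpoint": "https://example.com/weather"
}
```

When the model calls the tool, its arguments (e.g. `{"city": "Tokyo"}`) are POSTed to the endpoint, and whatever JSON comes back is handed to the model as the tool result.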
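For the MCP servers entry above, tool discovery is a plain JSON-RPC 2.0 call (shown generically; session setup details vary by server):

```json
{ "jsonrpc": "2.0", "id": 1, "method": "tools/list" }
```

The server replies with a `result.tools` array, and each discovered tool is registered under its name with the `mcp_` prefix.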

JavaScript API (experimental)

Settings → tick Expose window.localmind. Same-tab only — cross-origin scripts cannot reach it. The object is frozen and non-writable; disable the toggle to detach.

Surface (v1.0)

  • version · ready · model — live state getters
  • listModels() — full registry incl. custom models with loaded flag
  • load(idOrKey) — loads a model (short key or HF id); resolves when ready
  • chat.completions.create({ messages, max_tokens, temperature, top_p, model }) — non-streaming, returns OpenAI-shaped chat.completion
  • chat.completions.create({ …, stream: true }) — async iterator yielding chat.completion.chunk objects
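A minimal streaming sketch against the surface above. The chunk shape (`choices[0].delta.content`) is assumed from the stated OpenAI-shaped `chat.completion.chunk` objects; the helper name is illustrative:

```javascript
// Collect streamed chat.completion.chunk objects into one string.
// Works with any async iterable in the OpenAI chunk shape, including
// the iterator assumed to be returned by stream: true above.
async function collectStream(chunks) {
  let text = "";
  for await (const chunk of chunks) {
    text += chunk.choices?.[0]?.delta?.content ?? "";
  }
  return text;
}

// Usage (in the LocalMind tab, with the toggle enabled):
// const stream = await window.localmind.chat.completions.create({
//   messages: [{ role: "user", content: "Hi" }],
//   stream: true,
// });
// console.log(await collectStream(stream));
```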

Not exposed

  • Tools / tool calling
  • Memory read/write
  • File system, web search, search API keys, user profile
  • Multimodal input

Activity log

Every API call is logged in-memory (last 50). Click the • API chip in the toolbar or Settings → View activity log. Each call shows method, prompt length, tokens generated, duration, and outcome (ok / err / busy).

Demo

Open demo.html in the same folder. It iframes LocalMind, auto-flips the toggle, waits for the model, and runs both a non-streaming and a streaming completion against iframe.contentWindow.localmind.

Experimental — the shape may change before a stable v1.1.

Math & Conversions

  • What is 15% of 2450?
  • Convert 72 Fahrenheit to Celsius
  • Compound interest: $10K at 7% for 5 years?

Time & Reminders

  • What time is it in Tokyo?
  • Remind me in 5 minutes to check the oven

Memory

  • Remember: I'm a software engineer on Dashboard Pro
  • What do you know about me?
  • Forget my preferences

Translation

  • Translate "Good morning" to Japanese, French, Hindi
  • Train station directions in Spanish & German

Writing & Analysis

  • Write a polite meeting decline email
  • Microservices vs monolith: pros and cons
  • Explain WebGPU in 3 simple sentences

Documents (attach a PDF, DOCX, or text file)

  • Summarize the uploaded document
  • Main conclusions from my document

Multimodal (attach an image first)

  • Describe this image in detail
  • Transcribe text from this image

Web Research (requires API key)

  • Top tech news today
  • Latest WebGPU browser support status
  • AI in the browser: recent articles with sources

Coding

  • Sieve of Eratosthenes in Python
  • async/await vs Promises explained

Math & Diagrams

  • Derive the quadratic formula with LaTeX
  • Pythagorean identity with display math
  • Mermaid: login + JWT flow
  • Mermaid: producer-consumer sequence

Live HTML / SVG Artifacts

  • Interactive counter (HTML artifact)
  • Pomodoro timer (HTML artifact)
  • Sunrise scene (SVG artifact)

Python (Gemma 4, run_python tool)

  • First 20 Fibonacci via Python
  • Pandas synthetic data + correlation
  • Solve a 3x3 linear system

Multi-step planning (Gemma 4, Settings toggle)

  • Compare 3 coffee brewing methods
  • Silk Road in 3 phases

MCP tools (after adding a server in Settings)

  • List all MCP tools I have
  • Use an MCP fetch tool

Voice to text (no prompt needed)

Click the 🗣 button to the left of the input, speak a sentence, click again to stop. Whisper runs on-device. Try saying: “Write a short summary of the Great Pyramid of Giza in three sentences.”

Tips
  • For artifact and Mermaid prompts, start with “Output only this…” or “Copy this exactly…” — smaller models tend to wrap code blocks in prose otherwise.
  • If a ```mermaid block renders as plain code, the model stripped the language label. Re-send with “Preserve the language tag after the backticks”.
  • Click any prompt above to paste it.
  • Web search needs a provider in Settings.
  • Multimodal needs a Gemma 4 model + an attached image.
  • Artifact, math, and Mermaid rendering work on any model; run_python, MCP tools, and multi-step planning require an agent-capable (Gemma 4) model.