Model Downloads

Recall uses AI models for semantic search and text generation. The Full install ships with the core models bundled — you can use Recall immediately without downloading anything. Other models are optional downloads you can add later.

UI-Only installs don't manage models locally; the server you connect to supplies them.

Bundled models (Full install)

Role       Model                  Size      Notes
Embedder   ModernBERT             ~570 MB   Legal-domain fine-tuned. Produces 768-dimensional vectors.
Generator  Llama 3.1 8B Instruct  ~4.7 GB   Used by Clerk and parenthetical generation.

These load automatically on first launch. No action required.

Optional downloads

Role       Model                  Size      Why download
Embedder   E5-Large-Instruct      ~1.4 GB   General-purpose multilingual embeddings. Useful if your documents aren't primarily English legal text.
Embedder   all-MiniLM-L6-v2       ~90 MB    Smallest/fastest. Lower quality; good for very large matters on constrained hardware.
Generator  Phi-3.5 Mini Instruct  ~3.8 GB   Smaller alternative to Llama 3.1. Sometimes produces tighter output.

Optional models download on demand from Hugging Face. The first download can take several minutes over a decent connection.
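As a rough guide to how long a download might take, the arithmetic can be sketched as below. The sizes are the approximate figures from the table above; the 100 Mbps link speed and the function name are illustrative assumptions, and real transfers will be slower than this ideal figure.

```python
def estimate_download_minutes(size_gb: float, mbps: float) -> float:
    """Ideal transfer time in minutes for size_gb gigabytes at mbps megabits per second."""
    if mbps <= 0:
        raise ValueError("mbps must be positive")
    megabits = size_gb * 1000 * 8  # GB -> MB -> megabits (decimal units)
    return megabits / mbps / 60


# Approximate sizes taken from the optional downloads table.
OPTIONAL_MODELS_GB = {
    "E5-Large-Instruct": 1.4,
    "all-MiniLM-L6-v2": 0.09,
    "Phi-3.5 Mini Instruct": 3.8,
}

for name, size_gb in OPTIONAL_MODELS_GB.items():
    # 100 Mbps is an assumed link speed, not a Recall requirement.
    minutes = estimate_download_minutes(size_gb, 100)
    print(f"{name}: about {minutes:.1f} min at 100 Mbps")
```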

Downloading from the Setup Wizard

On the Models step of the Setup Wizard:

  1. Check the boxes next to the models you want to download.
  2. Click Download.
  3. Watch the progress bar. A running download shows the current model and a percentage.
  4. You can leave the step and come back; the download continues in the background. If you move past the Models step, the wizard stops polling for progress, but the download itself continues.

If you skip this step entirely, the bundled models are already available — Recall is fully functional without downloading any extras.

Downloading from Settings

Any time after setup, open the gear menu → Settings → Semantic Search (for embedders) or Generation (for generators). The same options are available; a download button appears next to models that aren't yet on disk.

Switching the active model

Once a model is downloaded, pick it in Settings:

  • Semantic Search → Embedder — switches which embedder Recall uses for indexing. Changing the embedder invalidates every document's vectors, so Recall prompts you to Rebuild Index. The rebuild runs across all matters in your organization.
  • Generation → Generator — switches Clerk's generator. No rebuild required; takes effect immediately.
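To see why switching embedders forces a rebuild, consider a similarity comparison between a stored vector and a query vector. The sketch below is not Recall's retrieval code: the function is a generic cosine similarity, 768 is the bundled embedder's output size from the table above, and 384 is an assumed size for some other embedder. Even when two embedders happen to share a dimension, their vectors come from different spaces and aren't meaningfully comparable.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


stored_vector = [0.1] * 768  # indexed earlier with a 768-dimensional embedder
query_vector = [0.1] * 384   # assumed output size of a newly selected embedder

try:
    cosine_similarity(stored_vector, query_vector)
except ValueError as err:
    # Old vectors cannot be scored against queries from the new embedder.
    print(f"index is stale: {err}")
```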

Custom GGUF models

Advanced users can add their own GGUF files:

  1. Place the file in the custom models folder under your data root:

    • Embedders: <data root>/custom_models/embedders/
    • Generators: <data root>/custom_models/generators/

    The data root is %LocalAppData%\Recall\ on Windows and ~/Library/Application Support/Recall/ on macOS.

  2. In Settings, click Refresh Custom Models.

  3. Your file appears in the embedder or generator dropdown and can be selected like any other model.

Custom embedders must produce vectors compatible with Recall's retrieval pipeline; swapping in an arbitrary GGUF embedder isn't guaranteed to work well. Custom generators should be instruction-tuned chat models — base pretraining checkpoints tend to produce poor output for Clerk's prompts.
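Step 1 can be scripted. The following is a minimal sketch, not a Recall tool: the helper names are hypothetical, and the check on the first four bytes relies on GGUF files starting with the ASCII marker "GGUF". It only copies the file into place; you still click Refresh Custom Models in Settings afterwards.

```python
import os
import shutil
import sys
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # GGUF files begin with these four ASCII bytes


def data_root() -> Path:
    """Data root as documented above; other platforms aren't covered here."""
    if sys.platform == "win32":
        return Path(os.environ["LOCALAPPDATA"]) / "Recall"
    if sys.platform == "darwin":
        return Path.home() / "Library" / "Application Support" / "Recall"
    raise RuntimeError("data root is only documented for Windows and macOS")


def looks_like_gguf(path: Path) -> bool:
    """True if the file has a .gguf extension and starts with the GGUF marker."""
    if path.suffix != ".gguf":
        return False
    with path.open("rb") as f:
        return f.read(4) == GGUF_MAGIC


def install_custom_model(src: Path, role: str, root: Path) -> Path:
    """Copy src into <root>/custom_models/<role>/ and return the new path."""
    if role not in ("embedders", "generators"):
        raise ValueError("role must be 'embedders' or 'generators'")
    if not looks_like_gguf(src):
        raise ValueError(f"{src} does not look like a GGUF file")
    dest_dir = root / "custom_models" / role
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(src, dest_dir / src.name))
```

For example, install_custom_model(Path("my-model.gguf"), "generators", data_root()) would place a hypothetical my-model.gguf in the generators folder. The format check says nothing about whether the model is instruction-tuned or produces compatible vectors.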

Where models live on disk

<data root>/
├── models/
│   ├── embedders/
│   │   ├── modernbert/                                Bundled (Python snapshot)
│   │   ├── multilingual-e5-large-instruct-q8_0.gguf   Optional (E5)
│   │   └── all-MiniLM-L6-v2-ggml-model-f16.gguf       Optional (all-MiniLM)
│   └── generators/
│       ├── Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf     Bundled
│       └── Phi-3.5-mini-instruct.Q6_K.gguf            Optional
└── custom_models/
    ├── embedders/
    └── generators/
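To check what is actually on disk, the layout above can be walked with a short script. This is a sketch that assumes only the folder names shown in the tree; the function name is hypothetical, and you pass the data root for your platform as root.

```python
from pathlib import Path


def list_models(root: Path) -> list[tuple[str, int]]:
    """Return (path relative to root, size in bytes) for each model entry."""
    results = []
    for top in ("models", "custom_models"):
        for role in ("embedders", "generators"):
            folder = root / top / role
            if not folder.is_dir():
                continue  # folder may not exist yet, e.g. no custom models added
            for entry in sorted(folder.iterdir()):
                if entry.is_dir():
                    # Snapshot folders (like modernbert/) are summed recursively.
                    size = sum(f.stat().st_size for f in entry.rglob("*") if f.is_file())
                else:
                    size = entry.stat().st_size
                results.append((entry.relative_to(root).as_posix(), size))
    return results


def print_models(root: Path) -> None:
    """Print each model entry with its size in decimal gigabytes."""
    for rel_path, size in list_models(root):
        print(f"{size / 1e9:6.2f} GB  {rel_path}")
```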

Status indicators

The Embedder and Generator items in the status pill show whether the selected model is loaded and ready:

  • Green — ready to use
  • Red — not loaded (still loading, download failed, or file missing)

Click the pill for details.

Troubleshooting

A download fails partway through. Retry it. Recall doesn't resume partial downloads — it re-fetches the full file.
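The retry-without-resume behavior can be pictured with the sketch below. It is an illustration only, not Recall's downloader: the function name, the .part suffix, and the fetch callback are all hypothetical. The point is that each attempt discards any partial data and starts from the beginning.

```python
import time
from pathlib import Path
from typing import Callable


def download_with_retry(
    fetch: Callable[[Path], None],
    dest: Path,
    attempts: int = 3,
    delay_s: float = 2.0,
) -> None:
    """Call fetch(tmp) until it succeeds, re-fetching the full file each time."""
    tmp = dest.with_name(dest.name + ".part")  # hypothetical temp-file naming
    for attempt in range(1, attempts + 1):
        try:
            tmp.unlink(missing_ok=True)  # no resume: drop any partial data first
            fetch(tmp)                   # caller-supplied function that writes tmp
            tmp.replace(dest)            # publish only a complete file
            return
        except OSError:
            tmp.unlink(missing_ok=True)
            if attempt == attempts:
                raise                    # give up after the last attempt
            time.sleep(delay_s)
```

Writing to a temporary name and renaming at the end means a failed attempt never leaves a truncated file where a complete model is expected.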

An embedder downloaded but isn't available. Check Settings → Semantic Search. If it appears there but indexing fails with errors, try Rebuild Index to force a clean pass.

A custom GGUF file doesn't appear after Refresh Custom Models. Confirm the file has a .gguf extension and is in the right subfolder (embedders vs. generators).

Out of disk space. ModernBERT, Llama 3.1, and the full Windows install together use about 6–7 GB. To free space, delete optional embedders and generators you don't use by removing their files from the models directory.
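Before downloading an optional model, you can check whether the drive has room for it. The sketch below uses Python's standard shutil.disk_usage; the function name and the 1 GB headroom are assumptions for illustration, not Recall requirements.

```python
import shutil
from pathlib import Path


def has_room_for(path: Path, size_gb: float, headroom_gb: float = 1.0) -> bool:
    """True if the drive holding path has size_gb plus headroom_gb free."""
    free_gb = shutil.disk_usage(path).free / 1e9  # decimal gigabytes
    return free_gb >= size_gb + headroom_gb


# Example: is there room for a ~3.8 GB generator on the current drive?
print(has_room_for(Path("."), 3.8))
```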