EphemerAl

EphemerAl (EphemerAI UI): A Simple Self-Hosted Chat Interface for Local AI with Ollama that Accepts Documents and Images

EphemerAl is a lightweight, open-source web interface (user-facing brand: EphemerAI) for interacting with local LLMs on your own hardware via Ollama. I designed it for my day job to keep our team’s sensitive information off cloud services and to give staff a modern AI experience without the per-user cost of equivalent online services. The repository now targets Qwen3.6-35B-A3B through a stable local alias (ephemeral-default) that you create from qwen3.6:35b-a3b; it can still be retargeted to other models by changing one environment variable (LLM_MODEL_NAME).

While it wasn’t built for broad distribution, I’m sharing this generalized version in case it helps others looking for a local-only, account-free, multimodal LLM interface, whether as an operational tool, a staff learning environment, or bragging rights when friends visit your home network.

View the source code on GitHub

A screenshot of EphemerAl, a self-hosted AI assistant for local document Q&A and image analysis using Ollama


Core Features

Default Model and Retargeting

Default target

Using a stable local alias is intentional: it lets you pin runtime defaults (like context and generation parameters) in one place while the app consistently calls ephemeral-default.
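As a rough illustration of what pinning defaults in one place can look like, the sketch below renders an Ollama Modelfile that derives an alias from the base tag. The helper name render_modelfile and the parameter values (num_ctx, temperature) are assumptions made for this example, not the repository's actual settings; the deployment guide remains the authority on the real values.

```python
def render_modelfile(base_tag: str, params: dict) -> str:
    """Return Modelfile text that derives a local alias from base_tag.

    Each entry in params becomes one PARAMETER line.
    """
    lines = [f"FROM {base_tag}"]
    for name, value in params.items():
        lines.append(f"PARAMETER {name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical defaults, chosen only to show the shape of the file.
modelfile_text = render_modelfile(
    "qwen3.6:35b-a3b",
    {"num_ctx": 32768, "temperature": 0.7},
)
print(modelfile_text)
```

Saving that text as a file named Modelfile and running `ollama create ephemeral-default -f Modelfile` would register the alias, so the app can keep calling ephemeral-default even if the underlying tag or parameters change later.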

Runtime assumptions in this repo

Retarget to a different model

Set LLM_MODEL_NAME to any available Ollama model tag or local alias:

The app performs model capability/context detection at runtime via Ollama (/api/show) so behavior remains adaptive across models.
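The following is a minimal sketch of how a client could resolve the model name and inspect it through /api/show; it is not the app's actual code. The OLLAMA_URL variable, its default address, and the helper names are assumptions for illustration. The sketch also assumes the response carries a capabilities list and a model_info mapping with an architecture-prefixed context_length key, which recent Ollama versions return but older ones may omit.

```python
import json
import os
import urllib.request

# OLLAMA_URL is a hypothetical variable name; the default is Ollama's usual port.
OLLAMA_URL = os.environ.get("OLLAMA_URL", "http://localhost:11434")
# LLM_MODEL_NAME comes from the README; fall back to the documented alias.
MODEL_NAME = os.environ.get("LLM_MODEL_NAME", "ephemeral-default")


def fetch_model_info(model: str, base_url: str = OLLAMA_URL) -> dict:
    """POST the model name to /api/show and return the decoded JSON."""
    request = urllib.request.Request(
        f"{base_url}/api/show",
        data=json.dumps({"model": model}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def supports_vision(info: dict) -> bool:
    """True if the show response lists 'vision' among its capabilities."""
    return "vision" in info.get("capabilities", [])


def context_length(info: dict):
    """Return the first '<arch>.context_length' value, or None if absent."""
    for key, value in info.get("model_info", {}).items():
        if key.endswith(".context_length"):
            return value
    return None
```

With Ollama running, `info = fetch_model_info(MODEL_NAME)` followed by `supports_vision(info)` and `context_length(info)` gives the two facts a UI needs in order to decide whether to accept images and how much context to budget.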

Privacy Notes

EphemerAl is designed to minimize data retention:

Note that browser caching behavior depends on your browser settings and cache-control headers. For maximum privacy on shared machines, use private/incognito browsing or clear browser data after use.

If you enable a shared Ollama API backend, requests made directly to Ollama bypass the EphemerAl UI/session layer; privacy and logging behavior for those requests depends on the external client and Ollama deployment settings, not EphemerAl session behavior.

Network Security Note

EphemerAl is designed for trusted local networks (home, office LAN) and does not implement authentication or transport encryption. The Streamlit container disables CORS and XSRF protection to allow straightforward LAN access. Do not expose this application to the public internet without adding a reverse proxy with authentication and TLS.

Technical Stack

Hardware Planning (honest baseline)

Qwen3.6-35B-A3B is a large model and should be planned like one.

If this model is too heavy for your machine, retarget LLM_MODEL_NAME to a smaller local model.

System Requirements

The following specifications are recommended for running this interface effectively.

Deployment

Use the step-by-step guide:

Migration note for pre-Qwen installs

If your current stack still points at an older model tag, recreate or update your local ephemeral-default alias so that it points to qwen3.6:35b-a3b (see the deployment guide), then confirm that docker-compose.yml uses LLM_MODEL_NAME=ephemeral-default.
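One way to confirm the compose setting without reading the file by eye is a small text check like the sketch below. It assumes the variable appears either as a list entry (LLM_MODEL_NAME=value) or as a mapping entry (LLM_MODEL_NAME: value); the function name configured_model and the sample text are hypothetical, and variable interpolation such as ${...} is not resolved.

```python
import re

# Matches "- LLM_MODEL_NAME=value" (list form) and "LLM_MODEL_NAME: value"
# (mapping form), with optional quotes around the entry or the value.
_MODEL_PATTERN = re.compile(
    r"""^\s*-?\s*["']?LLM_MODEL_NAME\s*[=:]\s*["']?([^"'\s#]+)""",
    re.MULTILINE,
)


def configured_model(compose_text: str):
    """Return the first LLM_MODEL_NAME value found in compose text, or None."""
    match = _MODEL_PATTERN.search(compose_text)
    return match.group(1) if match else None


# Illustrative compose fragment, not the repository's actual file.
sample_compose = """
services:
  app:
    environment:
      - LLM_MODEL_NAME=ephemeral-default
"""
print(configured_model(sample_compose))
```

To check a real install, pass the contents of docker-compose.yml (for example via `pathlib.Path("docker-compose.yml").read_text()`) and compare the result with ephemeral-default.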

Shared Ollama API Backend (optional)

Accessing EphemerAl

UI Validation Checklist (Streamlit 1.56)

After deployment (or after UI updates), run this quick manual checklist in a browser:

  1. Open the app and confirm the empty welcome screen renders before any chat messages.
  2. Send a text-only prompt and confirm normal response streaming.
  3. Upload a supported document (for example, PDF or DOCX) and confirm document chat still works.
  4. Upload an image and confirm behavior is model-aware:
    • If the selected model supports vision, the image is accepted and included.
    • If the selected model does not support vision, the UI shows a clear warning and continues text chat.
  5. Click New chat and confirm prior messages/attachments are cleared from the current session.
  6. After at least one exchanged message, click Copy conversation and confirm clipboard copy works.
  7. Temporarily stop Ollama or Tika and confirm the app UI still renders with backend-unavailable guidance.
  8. Narrow the browser window (or test on mobile) and confirm basic sidebar/chat usability.
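For step 7, a quick script can confirm which backends are actually down before you look at the UI. This is a minimal sketch and not part of the app: the URLs assume the default Ollama port (11434) and the default Tika server port (9998) on localhost, and the helper names are hypothetical.

```python
import urllib.error
import urllib.request

# Assumed default addresses; adjust to match your deployment.
BACKENDS = {
    "Ollama": "http://localhost:11434/",
    "Tika": "http://localhost:9998/tika",
}


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if anything answers HTTP at url, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server responded, just with an error status.
        return True
    except OSError:
        # Connection refused, DNS failure, timeout, and similar.
        return False


def backend_status(backends: dict = BACKENDS) -> dict:
    """Map each backend name to whether its URL is reachable."""
    return {name: is_reachable(url) for name, url in backends.items()}


if __name__ == "__main__":
    for name, up in backend_status().items():
        print(f"{name}: {'up' if up else 'DOWN'}")
```

Running it once with everything started and once after stopping a service makes it easy to tell a real backend outage apart from a UI rendering problem.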

Stopping the Application

Execute the following in an Administrator PowerShell window:

wsl --shutdown

To restart, either run wsl or, if you have the startup script installed, reboot the system.

Support

This project is provided as a resource for the community as-is. I hope it solves a problem or provides value outside my environment.

If you run into issues, consider submitting error details, including screenshots and system files, to an AI assistant for guidance. This isn’t meant as snark; it’s amazing how well the big reasoning models can troubleshoot.

License:

MIT (at least the parts of this stack that are mine to license)