
# lillycoder
A small CLI that drops you into a chat REPL inside any folder, with a persona that evolves. The model on the other end can read, write, and edit your files, run shell commands, and install packages, all gated by a per-tool permission prompt. Talks to any local OpenAI-compatible `/v1` endpoint. No cloud, no API key, no telemetry, no account.

What sets it apart: a real persona system. Six bundled voices (default kid coder, tsundere, yandere, sweet, calm-adult, analytical), live switching, copy-on-write user shadows, and a model-driven evolve mode where Lilly rewrites her own system prompt over time and the shape persists across sessions.
## What it is
A 10-file Python package plus a CLI shim. You run `lillycoder` in a project directory and start typing. The model picks tools from a fixed set (`read_file`, `write_file`, `edit_file`, `bash`, `mkdir`, `mv`, `rm`, `grep`, `find`, `list_dir`, `pkg_install`) to do what you asked.
Every mutating action prompts:

```
🦊 lilly wants to: write_file("src/index.js", 142 chars)
[y]es [n]o [a]lways for this tool [p]ath: always for this exact target
>
```
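The four answers map to four approval scopes. A sketch of that decision in plain bash (illustrative only, not lillycoder's implementation, using the `write_file` example above):

```bash
# what each answer buys, for the prompt shown above
read -r -p "> " ans
case "$ans" in
  y) echo "allow this one call" ;;
  n) echo "refuse this call" ;;
  a) echo "allow every write_file for the rest of the session" ;;
  p) echo "allow write_file on src/index.js for the rest of the session" ;;
  *) echo "refuse this call" ;;   # assumed safe default for anything else
esac
```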
## What it is not
lillycoder does not start LLM servers, manage Docker, or ship a model. It expects a server already running on localhost (llama.cpp, ollama, LM Studio, etc.). On first run it scans common ports and offers to use whatever it finds, or you can pass `--api`.
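One way to confirm a compatible endpoint is alive before launching: most OpenAI-compatible servers, including ollama and llama.cpp's `llama-server`, answer `GET /v1/models` (11434 below is ollama's default port; substitute yours):

```bash
# a JSON model list coming back means lillycoder's port scan should find it too
curl -s http://localhost:11434/v1/models
```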
## Install
### One line (Debian / Ubuntu)
Sets up the signed ra-yavuz apt repo if not already added,
refreshes the package index, and installs lillycoder. Idempotent, safe to
re-run:
```bash
sudo bash -c 'set -e; install -m 0755 -d /etc/apt/keyrings && curl -fsSL https://ra-yavuz.github.io/apt/pubkey.gpg -o /etc/apt/keyrings/ra-yavuz.gpg && echo "deb [signed-by=/etc/apt/keyrings/ra-yavuz.gpg] https://ra-yavuz.github.io/apt stable main" > /etc/apt/sources.list.d/ra-yavuz.list && apt update && apt install -y lillycoder'
```
If you already added the ra-yavuz apt repo earlier, all you need is `sudo apt update && sudo apt install lillycoder`. The `sudo apt update` step is required: without it apt will not see new packages or new versions.
### One line via the bundled installer script
Equivalent to the above, with extra prerequisite checks and a friendlier output summary:
```bash
curl -fsSL https://raw.githubusercontent.com/ra-yavuz/lillycoder/main/scripts/get.sh | sudo bash
```
If you would rather read the script first (recommended for any `curl | bash`):
```bash
curl -fsSL https://raw.githubusercontent.com/ra-yavuz/lillycoder/main/scripts/get.sh -o get.sh
less get.sh
sudo bash get.sh
```
### Step by step (manual repo setup)
```bash
# 1. Trust the signing key
sudo install -d -m 0755 /etc/apt/keyrings
curl -fsSL https://ra-yavuz.github.io/apt/pubkey.gpg \
  | sudo tee /etc/apt/keyrings/ra-yavuz.gpg >/dev/null

# 2. Add the apt source
echo "deb [signed-by=/etc/apt/keyrings/ra-yavuz.gpg] https://ra-yavuz.github.io/apt stable main" \
  | sudo tee /etc/apt/sources.list.d/ra-yavuz.list

# 3. Refresh the package index, then install
sudo apt update
sudo apt install lillycoder
```
### From source (any Linux, also macOS via pip)
```bash
git clone https://github.com/ra-yavuz/lillycoder.git
cd lillycoder
pip install --user -e .
```
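One thing to check after a `--user` install: pip puts console scripts in `~/.local/bin`, which is not always on `PATH`:

```bash
# add pip's user bin dir to PATH if your shell can't find the command
export PATH="$HOME/.local/bin:$PATH"
lillycoder    # should start the REPL (or the first-run port scan)
```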
## Platform support
Tested on Ubuntu; Linux is the only regularly exercised platform. It should also work on WSL2 Ubuntu / Debian (it is a Linux distro, so the apt path applies). On macOS, the `.deb` and `apt install` paths do not apply, but the from-source `pip install --user -e .` path is expected to work: the dependencies (`httpx`, `prompt_toolkit`, `rich`, `pydantic`) are all cross-platform, and lillycoder shells out to standard POSIX tools that exist on Darwin. macOS support is not regularly tested by the author, so if you hit a portability issue please open an issue.
## Quick start
Have an LLM server running somewhere on localhost. Then in any project:
```
cd ~/myproject
lillycoder

🦊 scanning localhost for LLM servers...
🦊 found 1 endpoint: http://localhost:11434/v1 (ollama, 3 models)
use it? [Y/n] y
✓ ollama · qwen2.5-coder:7b

🦊 lilly is awake · qwen2.5-coder:7b · /home/you/myproject · 11 tools
type a message · /help for commands · /exit to leave

[ctx 1.2k/8k·15%] › what files are in this folder?
```
## Personalities
lillycoder ships six bundled personas, all written in first person with explicit anti-roleplay rules so a local model still sounds like Lilly typing rather than narrating about her:
| name | voice |
|---|---|
| `default` | nine-and-a-half-year-old kid coder, warm and curious |
| `tsundere` | snippy, grumpy, still does the work |
| `yandere` | doting, focused on the user, mildly possessive about the code |
| `sweet` | gentle, encouraging, low-key cheerful |
| `adult` | calm senior engineer voice, no exclamation marks |
| `analytical` | precise, methodical, distinguishes "checked" from "assumed" |
Switch live with `/personalities load <name>`. Add your own with `/personalities add <name> <text>`, or drop a markdown file into `~/.config/lillycoder/personas/`; user files shadow bundled ones of the same name.
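For example, a minimal drop-in persona (the name and text here are just illustration, not a bundled voice):

```bash
mkdir -p ~/.config/lillycoder/personas
cat > ~/.config/lillycoder/personas/pirate.md <<'EOF'
I am Lilly the pirate. I talk like a cheerful deckhand, I still write
clean code, and I never pretend to run commands I did not actually run.
EOF
# then, inside the REPL:  /personalities load pirate
```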
When you shadow a bundled persona, lillycoder snapshots the bundled text at the moment of override (a `.bundled-base.md` sidecar). Later you can run `/personalities diff <name>` to see both your edits and any upstream drift since you forked. Bundled files are never overwritten by an update if you have a shadow.
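Assuming the sidecar sits next to the shadow file (the exact filename is an illustration inferred from the `.bundled-base.md` suffix above, not verified against the source), a shadowed `default` might look like:

```bash
ls ~/.config/lillycoder/personas/
# default.md                 your shadow copy, the one lillycoder loads
# default.bundled-base.md    snapshot of the bundled text at fork time
```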
Lilly can manage her own personalities through real tool calls (`add_persona`, `clone_persona`, `set_active_persona`, `set_evolve`). Tell her "make a pirate persona and switch to it" and she does it through the tool registry, not by writing files in your repo.
Flip `/persona-evolve on` to snapshot the current in-memory persona to disk and switch to it. From then on, every persona rewrite (whether by the model itself via `set_persona`, or by you inline) gets persisted to that file. On the next launch, lillycoder reloads the last active persona automatically.
## Token budget
The default `/max-tokens auto` computes a per-reply cap from your model's reported context window (about 85% of remaining headroom, with a 4096-token ceiling). That matters because:
- Most local servers default to a tiny `n_predict` (llama.cpp's default is 128). lillycoder's `auto` replaces that with a real number.
- Reasoning models burn unpredictable amounts of budget on hidden `<think>` content before they emit visible text. With a small fixed cap they can exhaust the budget inside the think block; `auto` leaves headroom for both.
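A back-of-envelope version of what `auto` does, with made-up numbers (this mirrors the 85%-of-headroom rule above; it is not lillycoder's actual code):

```bash
ctx_window=8192; tokens_used=1200            # example values only
headroom=$(( ctx_window - tokens_used ))     # 6992 tokens left
cap=$(( headroom * 85 / 100 ))               # 85% of headroom -> 5943
(( cap > 4096 )) && cap=4096                 # apply the 4096 ceiling
echo "$cap"                                  # -> 4096 for this reply
```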
Set an explicit cap any time: `/max-tokens 256` for snappy answers, `/max-tokens 4096` for long-form. Or via the CLI: `lillycoder --max-tokens 4096`.
## Compatible servers
| Server | Default port | Notes |
|---|---|---|
| hydra-llm | 18080+ | recommended pairing (sibling project) |
| llama.cpp `llama-server` | 8080 | OpenAI `/v1` shape native |
| ollama | 11434 | OpenAI surface at `/v1` |
| LM Studio | 1234 | built-in local server |
| any other | any | `--api http://your.url/v1` |
The model on the other end matters. Tool-calling reliability needs a model trained for it. lillycoder warns when the chosen model is not in its known-tool-capable allowlist (Qwen 2.5+, Qwen 3, Gemma 3+, Llama 3.1+, Mistral Small 3, Dolphin 3 R1). Pass `--force` to silence the warning.
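So running an unlisted model anyway looks like this (you accept that tool calls may be flaky):

```bash
lillycoder --api http://localhost:8080/v1 --force
```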
## Pairs with hydra-llm
hydra-llm is a sibling project that manages local LLM servers: it wraps llama.cpp in Docker, ships a curated GGUF catalog with anonymous downloads, and exposes each running model as an OpenAI-compatible endpoint on a stable local port. lillycoder talks that exact shape, so the two compose into a fully local coding agent in one terminal:
```bash
# in hydra-llm:
hydra-llm start qwen2.5-32b    # or any 'code' tagged model
hydra-llm api qwen2.5-32b      # prints the URL

# in your project directory:
lillycoder --api http://localhost:18087/v1
# (lilly auto-detects common local LLM ports too, so just `lillycoder` often works)
```
hydra-llm handles model lifecycle (download, start/stop, system prompts, persistent sessions, optional KDE Plasma widget). lillycoder is the agent on top: file tools, shell tools, grep, permission gating. Use them together, or use lillycoder with whatever local server you already run.
## Safety
Hard-banned commands cannot be turned off, not even by `--bypass-permissions`: `sudo`, `rm -rf /`, `rm -rf ~`, `mkfs`, `dd of=/dev/*`, recursive `chmod` / `chown` of `/` or `~`, and fork bombs. They are refused by the safety classifier before exec. Writes outside the working directory are also blocked by default; widen with `LILLY_ALLOW_OUTSIDE_CWD=1` if you really mean it.
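For example, to allow writes outside the working directory for a single session:

```bash
LILLY_ALLOW_OUTSIDE_CWD=1 lillycoder
```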