
email-top-tasks

When to invoke

Trigger this skill when the user asks any of:

  • "list the top N tasks from my email"
  • "what do I need to act on from the last X days"
  • "summarize my email inbox into action items"
  • anything that couples email_store data + local LM Studio + ranked task list

Pre-flight (non-negotiable)

  1. Confirm Postgres reachable: host=192.168.1.33 port=54522 db=email_store. Creds in ~/Developer/EMail CLI/email-store/.env.
  2. Confirm LM Studio reachable: curl -s http://192.168.1.235:1234/v1/models.
  3. If the user names a model other than google/gemma-4-e4b, set MODEL env var.
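The pre-flight checks can be sketched in Python (host/port values as listed above; `lm_studio_models` assumes the standard OpenAI-compatible `/v1/models` endpoint that LM Studio exposes, and the function names are illustrative):

```python
import json
import socket
import urllib.request

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def lm_studio_models(base_url: str, timeout: float = 5.0) -> list:
    """Return the model list from an OpenAI-compatible /v1/models endpoint."""
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as resp:
        return json.load(resp).get("data", [])

# Example pre-flight (commented out; requires the hosts above to be reachable):
# assert port_open("192.168.1.33", 54522), "Postgres unreachable"
# models = lm_studio_models("http://192.168.1.235:1234")
```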

Known infra pitfalls (learned the hard way — do NOT re-discover)

| Symptom | Root cause | Fix |
| --- | --- | --- |
| `NotImplementedError: RotatingKVCache Quantization NYI` | LM Studio's KV-cache quantisation toggle does not support Gemma's sliding-window rotating cache once total tokens exceed ~5000. | Keep each request's prompt + max_tokens ≤ ~4600. Split into cohorts. |
| Output empty, `reasoning_tokens` equals `completion_tokens` | Gemma's thinking phase eats the entire output budget before producing any content. | Add a third message `{"role":"assistant","content":"1."}` as a prefill; the model resumes from "1." and skips the think phase. Prepend "1." back when parsing. |
| "The number of tokens to keep from the initial prompt is greater than the context length" | Target model was loaded with a smaller context than requested. | Either shrink the input or pick a model loaded with ≥10k context (gemma-4-e4b is). |
| "Model loading was stopped due to insufficient system resources" | The machine cannot hold the model. | Fall back to a smaller loaded model (gemma-4-e4b, gpt-oss-20b, qwen3-vl-8b). |
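The prefill workaround can be sketched as follows (message shapes assume the OpenAI-compatible chat endpoint; the helper names are illustrative, not from run.py):

```python
def build_messages(system: str, user: str) -> list[dict]:
    """Build a chat request with an assistant prefill as the third message.

    The model continues from "1." instead of opening a think block, so
    reasoning_tokens stays at 0 and the whole output budget goes to content.
    """
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
        {"role": "assistant", "content": "1."},
    ]

def restore_prefill(completion: str) -> str:
    """The API returns only the continuation; prepend the prefill back."""
    return "1." + completion
```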

Pipeline shape

Three Postgres queries → three LM Studio calls → merge + dedup + renumber.

  • Cohort A (work/personal/travel/invoice/education/other with non-empty ai_action_items): up to 50 tasks.
  • Cohort B (upcoming subscriptions, v_email_transaction_amounts big TX, notification emails with pay|due|bill|invoice|refund|tax action items): up to 50 tasks.
  • Cohort C (email_threads.is_awaiting_reply=true filtered against newsletter regex, plus notification emails with rsvp|confirm|verify|deliver|renew|expire|deadline action items): up to 40 tasks.

Dedup by the lowercase alphanumeric prefix of each task's first 50 characters, keep the first 130, and renumber them 1..130.
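A sketch of that merge step, assuming tasks arrive as numbered lines (the regexes and output format here are illustrative, not lifted from run.py):

```python
import re

def dedup_and_renumber(tasks: list[str], keep: int = 130) -> list[str]:
    """Dedup by the lowercase alphanumeric prefix of the first 50 chars,
    keep at most `keep` tasks, and renumber them 1..keep."""
    seen, out = set(), []
    for task in tasks:
        # Strip any existing "N." numbering before computing the key.
        body = re.sub(r"^\s*\d+\.\s*", "", task)
        key = re.sub(r"[^a-z0-9]", "", body[:50].lower())
        if key and key not in seen:
            seen.add(key)
            out.append(body)
        if len(out) == keep:
            break
    return [f"{i}. {t}" for i, t in enumerate(out, 1)]
```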

Budget per call (keep total tokens ≤ 4600)

input ≈ 1100–1700 tokens
max_tokens = 2300
reasoning_tokens = 0   ← guaranteed via prefill trick
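That budget can be enforced with a small guard (a sketch; the prompt token count would come from the API's `usage` field or a local tokenizer estimate):

```python
TOKEN_BUDGET = 4600  # per-request prompt + completion; above ~5000 the RotatingKVCache error fires

def clamp_max_tokens(prompt_tokens: int, desired: int = 2300) -> int:
    """Shrink max_tokens so prompt + completion stays within the budget.

    Returns 0 when the prompt alone already blows the budget, which should
    be treated as "split this cohort further" rather than "send anyway".
    """
    return max(0, min(desired, TOKEN_BUDGET - prompt_tokens))
```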

Reference script

~/.claude/skills/email-top-tasks/run.py — copy it and adjust N_TASKS, WINDOW_DAYS, and the three SQL slices to tune the output.

Invocation

```bash
MODEL=google/gemma-4-e4b python3 ~/.claude/skills/email-top-tasks/run.py
# Output goes to stdout and /tmp/top130_tasks.txt
```

Measured performance (baseline, 2026-04-20, David's LM Studio @ 192.168.1.235, gemma-4-e4b)

Full 3-call pipeline, 130 tasks output:

| Metric | Value |
| --- | --- |
| Wall time (3 sequential calls) | 150.90 s |
| Cohort A usage | prompt=2491, completion=1995, total=4486 |
| Cohort B usage | prompt=2376, completion=1919, total=4295 |
| Cohort C usage | prompt=1886, completion=1819, total=3705 |
| Total tokens processed | 12,486 |
| Completion tokens only | 5,733 |
| Completion throughput | ≈ 38 tok/s |
| End-to-end throughput | ≈ 82.7 tok/s |
| Tasks emitted | 130 (100% of target) |

Per-call average ≈ 50 s. If the user wants faster wall time, issue the 3 calls in parallel: they are independent, and the server can serve them back-to-back; expect ~55–70 s end-to-end.
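The parallel variant can be sketched with a thread pool (the cohort callables below are placeholders for the three LM Studio requests):

```python
from concurrent.futures import ThreadPoolExecutor

def run_cohorts_parallel(cohort_calls):
    """Issue independent LM Studio calls concurrently.

    cohort_calls: list of zero-argument callables, one per cohort. Results
    come back in cohort order, so the downstream merge step is unchanged.
    """
    with ThreadPoolExecutor(max_workers=len(cohort_calls)) as pool:
        futures = [pool.submit(call) for call in cohort_calls]
        return [f.result() for f in futures]
```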

Open issue surfaced while running

email_threads.is_awaiting_reply in this project is heavily polluted with newsletter / security-alert subjects (e.g. "New login from Chrome", "Cricut Craftfest", "Koala x Bluey collection"). The classification logic in scripts/reconstruct_threads.py needs tighter filters. The skill works around it with a regex exclusion in the cohort-C query, but the data pipeline itself should be fixed.
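In Python form, the workaround amounts to a subject filter like the sketch below. The pattern is illustrative, built from the polluted subjects observed above; the skill's actual exclusion is a regex inside the cohort-C SQL:

```python
import re

# Illustrative exclusion pattern; extend it as new newsletter pollution shows up.
NEWSLETTER_RE = re.compile(r"new login from|craftfest|collection|newsletter", re.IGNORECASE)

def is_probable_newsletter(subject: str) -> bool:
    """True when a thread subject looks like marketing/security noise
    rather than a genuine awaiting-reply conversation."""
    return bool(NEWSLETTER_RE.search(subject))
```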
