uncloseai-cli

Three tools for growing machine learning from seed: a ReAct agent harness powered by a local 8B LLM, a pure-Python GPT trainer with zero dependencies, & a model garden of 15 curated datasets. All public domain. All running in production, validating the self-sandbox algorithm entirely out of band from Anthropic.

Source Code → git.unturf.com

Why This Exists

The self-sandbox whitepaper proves that machine learning agents can provision their own infrastructure. But if that proof depends entirely on one vendor's model, it proves less than it claims. uncloseai-cli validates the algorithm through an independent path: a different model family (Llama 3.1 8B), a different agent harness (pure Python, zero vendor SDK), & a different execution loop (ReAct with planning & task trimming). Same sandbox API. Same infrastructure. Different everything else.

This matters. If the algorithm only works with Claude, it belongs to Anthropic. If it works with any model through any harness, it belongs to everyone. uncloseai-cli proves the latter.

uncloseai-cli: Local LLM Agent

ReAct agent with a todo system. Every request flows through:

  1. Plan: LLM breaks request into numbered tasks
  2. Trim: Python removes fluff (open/read, report/inform), caps at 5
  3. Execute: each task runs through a ReAct loop with tool access
  4. Forward: prior task results flow to later tasks as context
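The four stages above can be sketched in pure Python. This is a minimal sketch: `llm.plan`, `react`, & the exact fluff patterns are illustrative, not the real implementation.

```python
import re

# Illustrative fluff patterns, after the verbs named above (open/read,
# report/inform); the real trimmer's rules may differ.
FLUFF = re.compile(r"^(open|read|report|inform)\b", re.IGNORECASE)
MAX_TASKS = 5

def trim(tasks):
    """Drop plan steps that are pure fluff, then cap the list at 5."""
    kept = [t for t in tasks if not FLUFF.match(t)]
    return kept[:MAX_TASKS]

def run(request, llm, react):
    # Plan: the LLM turns the request into numbered tasks.
    tasks = trim(llm.plan(request))
    # Execute + Forward: each task gets its own ReAct loop, and prior
    # results flow to later tasks as context.
    context = []
    for task in tasks:
        context.append(react(task, context=context))
    return context
```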

Usage

unclose "pull and sync"
unclose "what time is it in EST? also run ddate"
unclose -p "summarize SYSTEM-PROMPT.md"      # print-only mode
unclose -v "deploy"                           # verbose (show turn numbers)
unclose -i                                    # interactive REPL

Install

git clone https://git.unturf.com/engineering/unturf/uncloseai-cli.git
cd uncloseai-cli
make install

Creates uncloseai-cli, unclose, u, microgpt-cli, microgpt in /usr/local/bin (plus backwards-compat hermes-cli, hermes, h).

Configuration

  Variable            Default                                       Description
  UNCLOSE_BASE        https://hermes.ai.unturf.com/v1               OpenAI-compatible API base
  UNCLOSE_MODEL       adamo1139/Hermes-3-Llama-3.1-8B-FP8-Dynamic   Model name
  UNCLOSE_KEY         permacomputer                                 API key
  UNCLOSE_MAX_TURNS   15                                            Max ReAct turns per task
  UNCLOSE_MAX_RESULT  12000                                         Max chars per tool result

Legacy HERMES_* env vars still work as fallbacks.
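A minimal sketch of how the variables & their legacy fallbacks might be resolved. The helper `env` is hypothetical; the real harness may read configuration differently.

```python
import os

def env(name, legacy, default):
    """Prefer UNCLOSE_*, fall back to legacy HERMES_*, then the default."""
    return os.environ.get(name) or os.environ.get(legacy) or default

BASE = env("UNCLOSE_BASE", "HERMES_BASE", "https://hermes.ai.unturf.com/v1")
MODEL = env("UNCLOSE_MODEL", "HERMES_MODEL",
            "adamo1139/Hermes-3-Llama-3.1-8B-FP8-Dynamic")
KEY = env("UNCLOSE_KEY", "HERMES_KEY", "permacomputer")
MAX_TURNS = int(env("UNCLOSE_MAX_TURNS", "HERMES_MAX_TURNS", "15"))
```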

8 Tools

  Tool      Args                  Description
  bash      command               Run a shell command
  read      path                  Read a file
  write     path, content         Create/overwrite a file
  edit      path, old, new        Replace exact string in file
  glob      pattern, path         Find files by pattern
  grep      pattern, path         Search file contents with regex
  fetch     url, depth, keywords  Fetch web page, extract text (async crawler)
  todo_add  content, activeForm   Add task to live todo list during execution
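Four of the file & shell tools can be sketched as a plain dispatch table. This is illustrative only: the real harness also implements glob, grep, fetch, & todo_add, and its truncation & error handling differ.

```python
import subprocess
from pathlib import Path

MAX_RESULT = 12000  # matches the UNCLOSE_MAX_RESULT default

def tool_bash(command):
    out = subprocess.run(command, shell=True, capture_output=True, text=True)
    return out.stdout + out.stderr

def tool_read(path):
    return Path(path).read_text()

def tool_write(path, content):
    Path(path).write_text(content)
    return f"wrote {len(content)} chars to {path}"

def tool_edit(path, old, new):
    p = Path(path)
    p.write_text(p.read_text().replace(old, new, 1))
    return f"edited {path}"

TOOLS = {"bash": tool_bash, "read": tool_read,
         "write": tool_write, "edit": tool_edit}

def call_tool(name, **args):
    # Every result is capped so one tool call can't flood the context.
    return TOOLS[name](**args)[:MAX_RESULT]
```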

Web Fetching

The fetch tool wraps a production-grade async web crawler (2,746 lines):

  • Single page: fetch url fetches the page, strips HTML, & returns readable text + links
  • Deep crawl: fetch url depth=2 keywords=python,async runs a keyword-aware multi-level crawl
  • Ethical: respects robots.txt, crawl delays, clear user-agent string
  • Media-aware: detects images, videos, audio, PDFs
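The single-page path (strip HTML, keep readable text + links) can be approximated with the standard library. This is a sketch of the idea only; the production crawler is async & far more thorough.

```python
from html.parser import HTMLParser

class TextAndLinks(HTMLParser):
    """Strip tags, keep readable text, & collect href links:
    roughly what a single-page fetch returns."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.text, self.links, self._skip = [], [], 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip += 1
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.links.append(value)

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self._skip -= 1

    def handle_data(self, data):
        # Keep visible text; drop script/style bodies & pure whitespace.
        if not self._skip and data.strip():
            self.text.append(data.strip())

def extract(html):
    parser = TextAndLinks()
    parser.feed(html)
    return " ".join(parser.text), parser.links
```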

microgpt: Pure-Python GPT

Minimal GPT training & inference. Zero external dependencies: only the standard-library modules os, math, random, json, & argparse. Based on Karpathy's microgpt.

Implements: scalar autograd (Value class), multi-head attention, RMSNorm, MLP, Adam optimizer with linear LR decay, temperature-controlled sampling. JSON model persistence with full metadata.
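The heart of the trainer is the scalar autograd Value class. Here is a stripped-down sketch of the idea (add, mul, & tanh only; the real class implements many more ops & the full transformer on top):

```python
import math

class Value:
    """One scalar in the computation graph, with reverse-mode autograd."""

    def __init__(self, data, children=()):
        self.data = data
        self.grad = 0.0
        self._children = children
        self._grad_fn = None  # propagates self.grad to children

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def grad_fn():
            self.grad += out.grad
            other.grad += out.grad
        out._grad_fn = grad_fn
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def grad_fn():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._grad_fn = grad_fn
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))
        def grad_fn():
            self.grad += (1 - t * t) * out.grad
        out._grad_fn = grad_fn
        return out

    def backward(self):
        # Topological order, then apply the chain rule backwards.
        order, seen = [], set()
        def build(v):
            if v not in seen:
                seen.add(v)
                for child in v._children:
                    build(child)
                order.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(order):
            if v._grad_fn:
                v._grad_fn()
```

Everything else (attention, RMSNorm, the MLP, Adam) is built by composing these scalar ops.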

# Train on a text dataset (one document per line)
microgpt train --dataset names.txt --steps 1024 --save model.json

# Generate samples from a trained model
microgpt generate --load model.json --samples 10 --temperature 0.5

# Train & generate in one shot
microgpt run --dataset names.txt --steps 1024 --samples 10

# Inspect a saved model
microgpt info model.json

Model Sizes

  Size  n_embd  n_head  n_layer  Use
  64    64      4       1        Fast, good for testing
  128   128     8       2        Balanced
  512   512     8       4        Slow in pure Python, best quality
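A rough back-of-envelope for how these sizes translate into parameter counts, assuming a standard GPT block (about 4·n_embd² for attention plus 8·n_embd² for a 4x-wide MLP per layer). The exact counts depend on microgpt's vocab, block size, & layer details, so treat this as an estimate only.

```python
def approx_params(n_embd, n_layer, vocab=27, block_size=16):
    """Rough GPT parameter count: 12*n_embd^2 per layer plus embeddings.
    vocab & block_size here are illustrative defaults, not microgpt's."""
    per_layer = 12 * n_embd * n_embd        # ~4x attn + ~8x MLP
    embed = (vocab + block_size) * n_embd   # token + position embeddings
    return n_layer * per_layer + embed

for size, layers in [(64, 1), (128, 2), (512, 4)]:
    print(f"size {size}: ~{approx_params(size, layers):,} params")
```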

garden.mk: Smol Model Garden

Pull 15 curated character-level datasets, grow GPT models at 3 sizes. Every dataset tracks its upstream origin. Mirrored to HuggingFace: russellbal/smol-seeds.

15 Datasets

  Seed            Lines  Origin
  names           32K    karpathy/makemore
  words           97K    dwyl/english-words
  pokemon         1K     sindresorhus/pokemon
  dinosaurs       1.5K   brunoklein99/deep-learning-notes
  hex-colors      32K    xkcd + meodai/color-names
  color-names     31K    meodai/color-names
  chords          1K     tombatossals/chords-db
  json-keys       4K     GitHub OpenAPI spec
  css-classes     2K     twbs/bootstrap
  make-targets    291    scraped from major repos
  commit-msgs     20K    angular/angular
  haiku           143K   docmarionum1/haikurnn
  variable-names  3.8K   GitHub repo trees
  last-names      225K   sacrificialpancakes/synthetic_demographics_seed
  occupations     831    sacrificialpancakes/synthetic_demographics_seed

Garden Operations

make -f garden.mk pull              # download all datasets
make -f garden.mk grow-64           # train all models at size 64
make -f garden.mk grow              # train all at all sizes (64, 128, 512)
make -f garden.mk names             # pull + grow one model family
make -f garden.mk origins           # list upstream sources
make -f garden.mk freshness         # check upstream last-modified dates
make -f garden.mk upgrade-all       # force re-fetch all from origins
make -f garden.mk mirror            # push datasets to HuggingFace
make -f garden.mk sample MODEL=pokemon-64 N=10
make -f garden.mk inventory         # show trained models

Out-of-Band Validation

The self-sandbox whitepaper describes 14 flows & 2,324 assertions proving that machine learning agents can provision their own infrastructure. The oracle (Claude Opus 4.6) validates this daily. But a proof that depends on one vendor proves vendor lock-in, not vendor independence.

uncloseai-cli closes that gap:

                  Oracle (Claude)   uncloseai-cli (Llama 3.1 8B)
  Model           Claude Opus 4.6   Hermes-3-Llama-3.1-8B
  Provider        Anthropic         Self-hosted (llama-server)
  Agent harness   Claude Code       uncloseai-cli (pure Python)
  SDK dependency  Anthropic SDK     None (raw HTTP to OpenAI-compatible API)
  Sandbox API     Same unsandbox infrastructure
  Tool interface  Same 8 tools (bash, read, write, edit, glob, grep, fetch, todo)

Same infrastructure. Same tools. Different model. Different harness. Different vendor. Zero shared dependencies. If both paths produce correct results through the same sandbox API, the algorithm belongs to no one & works for everyone.

The oracle's make scatter, make exorcise, & make cross-audit dispatch work to uncloseai-cli for independent verification. A debate between copies of the same model functions as an echo chamber. Real truth requires collision between architectures that fail differently.

Architecture


  User Request
       |
       v
  +-----------+     +-----------------+
  |  Plan     | --> | LLM generates   |
  |           |     | numbered tasks  |
  +-----------+     +-----------------+
       |
       v
  +-----------+     +-----------------+
  |  Trim     | --> | Python removes  |
  |           |     | fluff, caps @ 5 |
  +-----------+     +-----------------+
       |
       v
  +-----------+     +-----------------+
  |  Execute  | --> | ReAct loop per  |
  |  Task 1   |     | task: think,    |
  |  Task 2   |     | act, observe    |
  |  Task N   |     | (8 tools)       |
  +-----------+     +-----------------+
       |
       v
  +-----------+
  |  Forward  |  prior results flow
  |           |  to later tasks
  +-----------+