uncloseai-cli

Three tools for growing machine learning from seed: a ReAct agent harness powered by a local 8B LLM, a pure-Python GPT trainer with zero dependencies, & a model garden of 15 curated datasets. All public domain. All running in production, validating the self-sandbox algorithm entirely out of band from Anthropic.

Source Code → git.unturf.com

Why This Exists

The self-sandbox whitepaper proves that machine learning agents can provision their own infrastructure. But if that proof depends entirely on one vendor's model, it proves less than it claims. uncloseai-cli validates the algorithm through an independent path: a different model family (Llama 3.1 8B), a different agent harness (pure Python, zero vendor SDK), & a different execution loop (ReAct with planning & task trimming). Same sandbox API. Same infrastructure. Different everything else.

This matters. If the algorithm only works with Claude, it belongs to Anthropic. If it works with any model through any harness, it belongs to everyone. uncloseai-cli proves the latter.

uncloseai-cli: Local LLM Agent

ReAct agent with a todo system. Every request flows through:

  1. Plan: LLM breaks request into numbered tasks
  2. Trim: Python removes fluff (open/read, report/inform), caps at 5
  3. Execute: each task runs through a ReAct loop with tool access
  4. Forward: prior task results flow to later tasks as context
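The four stages above can be sketched in pure Python. This is a minimal sketch: `llm.plan`, `react`, & the exact fluff patterns are illustrative, not the real implementation.

```python
import re

# Illustrative fluff patterns, after the verbs named above (open/read,
# report/inform); the real trimmer's rules may differ.
FLUFF = re.compile(r"^(open|read|report|inform)\b", re.IGNORECASE)
MAX_TASKS = 5

def trim(tasks):
    """Drop plan steps that are pure fluff, then cap the list at 5."""
    kept = [t for t in tasks if not FLUFF.match(t)]
    return kept[:MAX_TASKS]

def run(request, llm, react):
    # Plan: the LLM turns the request into numbered tasks.
    tasks = trim(llm.plan(request))
    # Execute + Forward: each task gets its own ReAct loop, and prior
    # results flow to later tasks as context.
    context = []
    for task in tasks:
        context.append(react(task, context=context))
    return context
```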

Usage

unclose "pull and sync"
unclose "what time is it in EST? also run ddate"
unclose -p "summarize SYSTEM-PROMPT.md"      # print-only mode
unclose -v "deploy"                           # verbose (show turn numbers)
unclose -i                                    # interactive REPL

Install

git clone https://git.unturf.com/engineering/unturf/uncloseai-cli.git
cd uncloseai-cli
make install

Creates uncloseai-cli, unclose, u, microgpt-cli, microgpt in /usr/local/bin (plus backwards-compat hermes-cli, hermes, h).

Configuration

  Variable            Default                                       Description
  UNCLOSE_BASE        https://hermes.ai.unturf.com/v1               OpenAI-compatible API base
  UNCLOSE_MODEL       adamo1139/Hermes-3-Llama-3.1-8B-FP8-Dynamic   Model name
  UNCLOSE_KEY         permacomputer                                 API key
  UNCLOSE_MAX_TURNS   15                                            Max ReAct turns per task
  UNCLOSE_MAX_RESULT  12000                                         Max chars per tool result

Legacy HERMES_* env vars still work as fallbacks.
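A minimal sketch of how the variables & their legacy fallbacks might be resolved. The helper `env` is hypothetical; the real harness may read configuration differently.

```python
import os

def env(name, legacy, default):
    """Prefer UNCLOSE_*, fall back to legacy HERMES_*, then the default."""
    return os.environ.get(name) or os.environ.get(legacy) or default

BASE = env("UNCLOSE_BASE", "HERMES_BASE", "https://hermes.ai.unturf.com/v1")
MODEL = env("UNCLOSE_MODEL", "HERMES_MODEL",
            "adamo1139/Hermes-3-Llama-3.1-8B-FP8-Dynamic")
KEY = env("UNCLOSE_KEY", "HERMES_KEY", "permacomputer")
MAX_TURNS = int(env("UNCLOSE_MAX_TURNS", "HERMES_MAX_TURNS", "15"))
```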

8 Tools

  Tool      Args                  Description
  bash      command               Run a shell command
  read      path                  Read a file
  write     path, content         Create/overwrite a file
  edit      path, old, new        Replace exact string in file
  glob      pattern, path         Find files by pattern
  grep      pattern, path         Search file contents with regex
  fetch     url, depth, keywords  Fetch web page, extract text (async crawler)
  todo_add  content, activeForm   Add task to live todo list during execution
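Four of the file & shell tools can be sketched as a plain dispatch table. This is illustrative only: the real harness also implements glob, grep, fetch, & todo_add, and its truncation & error handling differ.

```python
import subprocess
from pathlib import Path

MAX_RESULT = 12000  # matches the UNCLOSE_MAX_RESULT default

def tool_bash(command):
    out = subprocess.run(command, shell=True, capture_output=True, text=True)
    return out.stdout + out.stderr

def tool_read(path):
    return Path(path).read_text()

def tool_write(path, content):
    Path(path).write_text(content)
    return f"wrote {len(content)} chars to {path}"

def tool_edit(path, old, new):
    p = Path(path)
    p.write_text(p.read_text().replace(old, new, 1))
    return f"edited {path}"

TOOLS = {"bash": tool_bash, "read": tool_read,
         "write": tool_write, "edit": tool_edit}

def call_tool(name, **args):
    # Every result is capped so one tool call can't flood the context.
    return TOOLS[name](**args)[:MAX_RESULT]
```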

Web Fetching

The fetch tool wraps a production-grade async web crawler (2,746 lines):

  • Single page: fetch url fetches the page, strips HTML, & returns readable text + links
  • Deep crawl: fetch url depth=2 keywords=python,async runs a keyword-aware multi-level crawl
  • Ethical: respects robots.txt, crawl delays, clear user-agent string
  • Media-aware: detects images, videos, audio, PDFs
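The single-page path (strip HTML, keep readable text + links) can be approximated with the standard library. This is a sketch of the idea only; the production crawler is async & far more thorough.

```python
from html.parser import HTMLParser

class TextAndLinks(HTMLParser):
    """Strip tags, keep readable text, & collect href links:
    roughly what a single-page fetch returns."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.text, self.links, self._skip = [], [], 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip += 1
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.links.append(value)

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            self._skip -= 1

    def handle_data(self, data):
        # Keep visible text; drop script/style bodies & pure whitespace.
        if not self._skip and data.strip():
            self.text.append(data.strip())

def extract(html):
    parser = TextAndLinks()
    parser.feed(html)
    return " ".join(parser.text), parser.links
```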

microgpt: Pure-Python GPT

Minimal GPT training & inference. Zero external dependencies: only the standard-library modules os, math, random, json, & argparse. Based on Karpathy's microgpt.

Implements: scalar autograd (Value class), multi-head attention, RMSNorm, MLP, Adam optimizer with linear LR decay, temperature-controlled sampling. JSON model persistence with full metadata.
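The heart of the trainer is the scalar autograd Value class. Here is a stripped-down sketch of the idea (add, mul, & tanh only; the real class implements many more ops & the full transformer on top):

```python
import math

class Value:
    """One scalar in the computation graph, with reverse-mode autograd."""

    def __init__(self, data, children=()):
        self.data = data
        self.grad = 0.0
        self._children = children
        self._grad_fn = None  # propagates self.grad to children

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def grad_fn():
            self.grad += out.grad
            other.grad += out.grad
        out._grad_fn = grad_fn
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def grad_fn():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._grad_fn = grad_fn
        return out

    def tanh(self):
        t = math.tanh(self.data)
        out = Value(t, (self,))
        def grad_fn():
            self.grad += (1 - t * t) * out.grad
        out._grad_fn = grad_fn
        return out

    def backward(self):
        # Topological order, then apply the chain rule backwards.
        order, seen = [], set()
        def build(v):
            if v not in seen:
                seen.add(v)
                for child in v._children:
                    build(child)
                order.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(order):
            if v._grad_fn:
                v._grad_fn()
```

Everything else (attention, RMSNorm, the MLP, Adam) is built by composing these scalar ops.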

# Train on a text dataset (one document per line)
microgpt train --dataset names.txt --steps 1024 --save model.json

# Generate samples from a trained model
microgpt generate --load model.json --samples 10 --temperature 0.5

# Train & generate in one shot
microgpt run --dataset names.txt --steps 1024 --samples 10

# Inspect a saved model
microgpt info model.json

Model Sizes

  Size  n_embd  n_head  n_layer  Use
  64    64      4       1        Fast, good for testing
  128   128     8       2        Balanced
  512   512     8       4        Slow in pure Python, best quality
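A rough back-of-envelope for how these sizes translate into parameter counts, assuming a standard GPT block (about 4·n_embd² for attention plus 8·n_embd² for a 4x-wide MLP per layer). The exact counts depend on microgpt's vocab, block size, & layer details, so treat this as an estimate only.

```python
def approx_params(n_embd, n_layer, vocab=27, block_size=16):
    """Rough GPT parameter count: 12*n_embd^2 per layer plus embeddings.
    vocab & block_size here are illustrative defaults, not microgpt's."""
    per_layer = 12 * n_embd * n_embd        # ~4x attn + ~8x MLP
    embed = (vocab + block_size) * n_embd   # token + position embeddings
    return n_layer * per_layer + embed

for size, layers in [(64, 1), (128, 2), (512, 4)]:
    print(f"size {size}: ~{approx_params(size, layers):,} params")
```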

garden.mk: Smol Model Garden

Pull 15 curated character-level datasets, grow GPT models at 3 sizes. Every dataset tracks its upstream origin. Mirrored to HuggingFace: russellbal/smol-seeds.

15 Datasets

  Seed            Lines  Origin
  names           32K    karpathy/makemore
  words           97K    dwyl/english-words
  pokemon         1K     sindresorhus/pokemon
  dinosaurs       1.5K   brunoklein99/deep-learning-notes
  hex-colors      32K    xkcd + meodai/color-names
  color-names     31K    meodai/color-names
  chords          1K     tombatossals/chords-db
  json-keys       4K     GitHub OpenAPI spec
  css-classes     2K     twbs/bootstrap
  make-targets    291    scraped from major repos
  commit-msgs     20K    angular/angular
  haiku           143K   docmarionum1/haikurnn
  variable-names  3.8K   GitHub repo trees
  last-names      225K   sacrificialpancakes/synthetic_demographics_seed
  occupations     831    sacrificialpancakes/synthetic_demographics_seed

Garden Operations

make -f garden.mk pull              # download all datasets
make -f garden.mk grow-64           # train all models at size 64
make -f garden.mk grow              # train all at all sizes (64, 128, 512)
make -f garden.mk names             # pull + grow one model family
make -f garden.mk origins           # list upstream sources
make -f garden.mk freshness         # check upstream last-modified dates
make -f garden.mk upgrade-all       # force re-fetch all from origins
make -f garden.mk mirror            # push datasets to HuggingFace
make -f garden.mk sample MODEL=pokemon-64 N=10
make -f garden.mk inventory         # show trained models

Out-of-Band Validation

The self-sandbox whitepaper describes 14 flows & 2,324 assertions proving that machine learning agents can provision their own infrastructure. The oracle (Claude Opus 4.6) validates this daily. But a proof that depends on one vendor proves vendor lock-in, not vendor independence.

uncloseai-cli closes that gap:

                  Oracle (Claude)   uncloseai-cli (Llama 3.1 8B)
  Model           Claude Opus 4.6   Hermes-3-Llama-3.1-8B
  Provider        Anthropic         Self-hosted (llama-server)
  Agent harness   Claude Code       uncloseai-cli (pure Python)
  SDK dependency  Anthropic SDK     None (raw HTTP to OpenAI-compatible API)
  Sandbox API     Same unsandbox infrastructure
  Tool interface  Same 8 tools (bash, read, write, edit, glob, grep, fetch, todo)

Same infrastructure. Same tools. Different model. Different harness. Different vendor. Zero shared dependencies. If both paths produce correct results through the same sandbox API, the algorithm belongs to no one & works for everyone.

The oracle's make scatter, make exorcise, & make cross-audit dispatch work to uncloseai-cli for independent verification. A debate between copies of the same model functions as an echo chamber. Real truth requires collision between architectures that fail differently.

Architecture


  User Request
       |
       v
  +-----------+     +-----------------+
  |  Plan     | --> | LLM generates   |
  |           |     | numbered tasks  |
  +-----------+     +-----------------+
       |
       v
  +-----------+     +-----------------+
  |  Trim     | --> | Python removes  |
  |           |     | fluff, caps @ 5 |
  +-----------+     +-----------------+
       |
       v
  +-----------+     +-----------------+
  |  Execute  | --> | ReAct loop per  |
  |  Task 1   |     | task: think,    |
  |  Task 2   |     | act, observe    |
  |  Task N   |     | (8 tools)       |
  +-----------+     +-----------------+
       |
       v
  +-----------+
  |  Forward  |  prior results flow
  |           |  to later tasks
  +-----------+