v0.3.0 · Latent Embedding Operating System

The operating system
that thinks in vectors.

leOS is a local AI substrate where knowledge, tools, media, routing decisions, and cached responses all live as points on the surface of a high-dimensional sphere. Agents don't search by keywords. They search by meaning, route by geometry, and learn by accumulating experience in embedding space.

CPU-only embeddings
Fully local runtime
6 modalities unified
420+ kernel ops
§ 01 · The Thesis

A substrate that grows with use.

For an AI agent to be genuinely useful it needs to do thousands of things — read files, search the web, analyze images, transcribe video, call APIs, process astronomy data. Today's agent frameworks hit a hard wall: the more tools you give an agent, the worse it performs. leOS was built to break that wall.

Every tool definition eats context tokens; load 200 tools into an LLM and there's no room for the actual work. leOS keeps the full catalog in an embedding-indexed registry instead. When a task arrives it's embedded and scored against domain centroids to discard 80-90% of tools instantly (Pass 1), then fine-grained semantic plus keyword scoring runs on the survivors (Pass 2), blended with learned usage history from past sessions (Pass 3). The agent receives only the 6-8 tools that matter. This scales cleanly to thousands.
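The three-pass funnel fits in a few lines of numpy. Everything below is an illustrative stand-in (random vectors for the registry and centroids, a toy domain assignment, a 20% history weight); the shape of the computation is the point, not the data:

```python
import numpy as np

rng = np.random.default_rng(0)
unit = lambda v: v / np.linalg.norm(v)

# Hypothetical registry: 200 tools, 10 domain centroids, toy domain labels.
tools = {f"tool_{i}": unit(rng.normal(size=768)) for i in range(200)}
domains = {d: unit(rng.normal(size=768)) for d in range(10)}
tool_domain = {name: i % 10 for i, name in enumerate(tools)}
usage = {name: rng.random() for name in tools}   # learned usage history

def select_tools(task_vec, top_k=8, keep_domains=2, history_weight=0.2):
    # Pass 1: centroid culling discards most of the catalog instantly.
    ranked = sorted(domains, key=lambda d: -(task_vec @ domains[d]))
    survivors = [n for n in tools if tool_domain[n] in ranked[:keep_domains]]
    # Pass 2: fine-grained semantic scoring on the survivors only.
    semantic = {n: float(task_vec @ tools[n]) for n in survivors}
    # Pass 3: blend in learned usage history from past sessions.
    blended = {n: (1 - history_weight) * s + history_weight * usage[n]
               for n, s in semantic.items()}
    return sorted(blended, key=blended.get, reverse=True)[:top_k]

shortlist = select_tools(unit(rng.normal(size=768)))
```

With 10 domains and `keep_domains=2`, Pass 1 alone removes 80% of the catalog before any per-tool scoring runs.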

"Every capability in the system — from embed text to transcribe a YouTube video to query the SDSS catalog — is a typed atomic operation we call a bone. Bones compose into chains. Chains that work become skeletons. Skeletons become skills."

The FABRIK planner (borrowed from inverse kinematics in character animation) works backward from the desired output and forward from available inputs to assemble chains that achieve goals. Successful chains are saved as skeletons — pre-validated patterns reused at zero-LLM cost. Failed trajectories get recorded as displacements so the next similar task avoids the bad path. The library of known-good chains grows every interaction.
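The backward sweep can be sketched with a toy registry of typed bones. The bone names and the single-input typing below are invented for illustration; the real planner also sweeps forward from available inputs and checks the skeleton library before assembling anything:

```python
# Hypothetical bone registry: name -> (input_type, output_type).
bones = {
    "fetch_url":    ("url", "html"),
    "html_to_text": ("html", "text"),
    "embed_text":   ("text", "vector"),
    "summarize":    ("text", "summary"),
}

def plan_backward(available, goal):
    """Walk backward from the goal type until an available input is reached."""
    chain, need = [], goal
    while need not in available:
        step = next((n for n, (i, o) in bones.items() if o == need), None)
        if step is None:
            return None            # no bone produces what we still need
        chain.append(step)
        need = bones[step][0]
    return list(reversed(chain))

chain = plan_backward(available={"url"}, goal="vector")
# chain == ["fetch_url", "html_to_text", "embed_text"]
```

A dead end (a goal type no bone can produce) returns `None`, which is exactly the displacement-worthy failure the text describes.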

§ 02 · Architecture

Three layers, one hypersphere.

The system is cleanly split into a hardware-analog layer, a software layer, and a kernel that bridges them — with four CPU embedding processors feeding the whole stack and a semantic membrane exposing the inside to the world.

Layer 00
Browser
Three.js 3D desktop (with 2D canvas fallback) that acts as both the agent's perception layer and the human's observation window. Agents get visual tools — describe the screen, click, type, drag — so they can see what they're doing. The desktop is the spatial representation of embedding space itself: stored vectors, SDF regions, and knowledge density rendered as geometry.
HTTP / WebSocket
Layer 01
SDOL
The Semantic Driven Operation Layer. Variables are 768d vectors, comparison is cosine similarity, branching is routing to the nearest SDF region, assignment is SLERP interpolation. Hosts bone chains, the FABRIK planner, dynamic tool selection, scopes, plans, the task orchestrator, context assembly, and the full SDOL programming language with compiler + REPL.
Software · Python
Layer 02
Kernel
420+ instructions spanning vector math, storage, routing, apps, LLM escalation, filesystem, network, dreaming, scopes, plans, mathematics, the external I/O membrane, the dataset job engine, and full VSA computation primitives (BIND, BUNDLE, PERMUTE, RESONATOR_FACTORIZE).
Dispatch
Layer 03
LVM
The Latent Virtual Machine — the hardware layer. Spherical geometry, SDF regions, the displacement codec, embedding partitions, the reflex arc, the gravitational lens, the holographic cache, the living medium. This is where the system actually computes in vector space.
Hardware-analog
Layer 04
Embeddings
Four CPU-only models — nomic-embed-text (768d), nomic-embed-vision (768d), Qwen3-Embedding (1024d), and ImageBind (1024d, six modalities: vision, text, audio, depth, thermal, IMU). The Rosetta codec translates between 768d and 1024d via Procrustes alignment.
Perception
§ 03 · How it Works

One loop, four moves.

The same cycle runs whether the agent is answering a question, writing code, analyzing a chart, or ingesting a 50 GB astronomy catalog. Each pass leaves the system a little smarter than it found it.

01

Embed

The incoming task becomes a vector on the unit hypersphere. Literal strings are pre-computed at compile time — zero runtime cost.

02

Route

Three-pass tool selection: centroid culling, semantic scoring, learned history. Agent sees only the 6-8 tools that matter.

03

Plan

FABRIK searches backward from the goal and forward from available inputs. If a known skeleton matches (similarity ≥ 0.80), reuse before assembly.

04

Learn

Successful chains become skeletons. Failed trajectories get recorded. Idle time consolidates and repairs via the dreaming engine.

§ 04 · The Sensory Cortex

Four models.
One shared perceptual field.

These four models aren't just listed in a config file. They work together as a system in ways that produce capabilities none of them has individually. Every model is open-weight and every one of them runs on CPU — no GPU required for the perceptual layer.

nomic-embed-text v1.5
Primary text space
The workhorse. Routing, classification, semantic search, tool selection, knowledge base — all land here. Open Apache-2.0 weights, Matryoshka-trained so you can truncate to 256d for speed without retraining. Fast on CPU, aggressively cached. This is the 768d lingua franca of the system.
nomic-embed-vision v1.5
Images in the text space
Critical trick: this model produces vectors in the same 768d space as nomic-embed-text. Embed an image and a sentence, cosine-compare them, and you get meaningful similarity with zero alignment step. CROSS_SEARCH queries the text store and the vision store with a single vector. Text-to-image and image-to-text search become the default.
Qwen3-Embedding 0.6B
Instruction-aware contextual text
Bigger, deeper, instruction-aware. The same raw text embedded with different instructions produces different vectors optimised for different retrieval contexts — "represent this market event for correlation with sentiment signals" vs. "find observations related to cascade events." Used for context stores, code-aware scoring, and the blackboard. The second voice in the opponent-channel duet.
ImageBind
Six modalities, one space
Meta's multimodal encoder that lands vision, text, audio, depth, thermal, and IMU in a single 1024d space. A sound, an image, a text caption, and a depth map of the same scene embed to nearby points. This is the mechanism behind searching a stellar spectrum with plain English — both go in the same space.

Emergent capabilities

Shared-space search

Cross-modal by default

Because nomic-text and nomic-vision emit into the same 768d space, searching images by text description (or text by image) is just cosine similarity. No separate index. No alignment layer. No cross-encoder. CROSS_SEARCH is one kernel instruction.
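In code, the whole mechanism is one dot product over two stores. The stores below are random stand-ins for the real text and vision indexes:

```python
import numpy as np

rng = np.random.default_rng(1)
unit = lambda v: v / np.linalg.norm(v)

# Stand-ins: both nomic models emit unit vectors in the same 768d space.
text_store  = {f"doc_{i}": unit(rng.normal(size=768)) for i in range(50)}
image_store = {f"img_{i}": unit(rng.normal(size=768)) for i in range(50)}

def cross_search(query_vec, k=5):
    # Cosine similarity on the unit hypersphere is a plain dot product,
    # and one query vector scores both modalities at once.
    scored = [(name, float(query_vec @ vec))
              for store in (text_store, image_store)
              for name, vec in store.items()]
    return sorted(scored, key=lambda nv: -nv[1])[:k]

hits = cross_search(unit(rng.normal(size=768)))
```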

Rosetta codec

768d ⇄ 1024d translation

The nomic (768d) and Qwen/ImageBind (1024d) spaces are different geometries. The Rosetta codec learns a projection matrix between them via Procrustes alignment — find the orthogonal W minimising ‖AW − B‖ over paired embeddings. Once calibrated, a displacement learned in one space broadcasts to all four models.
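The calibration step has a classical closed-form solution. A minimal sketch, assuming a matrix A of paired nomic embeddings and B of the same items in the 1024d space (synthetic pairs here):

```python
import numpy as np

def procrustes_align(A, B):
    """W with orthonormal rows minimising ||A @ W - B|| (closed form via SVD)."""
    U, _, Vt = np.linalg.svd(A.T @ B, full_matrices=False)
    return U @ Vt

# Synthetic check: recover a known 768d -> 1024d semi-orthogonal map.
rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.normal(size=(1024, 768)))
W_true = Q.T                               # orthonormal rows, (768, 1024)
A = rng.normal(size=(2000, 768))           # paired embeddings, one per row
B = A @ W_true                             # same items in the other space
W = procrustes_align(A, B)
rel_err = np.linalg.norm(A @ W - B) / np.linalg.norm(B)
```

Once W is calibrated on a few thousand pairs, broadcasting a 768d displacement v into the 1024d space is just `v @ W`.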

Opponent channels

Disagreement becomes signal

When two models embed the same content, decomposing their disagreement gives five channels: agreement, A-exclusive, B-exclusive, magnitude dispute, and the purple channel — emergent information in neither model alone. Used for contradiction detection, ad filtering, semantic denoising, and divergence interrupts when the models see something fundamentally differently.

Synesthesia

Data becomes cross-modal

Any 768d nomic vector can be packed into a 16×16 RGB image (768 = 256 × 3) and re-embedded through ImageBind vision. No Rosetta projection needed — the vision encoder preserves local structure automatically. Numerical data becomes frequency sweeps, rhythms, chords, or OFDM-style spectrograms — then embeds through ImageBind audio. Different encoders surface different structural properties of the same data.
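The packing itself is a reshape plus a rescale. A sketch of the image half (the exact scaling leOS uses is not specified here, so min-max scaling to valid pixel values is an assumption):

```python
import numpy as np

def vector_to_rgb(vec):
    """Pack a 768d embedding into a 16x16 RGB image: 768 = 16 * 16 * 3."""
    assert vec.shape == (768,)
    # Min-max scale to [0, 255] so the vision encoder sees a valid image;
    # the reshape preserves local structure along the vector.
    scaled = (vec - vec.min()) / (vec.max() - vec.min() + 1e-12)
    return (scaled * 255).astype(np.uint8).reshape(16, 16, 3)

rng = np.random.default_rng(3)
img = vector_to_rgb(rng.normal(size=768))   # ready for an image encoder
```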

Giulio Tononi's Integrated Information Theory provides the design principle. A shared embedding space that every subsystem reads and writes has fundamentally higher Φ (integration) than a collection of independent modules reporting to a dashboard — the whole genuinely exceeds the sum of its parts.

§ 05 · Two-Tier Compute

The intern and the swarm.

Most agent systems have exactly one model doing everything, and it blocks the whole system while it thinks. leOS runs a two-tier architecture: a lightweight intern model and an army of bots, all on CPU, in parallel with the main agent. Nothing ever competes for GPU memory.

The Intern · Qwen3 0.8B · CPU

A small model that never blocks.

The intern is a 0.8-billion parameter model running CPU-only with num_gpu=0. It's never user-facing. It's called via the kernel's ASSIST instruction, which checks the reflex arc first (maybe the answer is already cached) before invoking the model at all. When it does run, it processes at ~100-200 tokens/second — not fast by GPU standards, but free, because it never touches the GPU the main 9B model is using.

The intern handles work across 40+ modules:

  • Failure analysis — 2-3 sentence root-cause summaries injected into retry prompts so the main agent doesn't repeat mistakes
  • Deliverable summarization — purposeful summaries at scope boundaries instead of blind truncation
  • Context compaction — summarizes older messages before the main model ever runs again
  • Decision tree evaluation — LLMNode questions cost 0-2 intern calls instead of burning main model context on routing logic
  • Lesson extraction — compact statements from flagged learning experiences
  • Two-pass structured output — main model generates in thinking mode, intern does the JSON pass so reasoning isn't constrained
  • Status companion — answers user questions while the main agent is busy, routing through an embedding-classified decision tree and posting notes to a shared blackboard the main agent reads when it returns
The Bots · Pure CPU workers

The system's subconscious.

Bots run on schedules, monitor data sources, detect anomalies, and only escalate to an LLM when something genuinely needs language understanding. A bot cycle (perceive → evaluate → act) runs entirely on CPU — HTTP requests, file reads, embedding comparisons (~0.1ms each), threshold checks, regex patterns. The system can run dozens of bot cycles per minute without touching the main model.

Bots are assembled, not programmed. The factory combines reusable templates:

  • 14 perception types: perceive_web, perceive_api, perceive_rss, perceive_file, perceive_kb, perceive_partition, perceive_observation, perceive_kernel, perceive_diff, perceive_multi, perceive_port, and more
  • 18 action types: act_record, act_alert, act_kb, act_escalate, act_chain, act_ingest, act_spawn_bot, act_displace, act_emit, act_llm, and more
  • Scoped work containers — parent-child scopes let agents spawn sub-work without polluting the parent's reasoning
  • Price watchers, channel monitors, KB gap scanners — all from the same primitives

The dreaming engine itself is a scoped agent: during idle time it operates in a System Self-Improvement scope, spawning child scopes for scope health review, capability audits, reflex optimization, KB gap analysis, and context compaction. The system uses the same machinery to improve itself that it uses to do anything else.

§ 06 · Novel Techniques

Mechanisms you won't find
anywhere else.

leOS borrows mathematics from character animation, cosmology, the demoscene, neuroscience, and video compression — and applies it directly to the embedding medium. These aren't metaphors. They're the same math on different data.

VSA · Turing-complete

Vector symbolic algebra

Three primitives — bundling (addition), binding (circular convolution via FFT, O(d log d)), and permutation (cyclic shift) — form a Turing-complete computing framework (Kleyko et al., Proc. IEEE, 2022). The same three ops compose sets, sequences, trees, and graphs into a single fixed-width vector.
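The three primitives in numpy, using Plate-style holographic vectors (the dimension and the involution-based unbinding are standard choices from the literature, not leOS internals):

```python
import numpy as np

d = 4096
rng = np.random.default_rng(4)

def hv():
    v = rng.normal(size=d)
    return v / np.linalg.norm(v)

def bundle(*vs):                 # superposition: similar to every input
    s = np.sum(vs, axis=0)
    return s / np.linalg.norm(s)

def bind(a, b):                  # circular convolution via FFT, O(d log d)
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def unbind(a, c):                # approximate inverse: bind with involution
    return bind(np.roll(a[::-1], 1), c)

def permute(a, k=1):             # cyclic shift: protects sequence order
    return np.roll(a, k)

role, filler, other = hv(), hv(), hv()
pair = bind(role, filler)                      # dissimilar to both inputs
recovered = unbind(role, pair)                 # noisy copy of filler
sim = float(recovered @ filler) / np.linalg.norm(recovered)
baseline = float(other @ filler)               # chance-level similarity
bag = bundle(role, filler, other)              # one vector, three members
bag_sim = float(bag @ role)                    # still recognisably "role"
```

`sim` lands around 0.7 while `baseline` sits near zero: binding is invertible enough to decode, bundling keeps members queryable, and permutation keeps sequences from collapsing into sets.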

Displacement Codec · H.264 for cognition

Trajectories compress like video

Every task-to-response is recorded as a tangent vector on the hypersphere. Similar trajectories compress into shared I-frames, P-frames, and B-frames. The codec stores the pattern of transformation, not the output. Reconstructing a response costs a vector lookup.

Reflex Arc · Conformal bounds

The LLM stops running

When enough consistent displacements accumulate in a region (5+ by default), the reflex engine fires cached responses with conformal confidence bounds. Familiar patterns bypass the LLM entirely and replay from geometric cache in microseconds.

SDF Regions · from the demoscene

Semantic signed distance fields

Named ellipsoidal regions in embedding space define semantic boundaries using signed distance field math. Union is min(a,b), intersection is max(a,b), subtraction is max(a,−b) — arbitrarily complex semantic filters from trivial operations. The gradient gives a free "direction to nearest boundary" vector.
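The boolean algebra is three one-liners. A sketch with an illustrative axis-aligned ellipsoid distance (a true ellipsoid SDF has no simple closed form; this normalized approximation keeps the sign, and thus the region membership, correct):

```python
import numpy as np

def sdf_ellipsoid(p, center, radii):
    """Approximate signed distance: negative inside, positive outside."""
    k = np.linalg.norm((p - center) / radii)
    return (k - 1.0) * radii.min()

# CSG on semantic regions, straight from the demoscene playbook:
union        = lambda a, b: min(a, b)
intersection = lambda a, b: max(a, b)
subtract     = lambda a, b: max(a, -b)

p = np.zeros(768)                             # a query point in embedding space
center_b = np.zeros(768); center_b[0] = 3.0
radii = np.ones(768)

a = sdf_ellipsoid(p, np.zeros(768), radii)    # p is inside region A
b = sdf_ellipsoid(p, center_b, radii)         # p is outside region B
in_union     = union(a, b) < 0                # True
in_intersect = intersection(a, b) < 0         # False
in_a_minus_b = subtract(a, b) < 0             # True
```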

Gravitational Lens · Barnes-Hut

Queries bend toward competence

Dense SDF regions deflect nearby queries toward them, like light bending around a galaxy. Implemented with the Barnes-Hut tree — the same O(n log n) algorithm used for galactic N-body simulation. Frequently-used vectors exert more pull over time.

Holographic Cache · HRR

Many keys in one vector

Circular convolution stores multiple key-value pairs in one fixed-width vector: record = k₁⊗v₁ + k₂⊗v₂ + … + kₙ⊗vₙ. Retrieve with v_i ≈ k_i† ⊗ record. Based on Plate's Holographic Reduced Representations. Noise after 10-20 compositions is handled by the cleanup memory — error correction analogous to digital systems.
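A sketch of the store/retrieve cycle with three key-value pairs and a cleanup memory (the slot names, labels, and dimension are illustrative):

```python
import numpy as np

d = 4096
rng = np.random.default_rng(5)

def hv():
    v = rng.normal(size=d)
    return v / np.linalg.norm(v)

def bind(a, b):                        # circular convolution via FFT
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def inv(a):                            # k-dagger: approximate inverse
    return np.roll(a[::-1], 1)

keys    = {name: hv() for name in ("user", "task", "result")}
fillers = {name: hv() for name in ("alice", "search", "cache_hit")}
pairs   = {"user": "alice", "task": "search", "result": "cache_hit"}

# One fixed-width vector holds every pair: record = sum of k_i (x) v_i.
record = sum(bind(keys[k], fillers[v]) for k, v in pairs.items())

def retrieve(key):
    noisy = bind(inv(keys[key]), record)       # v_i plus crosstalk noise
    # Cleanup memory: snap the noisy estimate to the nearest known vector.
    return max(fillers, key=lambda f: float(noisy @ fillers[f]))
```

`retrieve("task")` decodes cleanly despite the superposition, because the crosstalk from the other two pairs is near-orthogonal noise the cleanup step removes.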

Living Medium · mycelium model

A field that grows where you work

An information-density field modeled on mycorrhizal networks. Grows toward areas of activity via success-density feedback. Prunes neglected regions. Hub vectors become knowledge redistributors (inspired by Simard's mother tree research on scale-free mycorrhizal topology).

Residue Arithmetic · CRT

Integer math without an ALU

Via the Chinese Remainder Theorem. Pick coprime moduli (e.g. 7, 11, 13, 17, 19, 23 — product ≈ 7.4M), assign random digit vectors, BIND them. Addition becomes binding. Comparison becomes cosine similarity. Integer math up to ~7.4M using only vector ops the embedding hardware already supports. Solves subset-sum via resonator networks.
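The arithmetic skeleton in plain integers. In leOS each residue digit would be a random vector and the per-digit addition below would become a BIND; this sketch shows only the number theory underneath:

```python
from math import prod

moduli = (7, 11, 13, 17, 19, 23)             # pairwise coprime

def encode(x):
    return tuple(x % m for m in moduli)       # one independent digit per modulus

def add(a, b):                                # carry-free, fully parallel
    return tuple((ai + bi) % m for ai, bi, m in zip(a, b, moduli))

def decode(residues):                         # Chinese Remainder Theorem
    M = prod(moduli)
    total = sum(r * (M // m) * pow(M // m, -1, m)
                for r, m in zip(residues, moduli))
    return total % M

result = decode(add(encode(123_456), encode(654_321)))   # = 777_777
```

No digit ever sees another digit during addition, which is exactly why the vector version parallelises into independent binds.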

Resonator Network · factorization

Pulling bindings apart

Given a composite vector s = x₁ ⊗ x₂ ⊗ … ⊗ xₖ and candidate codebooks, each factor estimate iteratively updates until convergence in 5-50 iterations. The inverse of VSA binding — the mechanism behind NP-hard search inside embedding space.

Predictive Coding · Friston

System 1 & System 2

Based on Karl Friston's active inference framework. A lightweight linear predictor estimates the expected output before running any agent. Confident → use prediction directly (System 1, fast). Uncertain → full LLM runs (System 2). Target ratio: ~80% of routine tasks handled without LLM inference.

Inception Hierarchy · MRL

Multi-resolution views

Same vector queried at multiple Matryoshka scales. 32d shows broad regions ("work", "media"), 128d shows subregions ("project notes", "code"), 1024d shows individual documents. Continuous landscape. Zooming costs nothing — same vector under different projections.
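Zooming really is free because it is the same stored vector under truncation. A minimal sketch of the standard MRL recipe (truncate the leading components, renormalize):

```python
import numpy as np

def at_scale(vec, dim):
    """View a Matryoshka-trained embedding at a coarser resolution."""
    # MRL training packs coarse structure into the leading components,
    # so truncate-and-renormalize yields a valid lower-dimensional embedding.
    v = vec[:dim]
    return v / np.linalg.norm(v)

rng = np.random.default_rng(6)
doc = rng.normal(size=1024)                       # one stored vector
region    = at_scale(doc, 32)                     # broad regions
subregion = at_scale(doc, 128)                    # subregions
exact     = at_scale(doc, 1024)                   # individual documents
```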

Activation Steering · compiled geometry

Steering vectors replace prompts

Run the intern on contrastive example pairs, compute the mean hidden-state difference, normalize. At inference, inject via forward hook: hidden += α · steering. One tensor addition per token — microseconds. Replaces fragile prompt engineering with compiled geometric subroutines.
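The whole mechanism is a mean difference and one addition. A numpy sketch with synthetic hidden states standing in for the intern's activations (the contrast offsets, dimension, and α are invented for illustration; in the real system the addition runs inside a forward hook):

```python
import numpy as np

def build_steering_vector(pos_states, neg_states):
    """Mean hidden-state difference over contrastive example pairs."""
    delta = pos_states.mean(axis=0) - neg_states.mean(axis=0)
    return delta / np.linalg.norm(delta)

def apply_steering(hidden, steering, alpha=4.0):
    # One tensor addition per token: hidden += alpha * steering.
    return hidden + alpha * steering

rng = np.random.default_rng(7)
pos = rng.normal(size=(32, 1024)) + 0.5   # e.g. activations on "terse" examples
neg = rng.normal(size=(32, 1024)) - 0.5   # e.g. activations on "rambling" ones
steer = build_steering_vector(pos, neg)
steered = apply_steering(rng.normal(size=1024), steer)
```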

Generation Probe · read the mind

Route before the model speaks

Before the intern generates a single token, a linear probe reads its hidden state after prefill and predicts REFLEX (cache hit), ASSIST (intern handles it), or ESCALATE (main model). Trained online from accumulated outcomes. Reaches 92%+ accuracy with use.

Void Detection · knowledge gaps

Maps of what isn't known

The KB void map probes the space between knowledge clusters. "Know Python, know async, but no article on Python async" is a void. When frequency crosses a threshold, the dreaming engine autonomously researches and fills these gaps during idle time.

Scopes · ephemeral context

Work containers with their own memory

Parent and child scopes let agents spawn sub-work without polluting the parent's reasoning. Only deliverables cross scope boundaries. The context assembler builds agent prompts from eight priority tiers, each with its own token budget.

Dreaming Engine · idle consolidation

Sleep for the substrate

Runs during idle time. Consolidates the displacement codec, runs void detection, grows and prunes the living medium, compacts stale scopes, audits bones, renders deferred thought monologues, generates reflections from unprocessed learning. The system uses itself to improve itself.

§ 07 · Drift Detection

Catching hallucination
with astronomy.

leOS has a quality-control system that catches agents hallucinating, looping, or producing shallow non-answers — without making a single LLM call. Everything is pure vector geometry on the unit hypersphere. The core concept is borrowed from cosmological observation.

→ diverging
Redshift

The response is receding.

The displacement vector — the tangent from task embedding to response embedding, computed via the logarithmic map — is unusually long compared to what the neighborhood predicts. In astronomy, redshift means an object is moving away from the observer.

In leOS, redshift means the response is semantically receding from the task. Drift, off-topic wander, confabulation, hallucination. The agent's mouth is working but it's answering a different question.

← converging
Blueshift

The response is collapsing in.

The displacement is suspiciously short, or task-response cosine similarity exceeds 0.92. In astronomy, blueshift means an object is approaching.

In leOS, blueshift means the response is echoing the task back in different words. A non-answer like "I'll do that!" or "Great question — let me think about it." Catches the failure mode of appearing to engage without producing output.

For each response, the detector queries the displacement log for the K most similar past tasks and computes the mean and standard deviation of their displacement magnitudes. If the actual displacement exceeds the prediction by more than 1.8σ: redshift. Below the prediction by 1.5σ: blueshift. Fewer than 3 similar past tasks: void — unexplored territory. Pairwise cosine similarity above 0.85 across the last N responses flags a semantic loop even when the text differs.
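The full detector is a logarithmic map plus two z-score thresholds. A sketch using the thresholds quoted above, with the neighborhood query replaced by a literal list of past displacement magnitudes:

```python
import numpy as np

def log_map(p, q):
    """Tangent vector at p pointing toward q on the unit hypersphere."""
    cos_t = float(np.clip(p @ q, -1.0, 1.0))
    theta = np.arccos(cos_t)
    if theta < 1e-9:
        return np.zeros_like(p)
    return (theta / np.sin(theta)) * (q - cos_t * p)

def classify(task, response, past_magnitudes,
             red_sigma=1.8, blue_sigma=1.5, min_neighbors=3):
    if len(past_magnitudes) < min_neighbors:
        return "void"                                  # unexplored territory
    mu, sd = np.mean(past_magnitudes), np.std(past_magnitudes) + 1e-9
    mag = np.linalg.norm(log_map(task, response))
    if mag > mu + red_sigma * sd:
        return "redshift"                              # receding from the task
    if mag < mu - blue_sigma * sd:
        return "blueshift"                             # echoing the task back
    return "nominal"

rng = np.random.default_rng(8)
unit = lambda v: v / np.linalg.norm(v)
task = unit(rng.normal(size=768))
echo = unit(task + 0.01 * rng.normal(size=768))        # a near-verbatim reply
label = classify(task, echo, past_magnitudes=[0.8, 0.9, 1.0, 1.1])
```

The echo's displacement is far shorter than the neighborhood predicts, so it classifies as blueshift; with no similar past tasks the same call returns void.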

The drift detector replaces LLM-based quality checking with pure math. It runs on every agent response automatically and costs zero tokens. The metaphor also drives the emotion parameters for voice synthesis — redshift produces uncertain delivery, convergence produces calm confidence, void produces a contemplative hush.

§ 08 · Voice & Thought Canvas

Agents narrate their work.
In their own voice.

Drift detection, voice synthesis, visual rendering, and the knowledge base all connect into a single learning loop. A flagged learning experience becomes a narrated thought video. The video gets triple-embedded (vision, audio, text) and stored as a searchable KB article. Future agents find it by meaning and learn from past mistakes without anyone writing documentation.

ChatterboxTTS · zero-shot cloning

The voice.

ChatterboxTTS is a two-stage neural TTS (T3 autoregressive + S3Gen decoder). Voice cloning is zero-shot: provide 5-30 seconds of reference audio and the model matches timbre, pitch, and cadence. No fine-tuning. Reference audio can come from direct uploads, video extraction via ffmpeg, or URLs processed through yt-dlp.

Drift state drives emotion. The EmotionMapper converts the geometric drift classification into TTS parameters per line:

  • Redshift → uncertain: 0.85× speech rate, restrained exaggeration, pauses before speaking. Mild: [sniff]. Strong: [sigh].
  • Blueshift → excited: 1.15× rate, high exaggeration, flowing delivery. Mild: [chuckle]. Strong: [laugh].
  • Convergence → confident: normal pace, pauses after statements. Strong: [clears throat].
  • Void → contemplative: 0.75× rate, intimate exaggeration, pauses both sides. Strong: [gasp].

The paralinguistic tags are rendered by the same voice model that produces the speech — a [sigh] during redshift sounds like a real sigh from the speaker. Voice modulation also adjusts the TTS sampling itself: blueshift lowers min_p (more creative output), redshift raises it (more stable). The voice isn't just speaking differently — the model is generating differently. All output is watermarked with resemble-perth as AI-generated.

Thought Canvas · 2D Gaussian splats

The visual workspace.

The thought canvas is a 256×224 pixel numpy array (deliberately SNES-era resolution — the visual output is a byproduct of computation, not the point of it) where agents render 2D Gaussian splats while they work. Each splat is 8 floats: position, scale, rotation, color, opacity.

The renderer uses accumulated summation — for each pixel, the color is the sum of all splat contributions weighted by their gaussian falloff. This is order-independent: no z-sorting pass. Hundreds of splats at 256×224 render in single-digit milliseconds on CPU with numpy vectorization. The /thought page streams it live.
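The accumulation loop fits in one vectorized function. This sketch unpacks each splat as nine explicit values (position, anisotropic scale, rotation, RGB, opacity) for readability; the system's own 8-float packing may differ:

```python
import numpy as np

H, W = 224, 256                         # deliberately SNES-era resolution

def render(splats):
    """Accumulated summation: order-independent, so no z-sorting pass."""
    ys, xs = np.mgrid[0:H, 0:W].astype(float)
    canvas = np.zeros((H, W, 3))
    for x, y, sx, sy, rot, r, g, b, alpha in splats:
        dx, dy = xs - x, ys - y
        c, s = np.cos(rot), np.sin(rot)
        u = c * dx + s * dy                          # rotate into splat frame
        v = -s * dx + c * dy
        falloff = np.exp(-0.5 * ((u / sx) ** 2 + (v / sy) ** 2))
        canvas += alpha * falloff[..., None] * np.array([r, g, b])
    return np.clip(canvas, 0.0, 1.0)

splats = [
    (128, 112, 40, 20, 0.5, 1.0, 0.3, 0.1, 0.8),     # a broad warm stroke
    (60,  50,  15, 15, 0.0, 0.1, 0.4, 1.0, 0.6),     # a small cool accent
]
frame = render(splats)                               # (224, 256, 3) in [0, 1]
```

Because every splat simply adds its falloff-weighted color, splats can arrive in any order and the loop vectorizes per pixel across the whole canvas.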

When the system ingests images, the SplatFitter decomposes them into splat representations — iterative optimization fits the gaussian parameters to a target image, then stores the parameters alongside the image's embedding. Over time this builds a learned mapping from concept-space to splat-space. An agent wanting to visualize a concept searches this cache for the nearest match, renders splats, embeds the result, and refines via perceptual feedback from nomic-vision. The system learns to draw by practicing.

The MonologueRenderer combines it all: canvas frames + ChatterboxTTS audio + EmotionMapper params → composite MP4 → triple-embed (vision + audio + text) → knowledge base article. Cross-modal search retrieves thought videos by query in any modality.

§ 09 · Media Ingest

Every tool.
On every asset.

The media pipeline runs every applicable analyzer on incoming media and lands the results in embedding space. The philosophy: you don't know in advance what you'll want to search for, so extract everything, embed everything, and let the geometry sort out relevance later.

Images

Ten passes, one asset.

Every incoming image goes through every tool that might produce useful signal:

  • Thumbnail generation + metadata extraction
  • Universal upscaling (small images benefit all downstream ML)
  • YOLO classification — top-5 predictions
  • YOLO object detection — bounding boxes
  • YOLO segmentation — pixel-level masks with area coverage
  • OpenCV face detection
  • Tesseract OCR with smart upscaling
  • Combined description generation
  • nomic-vision embedding (768d, cross-modal with text)
  • nomic-text embedding of the description
  • ImageBind embedding (1024d, cross-space divergence measured)
  • Auto-tagging from every analysis result
Videos · Audio

Transcribed, keyframed, embedded per-frame.

Videos: ffprobe metadata, multi-strategy keyframe extraction (scene-detect, low-threshold, timed fallback) with perceptual deduplication up to 30 frames, Whisper speech transcription, audio extraction. The entire image pipeline then runs on every extracted keyframe. Audio spectrograms via matplotlib.

Audio: metadata, Whisper transcription, spectrogram, ImageBind audio embedding.

The original video file is deleted after keyframe extraction to save disk — all the information survives in the embeddings and analysis records. A 2-hour video becomes 30 embedded keyframes, a full transcript, and an audio vector. All of it is cross-searchable.

All processing runs through a single-worker job queue so simultaneous submissions don't step on each other. YouTube and TikTok both route through yt-dlp automatically. The same MEDIA_INGEST kernel instruction handles URLs, files, streams, and uploads.

What this unlocks

Because every modality lands in a shared space, questions that usually require three different tools become one query. "Find the video clip where someone is explaining SDR with a red flag visible in frame" runs in a single cross-modal search: the text query embeds, the 30 keyframes per video embed, the transcripts embed, the audio embeds, and cosine similarity does the rest. No "video search API" required.

§ 10 · The Membrane

A semantic boundary,
not an API gateway.

The temptation is to bolt a traditional API gateway onto leOS — a collection of REST endpoints that external things POST to. We didn't. The membrane treats every entry point as an embedding-space citizen with an intent vector, a topic region, and a subscription fan-out. Data arriving through the membrane gets perceived, embedded, evaluated, and published to whoever is listening in the correct semantic neighborhood.

This works equally well for an IoT temperature reading processed in milliseconds and a 50 GB FITS astronomy catalog processed over hours, streamed incrementally, and resumable across server restarts. The difference is the processing path, not the model.

Real-time ports

The small-data side.

Create a named port via POST /ports. Every port has an intent vector derived from its name + description, which lets agents and external systems discover it semantically — "what can leOS accept about genomics?" returns matching ports.

Push data with POST /in/<port_id> or streaming with /in/<port_id>/stream. The ingestion pipeline runs normalize → perceive → embed → evaluate → store → notify on every item. Port config decides which field to embed, which reference text to compare against for signal detection, and what partition to land in.

The subscription bus (SSE, webhooks, internal queues) fans out every event to whoever's listening, with topic wildcards and filters.

Dataset job engine

The big-data side.

Upload a file, create a job, watch it stream results. The reader never loads the full file — peak memory is one chunk plus the already-loaded embedding models.

Supported formats: .csv, .tsv, .jsonl, .fasta, .npy, .npz, .fits (astropy), .h5/.hdf5 (h5py), .parquet (pyarrow), plain text.

Jobs are resumable across restarts with checkpoint recovery. Results stream live — you don't wait for a 50 GB file to finish before seeing the first flagged row. Every output port exposes /rows, /download, /csv, /embeddings (as a numpy file), and /search (semantic search within the results, while the job is still running).

Embedding strategies for scientific data

imagebind_audio

Treat any 1D array as a waveform

A stellar spectrum, a protein expression profile, a seismic trace, an EEG channel — all become 1024d vectors via ImageBind's audio encoder. They now coexist in the same space as sounds, images, and text. This is the strategy that makes searching spectra by plain English description possible.

imagebind_image

2D slices into the shared space

Telescope images, microscopy slides, medical scans, FITS image HDUs. Each 2D array → 1024d ImageBind vector. Cross-queryable with text, audio, and any other embedded modality.

numeric_direct

Pure numerical vectors

For gene expression profiles, physics simulation states, financial time-series — anywhere the numbers themselves are the semantic content. Random projection into 768d, normalized to the unit hypersphere. Preserves Euclidean structure while entering embedding space.
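A minimal sketch of the strategy, using a fixed-seed Johnson-Lindenstrauss-style projection (the seed, scaling, and 20,000-wide input are illustrative assumptions, not the system's actual parameters):

```python
import numpy as np

class NumericDirect:
    """Random-project raw numeric vectors onto the 768d unit hypersphere."""

    def __init__(self, in_dim, out_dim=768, seed=0):
        rng = np.random.default_rng(seed)
        # A fixed random matrix approximately preserves Euclidean structure
        # (Johnson-Lindenstrauss); a shared seed keeps embeddings comparable.
        self.P = rng.normal(size=(in_dim, out_dim)) / np.sqrt(out_dim)

    def __call__(self, x):
        v = np.asarray(x, dtype=float) @ self.P
        return v / np.linalg.norm(v)         # land on the unit hypersphere

embed = NumericDirect(in_dim=20_000)         # e.g. a gene expression profile
vec = embed(np.random.default_rng(9).normal(size=20_000))
```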

row_to_text

Template-based conversion

Catalog rows become descriptive sentences via user templates: "galaxy ra {ra} dec {dec} redshift {z:.2f} type {class}". Embeds via nomic-text. Mixed-type tabular data where the meaning of each row matters.

Scientific use cases

Astronomy

SDSS spectra anomaly scan

Upload a FITS catalog. Port embeds each spectrum's flux array via ImageBind audio encoder. The flag_void_region analysis bone identifies spectra landing in low-density regions of the embedding space — anomalous observations that don't cluster with known types. No labeled training set required.

POST /outputs/<id>/search "broadened emission lines consistent with AGN outflow"
Genomics

FASTA sequence sorting

Stream a FASTA file sequence-by-sequence. Embed via ImageBind audio (sequences as waveforms) or numeric_direct. Void detection flags unusual sequences automatically. A bot wakes up, escalates the most unusual ones to the main LLM for interpretation, and writes the interpretations back to the knowledge base.

Physics · Time-series

Simulation state tracking

HDF5 snapshots via h5py, NumPy arrays via mmap. Same object at different time steps produces a displacement vector that encodes what changed and in what semantic direction. After 10,000 snapshots, the reflex arc has learned which regions correspond to which regimes — subsequent runs route through cache.

Chemistry · Biology

Cross-modal correlation

Because everything lands in the same space, a molecular descriptor (numeric_direct) and a paper abstract (Qwen3) and a crystal structure image (imagebind_image) are all cosine-comparable. Find papers related to a compound you've never seen before — by its properties, not its name.

MCP protocol

Claude Desktop, Cursor, Windsurf

leOS exposes an MCP server with tools for store, search, cross-modal search, embed, status, media-ingest, and arbitrary kernel execution. Any MCP-compatible client can use leOS as a remote brain with full cross-modal semantic search and the learning substrate behind it.

Self-learning adapters

Learn APIs by description

API_LEARN ingests an API spec and stores it as a reusable adapter. The spec itself gets embedded, so agents find the right adapter semantically. leOS publishes its own API as a learnable spec — other leOS instances can learn it and call it.

Traditional tools — pandas, numpy, scikit-learn — treat a million-row dataset as a matrix to be filtered by explicit rules. The researcher must know what they're looking for before they look. leOS treats the same dataset as a million points in semantic space. Anomaly detection requires no labeled training set; low-density regions of the embedding space are unusual by definition. The second million-row dataset of the same type processes faster than the first. That isn't incremental improvement — it's a fundamentally different relationship between a researcher and their data.

§ 11 · The Learning Loop

Six mechanisms,
one growing substrate.

Every interaction feeds back into one or more of these systems. None require manual training. The substrate gets faster, smarter, and more knowledgeable automatically.

Mechanism 01

Displacement codec

Every task-to-response trajectory recorded as a tangent vector. Similar trajectories compress via I/P/B frames. The codec stores the pattern of transformation, not the output.

Mechanism 02

Reflex arc

Enough consistent displacements in a region graduate into cached responses with conformal confidence bounds. Familiar patterns bypass the LLM in microseconds.

Mechanism 03

Skeleton library

Successful bone chains become pre-validated patterns. FABRIK tries known skeletons first (similarity ≥ 0.80) before assembling anything new.

Mechanism 04

Tool-selection memory

Every session records which tools got used. Usage history feeds back as a 20% weight in scoring. The system learns that "PDF table extraction" reliably needs doc_query.

Mechanism 05

Skill assimilator

Tracks capability gaps — tasks where no tool scored well. Gap vectors cluster naturally. When a cluster crosses a frequency threshold, the system can generate a new tool from existing parts.

Mechanism 06

Self-extending instructions

When an LLM escalation succeeds on a novel task, the displacement compiler captures the trajectory and creates a permanent reflex entry. One successful call teaches the system to handle all similar tasks without LLM involvement.

§ 12 · Peer-Reviewed

Not speculative.
Grounded in published work.

The approach leOS takes is built on recent work across several fields. The mathematical proofs exist and the experimental results are published.

01

VSAs are Turing complete

Kleyko, Davies, Frady, Kanerva et al. — Proc. IEEE, 2022

Proved by emulating a (2,4) Turing machine and Rule 110 cellular automaton using only bundling, binding, and permutation. The emulated machine executed over 10⁹ error-free updates.

02

Coconut — reasoning without tokens

Meta AI — December 2024

LLM reasoning entirely in continuous latent space, outperforming chain-of-thought. Continuous thought vectors encode multiple alternative reasoning paths simultaneously — breadth-first search natively in continuous space.

03

nGPT — the hypersphere pays off

NVIDIA — 2024

Constraining representations to unit norm and expressing transformations as hypersphere displacements produced a 10× training speedup.

04

Neural Field Turing Machine

Malhotra et al. — August 2025

A differentiable, continuous-field computer. O(N) scaling with Turing completeness. Demonstrates cellular automata, PDE solving, and image refinement in one architecture.

05

Residue Hyperdimensional Computing

Kymn et al. — Neural Computation, Jan 2025

Unified residue number systems with HD vectors. Addition and multiplication as separate binding operators. Resources scale only logarithmically with numeric range. Solves NP-complete subset-sum via resonator networks.

06

RenderFormer — neural pipeline

Microsoft Research — SIGGRAPH 2025

First model to learn a complete graphics pipeline without ray tracing or rasterization. Scenes as triangle tokens. Rendering is pure attention over embeddings.

07

LatentMAS — shared-vector collaboration

Zou et al., Princeton / Stanford — Nov 2025

LLM agents collaborating through shared continuous latent space achieve 14.6% higher accuracy, 70-84% fewer tokens, 4× faster inference. The shared space is the coordination mechanism.

08

LangSplat — CLIP on Gaussians

CVPR 2024 · LangSplatV2 NeurIPS 2025

Compresses 512d CLIP embeddings to 3d per Gaussian via scene-specific autoencoder. LangSplatV2 reaches 476 FPS for feature splatting — a 42× speedup. Points in a 3D scene carry natural-language meaning.

§ 13 · Direction

Where we're taking it.

leOS is a single-developer project built in the open. The mission is to build the growing, adapting substrate that AI agents need to become genuinely capable — not a static tool library, but a living system that gets smarter, faster, and more knowledgeable with every interaction.

The current milestone is a clean public release: a first-time user should be able to install, launch, and complete real tasks — token reports, web research, scientific dataset ingestion, small app creation — without tripping on anything.

  1. Public release: one-command install with everything automated. No manual setup steps for end users.
  2. Platform, not tool: enough exposed surface area that developers in any industry can build their own systems on top of leOS — embedding-indexed registries, plug-in bones, hostable services, MCP access, and the full membrane protocol.
  3. Continuous learning: agents stopped only by detected problems (loops, stalls, drift) — never by time alone. Every task grows the skeleton library.
  4. Self-funding development: use LP fees from the associated Solana token to sustain solo development indefinitely, free of investor pressure or roadmap capture.
§ 14 · Funding

The leOS token.

A Solana-based community token whose liquidity-pool fees feed directly into ongoing development. Trading the token literally funds the next feature.

Launch details coming soon

Funded by fees.
Not by VCs.

Every swap on the official liquidity pool sends a share of fees to the development wallet. As long as people trade the token, leOS keeps shipping — without subscription fees, without investor control, without a roadmap dictated by anyone's exit plan.

The contract address, chart link, and launch date will be dropped on this page as soon as they're live.

Chain: Solana
Funding model: LP fees
Ticker: TBA
Contract: — coming soon —
§ 15 · Join in

Read the code.
Watch it grow.

leOS is open source and developed publicly. Clone the repo, run it locally, and watch an AI system that actually learns from every interaction.