
dchisholm125 / graph-oriented-generation

Graph-Oriented Generation (GOG)

55 stars
8 forks
8 issues
Python

AI Architecture Analysis

This repository is indexed by RepoMind. By analyzing dchisholm125/graph-oriented-generation in our AI interface, you can instantly generate complete architecture diagrams, visualize control flows, and perform automated security audits across the entire codebase.

Our Agentic Context Augmented Generation (Agentic CAG) engine loads full source files into context on-demand, avoiding the fragmentation of traditional RAG systems. Ask questions about the architecture, dependencies, or specific features to see it in action.

To optimize performance, source files are loaded only when you start an analysis.
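The on-demand loading idea can be sketched as a small lazy loader. This is an illustrative sketch only, with hypothetical names — not RepoMind's actual implementation: files are registered cheaply up front, and full contents are read only the first time an analysis asks for them.

```python
from pathlib import Path

class LazyRepoContext:
    """Illustrative sketch of on-demand full-file context loading.

    Files are registered without any I/O; the complete source text is
    read only when an analysis first requests it, in contrast to
    pre-chunked RAG retrieval. All names here are hypothetical.
    """

    def __init__(self):
        self._paths = {}   # module name -> Path (registered, not read)
        self._loaded = {}  # module name -> full source text

    def register(self, name, path):
        # Cheap: record the location only, no file read yet.
        self._paths[name] = Path(path)

    def get(self, name):
        # Load the complete file on first access, then cache it.
        if name not in self._loaded:
            self._loaded[name] = self._paths[name].read_text()
        return self._loaded[name]

    @property
    def loaded_count(self):
        return len(self._loaded)
```

Because each file enters the context whole, the model sees complete definitions rather than similarity-ranked fragments.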

Embed this Badge

Showcase RepoMind's analysis directly in your repository's README.

[![Analyzed by RepoMind](https://img.shields.io/badge/Analyzed%20by-RepoMind-4F46E5?style=for-the-badge)](https://repomind.in/repo/dchisholm125/graph-oriented-generation)

Repository Overview (README excerpt)


# Graph-Oriented Generation & Symbolic Reasoning Membrane

An empirical research program into the structure of meaning

> **Active research.** Benchmarks are reproducible, results are real,
> and the research is ongoing. Contributions, challenges, and
> replications welcome via issues or pull requests.

---

## What this repository is

This repository documents two connected research programs.

**GOG (Graph-Oriented Generation)** replaces probabilistic vector retrieval with deterministic graph traversal over a codebase's actual import dependency structure. It is complete, benchmarked, and validated. The paper is available here:

**SRM (Symbolic Reasoning Membrane)** is the deeper investigation that GOG made possible. It asks: if structure can control a language model's output reliably, what is the minimum structure required? What are the atomic units of meaning that a language model recognizes, responds to, and can combine into richer concepts? Eighteen experiments later, that question has a partial and surprising answer. The full research paper is here:

---

## The core finding

Across four language model architectures — Qwen 2.5 (0.5B), Gemma 3 (1B), LLaMA 3.2 (1B), and SmolLM2 (360M) — we found consistent empirical evidence for a primitive layer underlying language model behavior.

Anna Wierzbicka proposed in the 1970s that all human languages share approximately 65 irreducible semantic concepts — WANT, KNOW, FEEL, GOOD, BAD, DO, HAPPEN — from which all other meaning is constructed. Cowen and Keltner identified 27 universal emotional states in 2017. We tested whether these primitives appear as measurable activation patterns in small language models.

They do. Specifically:

**The Layer 0a/0b distinction is real and architecture-independent.** Scaffolding primitives — SOMEONE, TIME, PLACE — produce abstract, relational responses. Content primitives — FEAR, GRIEF, JOY, ANGER, RELIEF, NOSTALGIA — produce phenomenological, embodied responses.
The activation gap between these two classes averaged +0.245 across all four models. The direction was consistent in every model tested.

**Primitive composition produces predictable Layer 1 concepts.** Eleven operator-seed combinations matched pre-registered predictions in three out of four model architectures:

| Combination | Predicted | Validated |
|-------------|-----------|-----------|
| KNOW + FEAR | dread / awareness | ✓ 3/4 models |
| FEEL + GRIEF | heartbreak / sorrow | ✓ 3/4 models |
| WANT + FEAR | anxiety / avoidance | ✓ 3/4 models |
| WANT + ANGER | ambition / revenge | ✓ 3/4 models |
| TIME + GRIEF | mourning / melancholy | ✓ 3/4 models |
| TIME + NOSTALGIA | memory / reminiscence | ✓ 3/4 models |
| TIME + RELIEF | healing / recovery | ✓ 3/4 models |
| WANT + GRIEF | longing / yearning | ✓ 3/4 models |
| WANT + NOSTALGIA | longing / regret | ✓ 3/4 models |
| FEEL + JOY | delight / bliss | ✓ 3/4 models |
| KNOW + NOSTALGIA | wisdom / reflection | ✓ 3/4 models |

**The scaling pattern has an implication.** The primitive activation gap is largest in the smallest model and narrows as model size increases — not because content primitives weaken, but because larger models develop richer phenomenological access to scaffolding primitives too. As language models scale, they appear to converge toward a more coherent internal representation of the primitive layer. This may partly explain why larger models reason better — they are closer to the atoms of meaning.

---

## Architecture

The SRM proposes a three-layer architecture:

Structure is deterministic. Language is emergent. The membrane carries structure into the language space and lets emergence do the rest.

This is not a trained system. It is a theoretical architecture grounded in eighteen empirical experiments. Building it is the next phase.

---

## GOG: The Foundation

GOG was the first indication that something deeper was possible.
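The deterministic-retrieval idea behind GOG can be sketched in a few lines. This is an illustrative reconstruction under assumptions, not the repository's actual implementation: given module sources, build the import dependency graph with Python's `ast` module, then select context by a stable breadth-first traversal from the target module — no embeddings, no similarity scores.

```python
import ast
from collections import deque

def import_graph(modules: dict) -> dict:
    """Build module -> imported-modules edges from source strings.

    `modules` maps module names to Python source text. Only imports of
    modules present in the map are kept, so the graph stays internal
    to the codebase being analyzed.
    """
    graph = {}
    for name, source in modules.items():
        deps = set()
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.Import):
                deps.update(alias.name for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                deps.add(node.module)
        graph[name] = sorted(deps & modules.keys())
    return graph

def gog_context(modules: dict, entry: str) -> list:
    """Deterministic BFS over the import graph from `entry`.

    Returns the modules whose full sources would be loaded as context,
    in a stable, reproducible traversal order.
    """
    graph = import_graph(modules)
    seen, order, queue = {entry}, [], deque([entry])
    while queue:
        current = queue.popleft()
        order.append(current)
        for dep in graph[current]:
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return order
```

With the traversal order in hand, the full text of each selected module would be concatenated into the model's context.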
GOG demonstrated that replacing natural language prompts with deterministic symbolic specifications dramatically improves correctness in small language models — a 0.5B-parameter model that fails completely on a reasoning task with a natural language prompt succeeds completely with a symbolic spec.

The key result:

| Tier | Input | Correctness | Time |
|------|-------|-------------|------|
| RAG | 53,137-token corpus + raw prompt | FAIL 2/5 | 5.71s |
| GOG | 6,323-token context + raw prompt | PARTIAL 4/5 | 11.63s |
| SRM | 6,323-token context + symbolic spec | **PASS 5/5** | **0.94s** |

The model did not fail because it could not write correct code. It failed because it could not reason about what to write. When the reasoning was done externally and passed in as structure, the language capability was sufficient.

GOG is complete and documented. The full paper, benchmark code, and reproduction instructions are in .

---

## Repository Structure

---

## Reproducing the SRM experiments

All experiments run locally via Ollama. No API keys are required. Results are saved to and . The living summary document accumulates findings across experiments.

---

## Reproducing the GOG benchmark

### Local LLM (Ollama)

### Cloud API (MiniMax)

Available environment variables for the Cloud API benchmark:

| Variable | Default | Description |
|----------|---------|-------------|
| | *(required)* | Your MiniMax API key |
| | | Model to benchmark ( , ) |
| | | API endpoint (OpenAI-compatible) |

Full instructions, including the cloud CLI benchmark, are in .

---

## Open questions

The primitive composition map covers 30 combinations out of hundreds possible. The mechanistic explanation for the Layer 0a/0b distinction is unknown. The full SRM pipeline has not been implemented. A trained membrane does not exist.

These are not gaps to apologize for. They are directions.

Specific questions we cannot pursue alone:

- Does the Layer 0a/0b distinction appear in mechanistic interpretability analysis of model internals?
- Does the activation gap scale predictably beyond 1B parameters?
- Do the same primitives drift to…