
lsdefine / GenericAgent

AI-powered PC agent loop for desktop automation and intelligent task execution

732 stars
124 forks
23 issues
Python, JavaScript, Batchfile

AI Architecture Analysis

This repository is indexed by RepoMind. By analyzing lsdefine/GenericAgent in our AI interface, you can instantly generate complete architecture diagrams, visualize control flows, and perform automated security audits across the entire codebase.

Our Agentic Context Augmented Generation (Agentic CAG) engine loads full source files into context on-demand, avoiding the fragmentation of traditional RAG systems. Ask questions about the architecture, dependencies, or specific features to see it in action.

To optimize performance, source files are loaded only when you start an analysis.

Embed this Badge

Showcase RepoMind's analysis directly in your repository's README.

[![Analyzed by RepoMind](https://img.shields.io/badge/Analyzed%20by-RepoMind-4F46E5?style=for-the-badge)](https://repomind.in/repo/lsdefine/GenericAgent)

Repository Overview (README excerpt)


🌟 Overview

**GenericAgent** is a minimal, self-evolving autonomous agent framework. Its core is just **~3,300 lines of code**. Through **7 atomic tools + a 92-line Agent Loop**, it grants any LLM system-level control over a local computer, covering browser, terminal, filesystem, keyboard/mouse input, screen vision, and mobile devices (ADB).

Its design philosophy: **don't preload skills; evolve them.** Every time GenericAgent solves a new task, it automatically crystallizes the execution path into a skill for direct reuse later. The longer you use it, the more skills accumulate, forming a skill tree that belongs entirely to you, grown from 3,300 lines of seed code.

> **🤖 Self-Bootstrap Proof**: Everything in this repository, from installing Git and running to every commit message, was completed autonomously by GenericAgent. The author never opened a terminal once.

📋 Core Features

- **Self-Evolving**: Automatically crystallizes each task into a skill. Capabilities grow with every use, forming your personal skill tree.
- **Minimal Architecture**: ~3,300 lines of core code. The Agent Loop is just 92 lines. No complex dependencies, zero deployment overhead.
- **Strong Execution**: Injects into a real browser (preserving login sessions). 7 atomic tools take direct control of the system.
- **High Compatibility**: Supports Claude / Gemini / Kimi and other major models. Cross-platform.

🧬 Self-Evolution Mechanism

This is what fundamentally distinguishes GenericAgent from every other agent framework.
| What you say | What the agent does the first time | Every time after |
|---|---|---|
| *"Read my WeChat messages"* | Install deps → reverse DB → write read script → save skill | **one-line invoke** |
| *"Monitor stocks and alert me"* | Install mootdx → build selection flow → configure cron → save skill | **one-line start** |
| *"Send this file via Gmail"* | Configure OAuth → write send script → save skill | **ready to use** |

After a few weeks, your agent instance will have a skill tree no one else in the world has, all grown from 3,300 lines of seed code.

🎯 Demo Showcase

| 🧋 Food Delivery Order | 📈 Quantitative Stock Screening |
|:---:|:---:|
| *"Order me a milk tea"*: navigates the delivery app, selects items, and completes checkout automatically. | *"Find GEM stocks with EXPMA golden cross, turnover > 5%"*: screens stocks with quantitative conditions. |

| 🌐 Autonomous Web Exploration | 💰 Expense Tracking | 💬 Batch Messaging |
|:---:|:---:|:---:|
| Autonomously browses and periodically summarizes web content. | *"Find expenses over ¥2K in the last 3 months"*: drives Alipay via ADB. | Sends bulk WeChat messages, fully driving the WeChat client. |

📅 Latest News

- **2026-03-23:** Support personal WeChat as a bot frontend
- **2026-03-10:** Released million-scale Skill Library
- **2026-03-08:** Released "Dintal Claw", a GenericAgent-powered government affairs bot
- **2026-03-01:** GenericAgent featured by Jiqizhixin (机器之心)
- **2026-01-11:** GenericAgent V1.0 public release

---

🚀 Quick Start

Method 1: Standard Installation

Method 2: Windows Portable Version (Recommended for beginners)

Download portable version (19MB, unzip and run)

Full guide: WELCOME_NEW_USER.md

Method 3: Android (Termux)

---

🤖 Bot Interfaces (Optional)

QQ Bot

Uses a WebSocket long connection; **no public webhook required**:

Add to :

> Create a bot at the QQ Open Platform to get AppID / AppSecret.

After the first message, user openid is logged in .

Lark (Feishu)

- **Inbound support**: text, rich text post, images, files, audio, media, interactive cards / share cards
- **Outbound support**: streaming progress cards, image replies, file / media replies
- **Vision model**: Images are sent as true multimodal input to OpenAI Vision-compatible backends on the first turn

Full setup: assets/SETUP_FEISHU.md

WeCom (Enterprise WeChat)

DingTalk

Telegram Bot

📊 Comparison with Similar Tools

| Feature | GenericAgent | OpenClaw | Claude Code |
|------|:---:|:---:|:---:|
| **Codebase** | ~3,300 lines | ~530,000 lines | Open-sourced (large) |
| **Deployment** | + API Key | Multi-service orchestration | CLI + subscription |
| **Browser Control** | Real browser (session preserved) | Sandbox / headless browser | Via MCP plugin |
| **OS Control** | Mouse/kbd, vision, ADB | Multi-agent delegation | File + terminal |
| **Self-Evolution** | Autonomous skill growth | Plugin ecosystem | Stateless between sessions |
| **Out of the Box** | 10 .py files + 5 skills | Hundreds of modules | Rich CLI toolset |

🧠 How It Works

GenericAgent accomplishes complex tasks through **Layered Memory × Minimal Toolset × Autonomous Execution Loop**, continuously accumulating experience during execution.

1️⃣ **Layered Memory System**

> _Memory crystallizes throughout task execution, letting the agent build stable, efficient working patterns over time._

- **L0 (Meta Rules)**: Core behavioral rules and system constraints of the agent
- **L2 (Global Facts)**: Stable knowledge accumulated over long-term operation
- **L3 (Task Skills)**: Workflows for completing specific task types

2️⃣ **Autonomous Execution Loop**

> _Perceive environment state → Task reasoning → Execute tools → Write experience to memory → Loop_

The entire core loop is just **92 lines of code** ( ).
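
The perceive → reason → execute → write-to-memory cycle described above can be illustrated with a small sketch. This is not the repository's 92-line loop; `Memory`, `agent_loop`, `scripted_decide`, and the `shout` tool are hypothetical names chosen for illustration, and the `decide` callable stands in for a real LLM call.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Memory:
    """Hypothetical stand-in for the layered memory described above."""
    meta_rules: list[str] = field(default_factory=list)      # behavioral rules
    global_facts: list[str] = field(default_factory=list)    # stable knowledge
    task_skills: dict[str, list[str]] = field(default_factory=dict)  # workflows


def agent_loop(
    task: str,
    decide: Callable[[dict], tuple[str, Any]],
    tools: dict[str, Callable[[Any], Any]],
    memory: Memory,
    max_turns: int = 10,
) -> Any:
    """Run perceive -> reason -> execute until `decide` returns ("done", result)."""
    history: list[tuple[str, Any, Any]] = []
    for _ in range(max_turns):
        # Perceive: bundle everything the model is allowed to see this turn.
        state = {
            "task": task,
            "history": history,
            "known_skill": memory.task_skills.get(task),
        }
        # Reason: the model picks the next tool call (or declares completion).
        name, arg = decide(state)
        if name == "done":
            # Write experience to memory: keep the steps that solved the task.
            memory.task_skills[task] = [f"{n}({a!r})" for n, a, _ in history]
            return arg
        # Execute: unknown tools become an error result the model can react to.
        result = tools[name](arg) if name in tools else f"unknown tool: {name}"
        history.append((name, arg, result))
    raise RuntimeError("turn limit reached without finishing the task")


def scripted_decide(state: dict) -> tuple[str, Any]:
    """Deterministic placeholder for an LLM: one tool call, then finish."""
    if not state["history"]:
        return ("shout", "hello")
    return ("done", state["history"][-1][2])


if __name__ == "__main__":
    mem = Memory()
    print(agent_loop("greet", scripted_decide, {"shout": str.upper}, mem))  # HELLO
    print(mem.task_skills)  # {'greet': ["shout('hello')"]}
```

Passing `decide` and `tools` in as plain callables keeps the loop itself free of any model or tool specifics, which is one way a loop of this kind can stay short.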
3️⃣ **Minimal Toolset**

> _GenericAgent provides only **7 atomic tools**, forming the foundational capabilities for interacting with the outside world._

| Tool | Function |
|------|------|
| | Execute arbitrary code |
| | Read files |
| | Write files |
| | Patch / modify files |
| | Perceive web content |
| | Control browser behavior |
| | Human-in-the-loop confirmation |

> Additionally, 2 **memory management tools** ( , ) allow the agent to persist context and accumulate experience across sessions.

4️⃣ **Capability Extension Mechanism**

> _Capable of dynamically creating new tools._

Via , GenericAgent can dynamically install Python packages, write new s…
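
As a rough illustration of how a small fixed toolset can be extended at runtime, the sketch below keeps tools in a name-to-function registry and adds a new one from generated source text. `ToolRegistry`, `register_from_source`, and the tool names `read_text` and `word_count` are hypothetical; the excerpt does not show GenericAgent's real tool names or its extension code, so treat this only as one possible shape.

```python
from typing import Any, Callable


class ToolRegistry:
    """Hypothetical name -> callable table for an agent's tools."""

    def __init__(self) -> None:
        self._tools: dict[str, Callable[..., Any]] = {}

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        if name in self._tools:
            raise ValueError(f"tool already registered: {name}")
        self._tools[name] = fn

    def call(self, name: str, *args: Any, **kwargs: Any) -> Any:
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](*args, **kwargs)

    def names(self) -> list[str]:
        return sorted(self._tools)


def register_from_source(registry: ToolRegistry, name: str, source: str) -> None:
    """Define a function from source text and register it under `name`.

    Assumption: `source` defines a function called `name`. exec() runs
    arbitrary code, so this is only appropriate for trusted input.
    """
    namespace: dict[str, Any] = {}
    exec(source, namespace)
    fn = namespace.get(name)
    if not callable(fn):
        raise ValueError(f"source does not define a callable named {name!r}")
    registry.register(name, fn)


if __name__ == "__main__":
    registry = ToolRegistry()
    registry.register("read_text", lambda text: text)
    # A tool the agent "wrote" during a task, stored as source text.
    new_tool = "def word_count(text):\n    return len(text.split())\n"
    register_from_source(registry, "word_count", new_tool)
    print(registry.names())                            # ['read_text', 'word_count']
    print(registry.call("word_count", "one two three"))  # 3
```

Once a generated tool is registered, later tasks can call it by name instead of regenerating it, which mirrors the "one-line invoke" reuse the README describes.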