
KindXiaoming / pykan

Kolmogorov Arnold Networks

16,207 stars
1,554 forks
264 issues
Jupyter Notebook · Python

AI Architecture Analysis

This repository is indexed by RepoMind. By analyzing KindXiaoming/pykan in our AI interface, you can instantly generate complete architecture diagrams, visualize control flows, and perform automated security audits across the entire codebase.

Our Agentic Context Augmented Generation (Agentic CAG) engine loads full source files into context on-demand, avoiding the fragmentation of traditional RAG systems. Ask questions about the architecture, dependencies, or specific features to see it in action.

Source files are only loaded when you start an analysis to optimize performance.

Embed this Badge

Showcase RepoMind's analysis directly in your repository's README.

[![Analyzed by RepoMind](https://img.shields.io/badge/Analyzed%20by-RepoMind-4F46E5?style=for-the-badge)](https://repomind.in/repo/KindXiaoming/pykan)

Repository Overview (README excerpt)


**Kolmogorov-Arnold Networks (KANs)**

This is the GitHub repo for the papers "KAN: Kolmogorov-Arnold Networks" and "KAN 2.0: Kolmogorov-Arnold Networks Meet Science". You may want to quickstart with hellokan, try more examples in tutorials, or read the documentation here.

Kolmogorov-Arnold Networks (KANs) are promising alternatives to Multi-Layer Perceptrons (MLPs). KANs have strong mathematical foundations just like MLPs: MLPs are based on the universal approximation theorem, while KANs are based on the Kolmogorov-Arnold representation theorem. KANs and MLPs are dual: KANs have activation functions on edges, while MLPs have activation functions on nodes. This simple change makes KANs better (sometimes much better!) than MLPs in terms of both model **accuracy** and **interpretability**. A quick intro to KANs is here.

**Installation**

Pykan can be installed via PyPI or directly from GitHub. The README covers pre-requisites, a developer setup, installation via GitHub, and installation via PyPI, as well as how to install the package requirements after activating a virtual environment and an optional Conda environment setup.

**Efficiency mode**

For many machine-learning users, if (1) you need to write the training loop yourself (instead of using `model.fit()`), and (2) you never use the symbolic branch, it is important to call `model.speed()` before training! Otherwise the symbolic branch stays on, which is very slow because the symbolic computations are not parallelized.

**Computation requirements**

Examples in the tutorials typically run on a single CPU in less than 10 minutes. All examples in the paper run on a single CPU in less than one day. Training KANs for PDEs is the most expensive and may take hours to days on a single CPU. We use CPUs to train our models because we carried out parameter sweeps (for both MLPs and KANs) to obtain Pareto frontiers; there are thousands of small models, which is why we use CPUs rather than GPUs.
Admittedly, our problem scales are smaller than typical machine-learning tasks, but they are typical for science-related tasks. If your task is large-scale, it is advisable to use GPUs.

**Documentation**

The documentation can be found here.

**Tutorials**

**Quickstart**: Get started with the hellokan.ipynb notebook. **More demos**: more notebook tutorials can be found in tutorials.

**Advice on hyperparameter tuning**

Many intuitions about MLPs and other networks may not transfer directly to KANs. So how can you tune the hyperparameters effectively? Here is my general advice, based on my experience with the problems reported in the paper. Since these problems are relatively small-scale and science-oriented, my advice may not suit your case, but I want to at least share my experience so that users have better clues about where to start and what to expect from hyperparameter tuning.

- Start from a simple setup: small KAN shape, small grid size, small data, no regularization. This is very different from the MLP literature, where people by default use large widths. For example, for a task with 5 inputs and 1 output, I would try the smallest shape that connects inputs to outputs. If it doesn't work, I would first gradually increase the width; if that still doesn't work, I would consider increasing the depth. You don't need to be this extreme if you have a better understanding of your task's complexity.
- Once acceptable performance is achieved, you can then try refining your KAN (to be more accurate or more interpretable).
- If you care about accuracy, try the grid extension technique (an example is here), but watch out for overfitting; see below.
- If you care about interpretability, try sparsifying the network with a sparsity penalty (`lamb`). It is also advisable to increase `lamb` gradually. After training with sparsification, plot the model; if you see some neurons that are obviously useless, you may call `prune()` to get the pruned model.
You can then further train (to encourage either accuracy or sparsity), or do symbolic regression.

- I also want to emphasize that accuracy and interpretability (and parameter efficiency) are not necessarily contradictory, e.g., Figure 2.3 in our paper. They can be positively correlated in some cases, but in other cases they may display a tradeoff. So it is good not to be greedy and to aim for one goal at a time. However, if you have a strong reason to believe that pruning (interpretability) can also help accuracy, you may want to plan ahead: even if your end goal is accuracy, push interpretability first.
- Once you get a quite good result, try increasing the data size and doing a final run, which should give you even better results!

Disclaimer: "try the simplest thing first" is the mindset of physicists, which could be personal/biased, but I find this mindset quite effective and it keeps things well-controlled for me. The reason I tend to start with a small dataset is to get faster feedback in the debugging stage (my initial implementation is slow, after all!). The hidden assumption is that a small dataset behaves qualitatively like a large one, which is not necessarily true in general, but has usually been true in the small-scale problems I have tried. To know whether your data is sufficient, see the next paragraph.

Another thing to keep in mind: constantly check whether your model is in the underfitting or overfitting regime. If there is a large gap between train and test losses, you probably want to increase the data or reduce the model. This is also why I like to start from simple models: make sure the model is first in the underfitting regime, then gradually expand into the "Goldilocks zone".
**Citation**

**Contact**

If you have any questions, please contact zmliu@mit.edu.

**Author's note**

I would like to thank everyone who's interested in KANs. When I designed KANs and wrote codes, I h…