Best Open Source Parser Libraries
A curated list of the most popular GitHub repositories tagged "parser".
#1 opendatalab/MinerU
Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.
#2 markedjs/marked
A markdown parser and compiler. Built for speed.
#3 swc-project/swc
Rust-based platform for the Web
#4 cheeriojs/cheerio
The fast, flexible, and elegant library for parsing and manipulating HTML and XML.
#5 postcss/postcss
Transforming styles with JS plugins
#6 oxc-project/oxc
⚓ A collection of high-performance JavaScript tools.
#7 nikic/PHP-Parser
A PHP parser written in PHP
#8 erusev/parsedown
Better Markdown Parser in PHP
#9 terser/terser
🗜 JavaScript parser, mangler and compressor toolkit for ES6+
#10 tobymao/sqlglot
Python SQL Parser and Transpiler
#11 bytedance/Dolphin
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
#12 mvdan/sh
A shell parser, formatter, and interpreter with bash and zsh support; includes shfmt
#13 microsoft/tsdoc
A doc comment standard for TypeScript
#14 globalizejs/globalize
A JavaScript library for internationalization and localization that leverages the official Unicode CLDR JSON data
#15 fb55/htmlparser2
The fast & forgiving HTML and XML parser
#16 NaturalIntelligence/fast-xml-parser
Validate XML, Parse XML and Build XML rapidly without C/C++ based libraries and no callback.
#17 carthage-software/mago
Mago is a toolchain for PHP that aims to provide a set of tools to help developers write better code.
#18 remarkablemark/html-react-parser
📝 HTML to React parser.
#19 ubugeeei/vize
Unofficial High-Performance Vue.js Toolchain in Rust
#20 kreuzberg-dev/tree-sitter-language-pack
Comprehensive tree-sitter grammar compilation with polyglot bindings: Rust, Python, Node.js, Go, Java, Ruby, Elixir, PHP, C#, WASM, and CLI. 305+ languages.
#21 csskit/csskit
Refreshing CSS