Best Open Source Parser Libraries

A curated list of the most popular GitHub repositories tagged with "parser". Select any project to visualize its architecture and dive into the codebase using RepoMind's AI engine.

#1 opendatalab/MinerU

Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.

56,399 · Python

#2 markedjs/marked

A markdown parser and compiler. Built for speed.

36,664 · TypeScript

#3 swc-project/swc

Rust-based platform for the Web

33,299 · Rust

#4 cheeriojs/cheerio

The fast, flexible, and elegant library for parsing and manipulating HTML and XML.

30,192 · TypeScript

#5 postcss/postcss

Transforming styles with JS plugins

28,974 · TypeScript

#6 oxc-project/oxc

⚓ A collection of high-performance JavaScript tools.

20,765 · Rust

#7 nikic/PHP-Parser

A PHP parser written in PHP

17,427 · PHP

#8 erusev/parsedown

Better Markdown Parser in PHP

15,021 · PHP

#9 terser/terser

🗜 JavaScript parser, mangler and compressor toolkit for ES6+

9,256 · JavaScript

#10 tobymao/sqlglot

Python SQL Parser and Transpiler

9,037 · Python

#11 bytedance/Dolphin

The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.

8,865 · Python

#12 mvdan/sh

A shell parser, formatter, and interpreter with bash and zsh support; includes shfmt

8,563 · Go

#13 microsoft/tsdoc

A doc comment standard for TypeScript

4,939 · TypeScript

#14 globalizejs/globalize

A JavaScript library for internationalization and localization that leverages the official Unicode CLDR JSON data

4,835 · JavaScript

#15 fb55/htmlparser2

The fast & forgiving HTML and XML parser

4,801 · TypeScript

#16 NaturalIntelligence/fast-xml-parser

Validate XML, Parse XML and Build XML rapidly without C/C++ based libraries and no callback.

3,076 · JavaScript

#17 carthage-software/mago

Mago is a toolchain for PHP that aims to provide a set of tools to help developers write better code.

3,018 · Rust

#18 remarkablemark/html-react-parser

📝 HTML to React parser.

2,401 · TypeScript

#19 ubugeeei/vize

Unofficial High-Performance Vue.js Toolchain in Rust

675 · Rust

#20 kreuzberg-dev/tree-sitter-language-pack

Comprehensive tree-sitter grammar compilation with polyglot bindings: Rust, Python, Node.js, Go, Java, Ruby, Elixir, PHP, C#, WASM, and CLI. 305+ languages.

334 · Rust

#21 csskit/csskit

Refreshing CSS

275 · Rust