Best Open Source Parser Libraries

A curated list of the most popular GitHub repositories tagged "parser". Select any project to visualize its architecture and explore its codebase with RepoMind's AI engine.

#1 opendatalab/MinerU

Transforms complex documents like PDFs into LLM-ready Markdown/JSON for your agentic workflows.

54,581 stars · Python

#2 markedjs/marked

A Markdown parser and compiler. Built for speed.

36,612 stars · TypeScript

#3 swc-project/swc

Rust-based platform for the Web

33,237 stars · Rust

#4 cheeriojs/cheerio

The fast, flexible, and elegant library for parsing and manipulating HTML and XML.

30,117 stars · TypeScript

#5 postcss/postcss

Transforming styles with JS plugins

28,960 stars · TypeScript

#6 tree-sitter/tree-sitter

An incremental parsing system for programming tools

23,858 stars · Rust

#7 vectordotdev/vector

A high-performance observability data pipeline.

21,350 stars · Rust

#8 oxc-project/oxc

⚓ A collection of high-performance JavaScript tools.

19,122 stars · Rust

#9 nikic/PHP-Parser

A PHP parser written in PHP

17,409 stars · PHP

#10 json-iterator/go

A high-performance, 100% compatible drop-in replacement for "encoding/json"

13,931 stars · Go

#11 terser/terser

🗜 JavaScript parser, mangler and compressor toolkit for ES6+

9,240 stars · JavaScript

#12 tobymao/sqlglot

Python SQL Parser and Transpiler

8,947 stars · Python

#13 bytedance/Dolphin

The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL 2025.

8,827 stars · Python

#14 pdfminer/pdfminer.six

Community maintained fork of pdfminer - we fathom PDF

6,904 stars · Python

#15 boa-dev/boa

Boa is an embeddable JavaScript engine written in Rust.

6,898 stars · Rust

#16 fkling/astexplorer

A web tool to explore the ASTs generated by various parsers.

6,501 stars · JavaScript