cloudwego / sonic-rs

A fast Rust JSON library based on SIMD.

850 stars
59 forks
14 issues
Rust, Shell, C

Repository Overview (README excerpt)

sonic-rs

[![Build Status][actions-badge]][actions-url] [![codecov][codecov-badge]][codecov-url]

[actions-badge]: https://github.com/cloudwego/sonic-rs/actions/workflows/ci.yml/badge.svg
[actions-url]: https://github.com/cloudwego/sonic-rs/actions
[codecov-badge]: https://codecov.io/gh/cloudwego/sonic-rs/graph/badge.svg
[codecov-url]: https://codecov.io/gh/cloudwego/sonic-rs

English | 中文

A fast Rust JSON library based on SIMD. It draws on other open-source libraries such as sonic_cpp, serde_json, sonic, simdjson, rust-std and more.

***Golang users who want to use sonic-rs, please see for_Golang_user.md.***

***Users migrating from serde_json to sonic-rs, please see serdejson_compatibility.***

Requirements/Notes

• Fast on x86_64 and aarch64; other architectures use a fallback path and may be much slower.
• ~~Requires Rust nightly version~~ Stable Rust is now supported.
• Please add the compile options described under “Quick to use sonic-rs”.
• If you use LLVM sanitizers in your program, enable the corresponding sonic-rs feature to avoid false positives. Do not enable this feature in production, since it causes a 30% performance loss in serialization.

Quick to use sonic-rs

To ensure that SIMD instructions are used in sonic-rs, you need to add rustflags and compile on the host machine. For example, Rust flags can be configured in the Cargo config.

Add sonic-rs to your Cargo.toml dependencies.

Features

• Serde into Rust structs, as with serde_json and serde.
• Parse/serialize JSON for untyped documents, which can be mutable.
• Get specific fields from a JSON document with very high performance.
• Use JSON as a lazy array or object iterator with very high performance.
• Support LazyValue, Number and RawNumber by default (RawNumber is similar to Golang's json.Number).
• By default, floating-point parsing precision matches the Rust standard library.

Benchmark

The main optimization in sonic-rs is the use of SIMD. However, we do not use the two-stage SIMD algorithm from simd-json.
We primarily use SIMD in the following scenarios:

• Parsing and serializing long JSON strings
• Parsing the fraction of a floating-point number
• Getting a specific element or field from JSON
• Skipping whitespace when parsing JSON

More details about optimization can be found in performance.md.

AArch64 benchmark data can be found in benchmark_aarch64.md.

Benchmarks:

• Deserialize Struct: deserialize the JSON into a Rust struct. The struct definitions and test data are from json-benchmark.
• Deserialize Untyped: deserialize the JSON into an untyped document.

The serialize benchmarks do the reverse. All deserialize benchmarks enable UTF-8 validation and enable float_roundtrip in serde_json to get the same precision as Rust std.

Deserialize Struct

The benchmark parses JSON into a Rust struct. The JSON text has no unknown fields, so every JSON field is parsed into a struct field.

Sonic-rs is faster than simd-json because simd-json (Rust) first parses the JSON into an intermediate tape and then parses the tape into a Rust struct. Sonic-rs parses the JSON directly into a Rust struct, with no temporary data structures. The flamegraph is profiled in the citm_catalog case.

Deserialize Untyped

The benchmark parses JSON into a document. Sonic-rs seems faster for several reasons:

• There are no temporary data structures in sonic-rs, as detailed above.
• Sonic-rs uses a memory arena for the whole document, resulting in fewer memory allocations, better cache-friendliness, and mutability.
• A JSON object in the sonic-rs document is stored as an array. Sonic-rs does not build a hashmap.

Serialize Untyped

We serialize the document into a string. In the following benchmarks, sonic-rs appears faster for JSON that contains many long strings, which fit well with sonic-rs's SIMD optimization.

Serialize Struct

The explanation is the same as above.

Get from JSON

The benchmark gets a specific field from a JSON document.
• sonic-rs::get_unchecked_from_str: without validation
• sonic-rs::get_from_str: with validation
• gjson::get_from_str: without validation

Sonic-rs uses SIMD to quickly skip unnecessary fields in the unchecked case, which improves performance.

Usage

Serde into Rust Type

Directly use the Deserialize or Serialize trait.

Get a field from JSON

Get a specific field from a JSON document by its path. The return value is a LazyValue, which is a wrapper around a raw, valid JSON slice.

We provide the get and get_unchecked APIs. The get_unchecked APIs should only be used on valid JSON; otherwise they may return unexpected results.

Parse and Serialize into untyped Value

Parse a JSON document into an untyped Value.

JSON Iterator

Parse a JSON object or array into a lazy iterator.

JSON LazyValue & Number & RawNumber

If we need to parse a JSON value as a raw string, we can use LazyValue.

If we need to parse a JSON number into an untyped type, we can use Number.

If we need to parse a JSON number ***without loss of precision***, we can use RawNumber. It is similar to json.Number in Golang, and it can also be parsed from a JSON string.

Detailed examples can be found in raw_value.rs and json_number.rs.

Error handle

Sonic-rs's errors follow those of serde_json and display the text around the error position; see the examples in handle_error.rs.

FAQs

About UTF-8

By default, sonic-rs enables UTF-8 validation, except in the unchecked APIs.

About floating point precision

By default, sonic-rs uses floating-point precision consistent with the Rust standard library, and there is no need to add an extra feature like serde_json's float_roundtrip to ensure floating-point precision.

If you want lossless precision when parsing floating-point numbers, as with Golang's json.Number and serde_json's arbitrary_precision, you can use RawNumber.

Acknowledgement

Thanks to the following open-source libraries. sonic-rs draws on other open-source libraries such as sonic_cpp, serde_json, sonic, simdjson, yyjson, rust-std and so on.

We rewrote many SIMD algorithms from sonic-cpp/sonic/simdjson/yyjson for performance. We reused the de/ser code from serde_json and modified the necessary parts to stay highly compatible with serde.
We reused part of the floating-point parsing code from rust-std to make it more accurate.

Referenced papers:

• Parsing Gigabytes of JSON per Second
• JSONSki: streaming semi-structured data with bit-parallel fast-forwarding

Contributing

Please read CONTRIBUTING.md for information on contributing to sonic-rs.
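
Illustration: objects as arrays

The Deserialize Untyped notes say that a JSON object is kept as an array and that no hashmap is built. A minimal sketch of that idea, using only the Rust standard library, might look like the following. The `Value` enum and its `get` and `push_member` methods are hypothetical names chosen for illustration; they are not sonic-rs's actual types, API, or memory layout, and the sketch does not include an arena or any SIMD.

```rust
// Hypothetical sketch only: this is NOT sonic-rs's real `Value` type.

/// A tiny untyped JSON value. An object is a flat vector of (key, value)
/// pairs kept in document order, rather than a hash map.
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Bool(bool),
    Number(f64),
    String(String),
    Array(Vec<Value>),
    Object(Vec<(String, Value)>),
}

impl Value {
    /// Look up `key` with a linear scan over the pairs.
    /// Returns the first match, or `None` if the key is absent or
    /// `self` is not an object.
    pub fn get(&self, key: &str) -> Option<&Value> {
        match self {
            Value::Object(pairs) => pairs
                .iter()
                .find(|(k, _)| k.as_str() == key)
                .map(|(_, v)| v),
            _ => None,
        }
    }

    /// Append a member to an object without hashing the key.
    /// Returns `false` (and does nothing) if `self` is not an object.
    pub fn push_member(&mut self, key: &str, value: Value) -> bool {
        match self {
            Value::Object(pairs) => {
                pairs.push((key.to_string(), value));
                true
            }
            _ => false,
        }
    }
}
```

Under these assumptions, building an object is a sequence of vector pushes with no hashing, and members stay contiguous in memory. The trade-off is that each lookup is a linear scan, which tends to be acceptable for the small objects typical of JSON documents.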