Sonic
English | 中文
A blazingly fast JSON serializing & deserializing library, accelerated by JIT (just-in-time compiling) and SIMD (single-instruction-multiple-data).
Requirement
-ldflags="-checklinkname=0".
Features
APIs
See go.dev.
Benchmarks
For all sizes of JSON and all usage scenarios, Sonic performs best.
See bench.sh for the benchmark code.
How it works
See INTRODUCTION.md.
Usage
Marshal/Unmarshal
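A minimal round-trip sketch. It uses the standard library so that it runs without extra dependencies; the assumption is that sonic.Marshal and sonic.Unmarshal accept the same arguments, so swapping the package is the only change. The Doc type is hypothetical. The encoded string shows the two encoding/json habits that sonic leaves off by default: HTML characters are escaped and map keys are sorted.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Doc is a hypothetical schema used only for this illustration.
type Doc struct {
	Title string         `json:"title"`
	Tags  map[string]int `json:"tags"`
}

// roundTrip marshals in and unmarshals the bytes back into a fresh Doc.
func roundTrip(in Doc) (string, Doc, error) {
	buf, err := json.Marshal(in) // assumption: sonic.Marshal(in) is called the same way
	if err != nil {
		return "", Doc{}, err
	}
	var out Doc
	err = json.Unmarshal(buf, &out) // assumption: sonic.Unmarshal(buf, &out) likewise
	return string(buf), out, err
}

func main() {
	s, out, err := roundTrip(Doc{Title: "<b>", Tags: map[string]int{"z": 1, "a": 2}})
	fmt.Println(s, out.Title, err)
}
```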
Default behaviors are mostly consistent with encoding/json, except for the HTML escaping form (see Escape HTML) and the SortKeys feature (optional support, see Sort Keys), which are NOT in conformity with RFC8259.
Streaming IO
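A minimal sketch of stream-style decoding and encoding, written against the standard library's json.NewDecoder and json.NewEncoder. The assumption is that sonic's stream decoder and encoder are driven by the same Decode/Encode loop; the helper names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// decodeAll reads consecutive JSON values from r until EOF.
func decodeAll(r io.Reader) ([]any, error) {
	dec := json.NewDecoder(r)
	var vals []any
	for {
		var v any
		if err := dec.Decode(&v); err == io.EOF {
			return vals, nil
		} else if err != nil {
			return vals, err
		}
		vals = append(vals, v)
	}
}

// encodeAll writes each value to w as its own newline-terminated document.
func encodeAll(w io.Writer, vals ...any) error {
	enc := json.NewEncoder(w)
	for _, v := range vals {
		if err := enc.Encode(v); err != nil {
			return err
		}
	}
	return nil
}

// countValues and encodeToString are small helpers for trying the sketch.
func countValues(src string) (int, error) {
	vals, err := decodeAll(strings.NewReader(src))
	return len(vals), err
}

func encodeToString(vals ...any) (string, error) {
	var sb strings.Builder
	err := encodeAll(&sb, vals...)
	return sb.String(), err
}

func main() {
	n, err := countValues(`{"a":1} [true] "x"`)
	fmt.Println(n, err)
	out, _ := encodeToString(1, "two")
	fmt.Print(out)
}
```

Only one value is held in memory at a time, which is the point of streaming over a single large Unmarshal call.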
Sonic supports decoding JSON from an io.Reader and encoding objects into an io.Writer, with the aim of handling multiple values as well as reducing memory consumption.
Use Number/Use Int64
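A sketch of why a number option matters when decoding into interface{}: by default numbers become float64, which cannot hold every 64-bit integer. The example uses the standard library's Decoder.UseNumber; the assumption is that sonic's decoder option of the same name behaves equivalently. UseInt64 has no standard-library counterpart and is not shown. Helper names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// decodeGeneric decodes src into an interface{} value, optionally keeping
// numbers as json.Number instead of float64.
func decodeGeneric(src string, useNumber bool) (any, error) {
	dec := json.NewDecoder(strings.NewReader(src))
	if useNumber {
		dec.UseNumber()
	}
	var v any
	err := dec.Decode(&v)
	return v, err
}

// toInt64 converts a decoded number to int64; other types yield 0.
func toInt64(v any) int64 {
	switch x := v.(type) {
	case json.Number:
		i, _ := x.Int64()
		return i
	case float64:
		return int64(x)
	}
	return 0
}

func main() {
	const src = "9007199254740993" // 2^53 + 1, not exactly representable as float64
	f, _ := decodeGeneric(src, false)
	n, _ := decodeGeneric(src, true)
	fmt.Printf("%T -> %d\n", f, toInt64(f))
	fmt.Printf("%T -> %d\n", n, toInt64(n))
}
```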
Sort Keys
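On account of the performance loss from sorting (roughly 10%), sonic doesn’t enable this feature by default. If your component depends on sorted keys to work (like zstd), enable the feature explicitly, for example with ConfigStd (SortKeys=true) or by calling SortKeys() on an ast.Node.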
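To picture where the sorting cost comes from, here is a hand-written sketch of a sorted-key encoder for a flat map. It is not sonic's encoder; encodeSorted is a hypothetical helper that only handles plain ASCII keys and integer values.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// encodeSorted writes a flat string->int map as JSON with keys in sorted order.
func encodeSorted(m map[string]int) string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys) // the extra O(n log n) step that an unsorted encoder skips

	var sb strings.Builder
	sb.WriteByte('{')
	for i, k := range keys {
		if i > 0 {
			sb.WriteByte(',')
		}
		// strconv.Quote is adequate for the plain ASCII keys used here;
		// it is not a general JSON string encoder.
		sb.WriteString(strconv.Quote(k))
		sb.WriteByte(':')
		sb.WriteString(strconv.Itoa(m[k]))
	}
	sb.WriteByte('}')
	return sb.String()
}

func main() {
	fmt.Println(encodeSorted(map[string]int{"b": 2, "c": 3, "a": 1}))
}
```

Because Go randomizes map iteration order, only the sorted form is byte-for-byte stable across runs, which is what compression dictionaries and content hashes rely on.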
Escape HTML
On account of the performance loss (roughly 15%), sonic doesn’t enable this feature by default. You can use the encoder.EscapeHTML option to enable it (aligned with encoding/json.HTMLEscape).
Compact Format
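As a small illustration of what compacting means for an already-encoded value (for example the output of a json.Marshaler), the sketch below uses the standard library's json.Compact. compactJSON is a hypothetical helper, not part of sonic.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// compactJSON removes insignificant whitespace from an encoded JSON value.
// It returns an error if src is not valid JSON.
func compactJSON(src string) (string, error) {
	var buf bytes.Buffer
	if err := json.Compact(&buf, []byte(src)); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := compactJSON("{ \"a\" : [ 1, 2 ],\n  \"b\" : \"x y\" }")
	fmt.Println(out, err)
}
```

Whitespace inside string values is preserved; only the whitespace between tokens is dropped.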
Sonic encodes primitive objects (struct/map…) as compact-format JSON by default, except when marshaling json.RawMessage or json.Marshaler: sonic validates their output JSON but does NOT compact it, for performance reasons. We provide the option encoder.CompactMarshaler to add the compacting step.
Print Error
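A sketch of what pretty-printing an error position can look like, built on the Offset field of the standard library's json.SyntaxError. The output layout and the describeSyntaxError helper are hypothetical; sonic's own error type produces its own format. The sketch assumes single-line input.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// describeSyntaxError returns src followed by a caret line that points at the
// reported error offset. It returns "" if src is valid or fails for a reason
// other than a syntax error.
func describeSyntaxError(src string) string {
	var v any
	err := json.Unmarshal([]byte(src), &v)
	var se *json.SyntaxError
	if !errors.As(err, &se) {
		return ""
	}
	// Offset counts the bytes read before the error was detected;
	// clamp it so the caret always stays within the input.
	pos := int(se.Offset) - 1
	if pos < 0 {
		pos = 0
	}
	if pos > len(src) {
		pos = len(src)
	}
	return src + "\n" + strings.Repeat(" ", pos) + "^ " + se.Error()
}

func main() {
	fmt.Println(describeSyntaxError(`{"a":tru}`))
}
```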
If there is invalid syntax in the input JSON, sonic will return decoder.SyntaxError, which supports pretty-printing of the error position.
Mismatched Types [Sonic v1.6.0]
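The skip-and-continue behavior can be seen with the standard library, which handles mismatches in a similar way through json.UnmarshalTypeError (it reports the earliest mismatch). The Item type and decodeLenient helper are hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Item is a hypothetical schema for this illustration.
type Item struct {
	ID   int    `json:"id"`
	Name string `json:"name"`
}

// decodeLenient unmarshals src and reports whether the error, if any, was a
// type mismatch. In that case the well-typed fields are still populated.
func decodeLenient(src string) (Item, bool, error) {
	var it Item
	err := json.Unmarshal([]byte(src), &it)
	var te *json.UnmarshalTypeError
	return it, errors.As(err, &te), err
}

func main() {
	it, mismatch, err := decodeLenient(`{"id":"oops","name":"sonic"}`)
	fmt.Println(it.ID, it.Name, mismatch, err)
}
```

Callers that can tolerate partial data may therefore keep the decoded value and log the error instead of discarding the whole document.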
If there is a mismatch-typed value for a given key, sonic will report decoder.MismatchTypeError (if there are many, it reports the last one), but it still skips the wrong value and keeps decoding the rest of the JSON.
Ast.Node
Sonic/ast.Node is a completely self-contained AST for JSON. It implements both serialization and deserialization and provides robust APIs for obtaining and modifying generic data.
Get/Index
Search partial JSON by the given paths, each of which must be a non-negative integer, a string, or nil.
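The difference between looking up by key and looking up by position can be sketched with an object stored as an ordered slice of pairs. The types and methods below are hypothetical stand-ins that mirror the names Get, Index and IndexOrGet; they are not sonic's implementation.

```go
package main

import "fmt"

// pair and object are hypothetical types: an object kept as an ordered slice
// of key/value pairs rather than a map.
type pair struct {
	key string
	val any
}

type object []pair

// get scans the pairs for key, comparing keys one by one.
func (o object) get(key string) (any, bool) {
	for _, p := range o {
		if p.key == key {
			return p.val, true
		}
	}
	return nil, false
}

// index returns the value at a known position without comparing any key.
func (o object) index(i int) (any, bool) {
	if i < 0 || i >= len(o) {
		return nil, false
	}
	return o[i].val, true
}

// indexOrGet tries position i first and falls back to scanning when the key
// stored there is not the expected one.
func (o object) indexOrGet(i int, key string) (any, bool) {
	if i >= 0 && i < len(o) && o[i].key == key {
		return o[i].val, true
	}
	return o.get(key)
}

func main() {
	obj := object{{"id", 1}, {"name", "sonic"}}
	fmt.Println(obj.get("name"))
	fmt.Println(obj.index(1))
	fmt.Println(obj.indexOrGet(0, "name")) // wrong position, found by scanning
}
```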
Tip: since Index() uses an offset to locate data, which is much faster than scanning like Get(), we suggest you use it as much as possible. Sonic also provides another API, IndexOrGet(), which uses the offset underneath while also ensuring the key is matched.
SearchOption
Searcher provides some options for users to meet different needs:
ast.Node uses a Lazy-Load design, so it doesn’t support concurrent reads by default. If you want to read it concurrently, please specify the corresponding option.
Set/Unset
Modify the JSON content with Set()/Unset().
Serialize
To encode ast.Node as JSON, use MarshalJSON() or json.Marshal() (you MUST pass the node’s pointer).
APIs
- Check(), Error(), Valid(), Exist()
- Index(), Get(), IndexPair(), IndexOrGet(), GetByPath()
- Int64(), Float64(), String(), Number(), Bool(), Map[UseNumber|UseNode](), Array[UseNumber|UseNode](), Interface[UseNumber|UseNode]()
- NewRaw(), NewNumber(), NewNull(), NewBool(), NewString(), NewObject(), NewArray()
- Values(), Properties(), ForEach(), SortKeys()
- Set(), SetByIndex(), Add()
Ast.Visitor
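For readers unfamiliar with SAX-style parsing, the sketch below shows the general shape: callbacks fire in document (preorder) order and the visitor keeps its own stack to track nesting. It is built on the standard library's Decoder.Token; the visitor interface here is hypothetical and is NOT sonic's ast.Visitor, whose methods are defined in ast/visitor.go.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// visitor is a hypothetical SAX-style callback set.
type visitor interface {
	OnOpen(kind json.Delim)  // '{' or '['
	OnClose(kind json.Delim) // '}' or ']'
	OnScalar(v any)          // string, float64, bool or nil; object keys included
}

// walk feeds the tokens of src to v in document order.
func walk(src string, v visitor) error {
	dec := json.NewDecoder(strings.NewReader(src))
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		if d, ok := tok.(json.Delim); ok {
			if d == '{' || d == '[' {
				v.OnOpen(d)
			} else {
				v.OnClose(d)
			}
			continue
		}
		v.OnScalar(tok)
	}
}

// depthTracker uses an explicit stack to record the object/array hierarchy.
type depthTracker struct {
	stack   []json.Delim
	max     int
	scalars int
}

func (t *depthTracker) OnOpen(k json.Delim) {
	t.stack = append(t.stack, k)
	if len(t.stack) > t.max {
		t.max = len(t.stack)
	}
}
func (t *depthTracker) OnClose(json.Delim) { t.stack = t.stack[:len(t.stack)-1] }
func (t *depthTracker) OnScalar(any)       { t.scalars++ }

// measure returns the maximum nesting depth and the number of scalar tokens.
func measure(src string) (int, int, error) {
	t := &depthTracker{}
	err := walk(src, t)
	return t.max, t.scalars, err
}

func main() {
	fmt.Println(measure(`{"a":[1,{"b":null}]}`))
}
```

A real visitor would build its own value types inside the callbacks instead of counting, but the stack discipline stays the same.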
Sonic provides an advanced API for fully parsing JSON into non-standard types (neither struct nor map[string]interface{}) without using any intermediate representation (ast.Node or interface{}). For example, you might have types which are like interface{} but are actually not interface{}.
Sonic provides an API to return the preorder traversal of a JSON AST. The ast.Visitor is a SAX-style interface, as used in some C++ JSON libraries. You should implement ast.Visitor yourself and pass it to the ast.Preorder() method. In your visitor you can use your custom types to represent JSON values. Your visitor may need an O(n)-space container (such as a stack) to record the object/array hierarchy.
See ast/visitor.go for detailed usage. We also implement a demo visitor for UserNode in ast/visitor_test.go.
Compatibility
For developers who want to use sonic in different scenarios, we provide some integrated configs as sonic.API:
- ConfigDefault: sonic’s default config (EscapeHTML=false, SortKeys=false…), which runs sonic fast while ensuring security.
- ConfigStd: the std-compatible config (EscapeHTML=true, SortKeys=true…).
- ConfigFastest: the fastest config (NoQuoteTextMarshaler=true), which runs sonic as fast as possible.
Sonic DOES NOT ensure support for all environments, due to the difficulty of developing high-performance code. In environments that sonic does not support, the implementation falls back to encoding/json, so the configs above will all be equal to ConfigStd.
Tips
Pretouch
Since Sonic uses golang-asm as a JIT assembler, which is NOT very suitable for runtime compiling, the first-hit run of a huge schema may cause request timeouts or even a process OOM. For better stability, we advise calling PretouchMany() before Marshal()/Unmarshal() in huge-schema or latency-sensitive applications.
Copy string
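The memory effect of referencing versus copying can be reproduced with ordinary Go strings: a substring shares the backing bytes of its parent, while strings.Clone allocates an independent copy. The helpers below are hypothetical and only illustrate the trade-off; they are not how sonic is implemented.

```go
package main

import (
	"fmt"
	"strings"
	"unsafe"
)

// sharesMemory reports whether sub points into the backing bytes of buf.
func sharesMemory(buf, sub string) bool {
	if len(buf) == 0 || len(sub) == 0 {
		return false
	}
	start := uintptr(unsafe.Pointer(unsafe.StringData(buf)))
	p := uintptr(unsafe.Pointer(unsafe.StringData(sub)))
	return p >= start && p < start+uintptr(len(buf))
}

// extract returns buf[from:to], either as a view into buf or as a copy.
func extract(buf string, from, to int, copyString bool) string {
	v := buf[from:to] // a substring shares buf's memory and keeps all of it alive
	if copyString {
		return strings.Clone(v) // independent allocation; buf can be collected
	}
	return v
}

func main() {
	buf := `{"name":"sonic","pad":"` + strings.Repeat("x", 1<<20) + `"}`
	ref := extract(buf, 9, 14, false)
	cp := extract(buf, 9, 14, true)
	fmt.Println(ref, sharesMemory(buf, ref))
	fmt.Println(cp, sharesMemory(buf, cp))
}
```

Holding ref in a long-lived cache keeps the whole megabyte buffer reachable; holding cp keeps only five bytes.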
When decoding string values without any escaped characters, sonic references them from the original JSON buffer instead of mallocing a new buffer to copy them. This helps CPU performance a lot but may keep the whole JSON buffer in memory as long as the decoded objects are in use. In practice, we found the extra memory introduced by referencing the JSON buffer is usually 20% ~ 80% of the decoded objects. Once an application holds these objects for a long time (for example, caching the decoded objects for reuse), its in-use memory on the server may go up.
- Config.CopyString/decoder.CopyString(): We provide this option for Decode()/Unmarshal() users who choose not to reference the JSON buffer, which may cause some decline in CPU performance.
- GetFromStringNoCopy(): For memory safety, sonic.Get()/sonic.GetFromString() now copy the returned JSON. If you want to get JSON more quickly and do not care about memory usage, you can use GetFromStringNoCopy() to return JSON directly referenced from the source.
Pass string or []byte?
For alignment with encoding/json, we provide APIs that take []byte as an argument, but for safety a string-to-bytes copy is conducted at the same time, which may cost performance when the original JSON is huge. Therefore, you can use UnmarshalString() and GetFromString() to pass a string, as long as your original data is a string or a nocopy-cast is safe for your []byte. We also provide the API MarshalString() for a convenient nocopy-cast of the encoded JSON []byte, which is safe since sonic’s output bytes are always duplicated and unique.
Accelerate encoding.TextMarshaler
To ensure data security, sonic.Encoder quotes and escapes string values from encoding.TextMarshaler interfaces by default, which may degrade performance considerably if most of your data is in that form. We provide encoder.NoQuoteTextMarshaler to skip these operations, which means you MUST ensure their output strings are escaped and quoted following RFC8259.
Better performance for generic data
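The idea of binding only the part of a document you have a schema for, and leaving the rest untouched until it is needed, can be sketched with the standard library's json.RawMessage. This is an analogy for on-demand parsing, not sonic's mechanism; the envelope and user types are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// envelope is a hypothetical message whose payload schema depends on Kind.
type envelope struct {
	Kind    string          `json:"kind"`
	Payload json.RawMessage `json:"payload"` // kept as raw bytes, not bound to Go values yet
}

// user is the only payload schema this sketch knows about.
type user struct {
	Name string `json:"name"`
}

// decodeUser binds the envelope first and the payload only when it is a user.
// The bool result reports whether a user was decoded.
func decodeUser(src string) (user, bool, error) {
	var env envelope
	if err := json.Unmarshal([]byte(src), &env); err != nil {
		return user{}, false, err
	}
	if env.Kind != "user" {
		return user{}, false, nil // payload is never bound to a Go value
	}
	var u user
	if err := json.Unmarshal(env.Payload, &u); err != nil {
		return user{}, false, err
	}
	return u, true, nil
}

func main() {
	fmt.Println(decodeUser(`{"kind":"user","payload":{"name":"sonic","extra":[1,2,3]}}`))
	fmt.Println(decodeUser(`{"kind":"other","payload":{"huge":"document"}}`))
}
```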
In a fully-parsed scenario, Unmarshal() performs better than Get()+Node.Interface(). But if you only have a part of the schema for specific JSON, you can combine Get() and Unmarshal() together.
Even if you don’t have any schema, use ast.Node as the container of generic values instead of map or interface.
Why? Because ast.Node stores its children using an array:
- Array’s performance is much better than Map’s when inserting (deserializing) and scanning (serializing) data;
- Hashing (map[x]) is not as efficient as Indexing (array[x]), which ast.Node can conduct on both array and object;
- Using Interface()/Map() means Sonic must parse all the underlying values, while ast.Node can parse them on demand.
CAUTION: ast.Node DOESN’T ensure concurrent security directly, due to its lazy-load design. However, you can call Node.Load()/Node.LoadAll() to achieve that, which may reduce performance, although it still works faster than converting to map or interface{}.
Ast.Node or Ast.Visitor?
For generic data, ast.Node should be enough for your needs in most cases.
However, ast.Node is designed for partially processing a JSON string. It has some special designs, such as lazy-load, which might not be suitable for directly parsing the whole JSON string like Unmarshal(). Although ast.Node is better than map or interface{}, it is still a kind of intermediate representation if your final types are customized and you have to convert the above types to your custom types after parsing.
For better performance, in that case ast.Visitor will be the better choice. It performs JSON decoding like Unmarshal() and you can directly use your final types to represent a JSON AST without any intermediate representations.
But ast.Visitor is not a very handy API. You might need to write a lot of code to implement your visitor and carefully maintain the tree hierarchy during decoding. Please read the comments in ast/visitor.go carefully if you decide to use this API.
Buffer Size
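The trade-off behind pooled buffers can be sketched with sync.Pool: reuse saves allocations, but a pool that accepts buffers of any size can pin a lot of memory under load. The size limit and helper names below are hypothetical and are not sonic's options; see the option package for the real knobs.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// maxPooledCap is a hypothetical limit: larger buffers are dropped instead of
// being pooled, so that one huge payload does not stay resident forever.
const maxPooledCap = 64 << 10

var bufPool = sync.Pool{New: func() any { return new(bytes.Buffer) }}

// putBuffer resets b and returns it to the pool unless it grew too large.
// It reports whether the buffer was kept.
func putBuffer(b *bytes.Buffer) bool {
	if b.Cap() > maxPooledCap {
		return false
	}
	b.Reset()
	bufPool.Put(b)
	return true
}

// render borrows a buffer, writes parts into it and copies the result out.
// The bool result reports whether the buffer went back to the pool.
func render(parts ...string) (string, bool) {
	b := bufPool.Get().(*bytes.Buffer)
	for _, p := range parts {
		b.WriteString(p)
	}
	out := b.String() // String copies the bytes, so reuse of b is safe
	return out, putBuffer(b)
}

func main() {
	out, kept := render("a", "b")
	fmt.Println(out, kept)
	_, kept = render(string(make([]byte, 128<<10)))
	fmt.Println(kept)
}
```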
Sonic uses memory pools in many places, such as encoder.Encode and ast.Node.MarshalJSON, to improve performance, which may increase in-use memory when the server’s load is high. See issue 614. Therefore, we introduce some options to let users control the behavior of the memory pool. See the option package.
Faster JSON Skip
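To see what is given up when a value is skipped without validation, the sketch below contrasts a bracket-matching skipper with the standard library's json.Valid. It is a plain scalar loop, not SIMD and not sonic's algorithm; skipValue and isValid are hypothetical helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// skipValue returns the index just past the object or array that starts at
// src[0], found by matching brackets only. It tracks strings and escapes so
// that brackets inside strings are ignored, but it checks nothing else.
func skipValue(src string) (int, bool) {
	depth, inString, escaped := 0, false, false
	for i := 0; i < len(src); i++ {
		c := src[i]
		switch {
		case escaped:
			escaped = false
		case inString:
			if c == '\\' {
				escaped = true
			} else if c == '"' {
				inString = false
			}
		case c == '"':
			inString = true
		case c == '{' || c == '[':
			depth++
		case c == '}' || c == ']':
			depth--
			if depth == 0 {
				return i + 1, true
			}
			if depth < 0 {
				return 0, false
			}
		}
	}
	return 0, false
}

// isValid runs the full validating check for comparison.
func isValid(src string) bool { return json.Valid([]byte(src)) }

func main() {
	bad := `{"a":tru}`
	end, ok := skipValue(bad + ` ,"next":1`)
	fmt.Println(end, ok, isValid(bad)) // the skipper accepts what the validator rejects
}
```

The skipper finds the end of the value even though its content is malformed, which is acceptable only when the skipped data comes from a trusted source.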
For security, sonic uses an FSM algorithm to validate JSON when decoding raw JSON or encoding json.Marshaler, which is much slower (1~10x) than the SIMD-searching-pair algorithm. If you have many redundant JSON values and DO NOT NEED to strictly validate JSON correctness, you can enable the options below:
- Config.NoValidateSkipJSON: for faster skipping of JSON when decoding, such as unknown fields, json.Unmarshaler (json.RawMessage), mismatched values, and redundant array elements
- Config.NoValidateJSONMarshaler: avoid validating JSON when encoding json.Marshaler
- SearchOption.ValidateJSON: indicates whether to validate the located JSON value when calling Get
JSON-Path Support (GJSON)
tidwall/gjson provides a comprehensive and popular JSON-Path API, and a lot of older code relies heavily on it. Therefore, we provide a wrapper library, which combines gjson’s API with sonic’s SIMD algorithm to boost performance. See cloudwego/gjson.
Community
Sonic is a subproject of CloudWeGo. We are committed to building a cloud native ecosystem.