Details of Go JSON serialization of []byte type

Recently, while writing a web API in Go, I ran into a problem with JSON serialization of the []byte type. Here is a note on it.

It started when I wanted to build a web service that saves records containing arbitrary JSON content and reads them back unchanged. The requirement is simple: since JSON is text, you could just read and write the JSON text directly. But because storing JSON as text brings escaping problems, I chose to store the data as []byte, a type the database also supports. The model structure is roughly as follows:

type record struct {
	Content []byte `json:"content"`
}

With this model, the create endpoint has no trouble binding the HTTP body to the struct. But when the struct is returned on read, the Content field shows up in the response as a base64-encoded string, because its type is []byte. At a colleague's suggestion, I changed the type from []byte to json.RawMessage, and the response then contained the original JSON. But json.RawMessage is just a named type whose underlying type is []byte, so why the difference? With that question in mind, I read through how Go serializes JSON and found the answer.

The core of the serialization logic in Go's built-in encoding/json package is the newTypeEncoder function in encoding/json/encode.go.
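The difference is easy to reproduce outside the web service. The following is a minimal sketch, not the service's actual code: bytesRecord, rawRecord and marshalBoth are names made up for this illustration, and the payload is an arbitrary example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// bytesRecord mirrors the model above: Content is a plain []byte.
// (Hypothetical name, used only for this illustration.)
type bytesRecord struct {
	Content []byte `json:"content"`
}

// rawRecord is the same model with Content switched to json.RawMessage.
type rawRecord struct {
	Content json.RawMessage `json:"content"`
}

// marshalBoth encodes the same payload through both models and
// returns the two JSON documents as strings.
func marshalBoth(payload string) (string, string, error) {
	a, err := json.Marshal(bytesRecord{Content: []byte(payload)})
	if err != nil {
		return "", "", err
	}
	b, err := json.Marshal(rawRecord{Content: json.RawMessage(payload)})
	if err != nil {
		return "", "", err
	}
	return string(a), string(b), nil
}

func main() {
	asBytes, asRaw, err := marshalBoth(`{"a":1}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(asBytes) // {"content":"eyJhIjoxfQ=="}
	fmt.Println(asRaw)   // {"content":{"a":1}}
}
```

The first line carries the payload as a base64 string, the second embeds it as JSON.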

// newTypeEncoder constructs an encoderFunc for a type.
// The returned encoder only checks CanAddr when allowAddr is true.
func newTypeEncoder(t reflect.Type, allowAddr bool) encoderFunc {
	if t.Kind() != reflect.Pointer && allowAddr && reflect.PointerTo(t).Implements(marshalerType) {
		return newCondAddrEncoder(addrMarshalerEncoder, newTypeEncoder(t, false))
	}
	if t.Implements(marshalerType) {
		return marshalerEncoder
	}
	if t.Kind() != reflect.Pointer && allowAddr && reflect.PointerTo(t).Implements(textMarshalerType) {
		return newCondAddrEncoder(addrTextMarshalerEncoder, newTypeEncoder(t, false))
	}
	if t.Implements(textMarshalerType) {
		return textMarshalerEncoder
	}

	switch t.Kind() {
	case reflect.Bool:
		return boolEncoder
	case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
		return intEncoder
	case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64, reflect.Uintptr:
		return uintEncoder
	case reflect.Float32:
		return float32Encoder
	case reflect.Float64:
		return float64Encoder
	case reflect.String:
		return stringEncoder
	case reflect.Interface:
		return interfaceEncoder
	case reflect.Struct:
		return newStructEncoder(t)
	case reflect.Map:
		return newMapEncoder(t)
	case reflect.Slice:
		return newSliceEncoder(t)
	case reflect.Array:
		return newArrayEncoder(t)
	case reflect.Pointer:
		return newPtrEncoder(t)
	default:
		return unsupportedTypeEncoder
	}
}

In short, the package uses reflection to get the type of the value being serialized, and newTypeEncoder returns a different encoderFunc depending on that type. This is where the handling of []byte and json.RawMessage diverges. A []byte has kind reflect.Slice, so newTypeEncoder returns newSliceEncoder(t). Let's look at that function.
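The order of the checks in newTypeEncoder matters: the Marshaler checks come before the switch on Kind. A small sketch of that precedence, using a hypothetical hexBytes type (invented for this example, not part of encoding/json) whose underlying type is also []byte:

```go
package main

import (
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// hexBytes is a hypothetical byte-slice type that marshals as a hex string.
type hexBytes []byte

// MarshalJSON makes hexBytes satisfy json.Marshaler, so the encoder chosen
// for it is Marshaler-based and the reflect.Slice case is never reached.
func (h hexBytes) MarshalJSON() ([]byte, error) {
	return json.Marshal(hex.EncodeToString(h))
}

// encode marshals v and returns the JSON as a string.
func encode(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	data := []byte{0xde, 0xad}
	fmt.Println(encode(data))           // "3q0=" (base64 via the slice path)
	fmt.Println(encode(hexBytes(data))) // "dead" (hex via MarshalJSON)
}
```

The same bytes produce different JSON depending only on whether the type implements json.Marshaler.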

func newSliceEncoder(t reflect.Type) encoderFunc {
	// Byte slices get special treatment; arrays don't.
	if t.Elem().Kind() == reflect.Uint8 {
		p := reflect.PointerTo(t.Elem())
		if !p.Implements(marshalerType) && !p.Implements(textMarshalerType) {
			return encodeByteSlice
		}
	}
	enc := sliceEncoder{newArrayEncoder(t)}
	return enc.encode
}

Here, encodeByteSlice is returned for []byte, which is a special case; slices with any other element type get a sliceEncoder. encodeByteSlice is where the base64 encoding of []byte happens.

RawMessage is also a []byte underneath, so why isn't it base64-encoded? Because it implements json.Marshaler, newTypeEncoder returns a Marshaler-based encoder (marshalerEncoder) before the switch on Kind is ever reached, and that encoder does not treat []byte specially.
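The comment "Byte slices get special treatment; arrays don't" can be checked directly. This is a rough sketch; mustMarshal and decodeContent are helper names made up for the example.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// mustMarshal marshals v and returns the JSON as a string.
func mustMarshal(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

// decodeContent recovers the original bytes from the base64 text
// that encoding/json emits for a []byte value.
func decodeContent(s string) ([]byte, error) {
	return base64.StdEncoding.DecodeString(s)
}

func main() {
	fmt.Println(mustMarshal([]byte{1, 2, 3}))  // "AQID"  (byte slice: base64 string)
	fmt.Println(mustMarshal([3]byte{1, 2, 3})) // [1,2,3] (byte array: JSON array)
	fmt.Println(mustMarshal([]int{1, 2, 3}))   // [1,2,3] (other slice: JSON array)

	raw, err := decodeContent("AQID")
	if err != nil {
		panic(err)
	}
	fmt.Println(raw) // [1 2 3]
}
```

A client that receives the base64 form can still get the bytes back, but it has to know to decode them first.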

func marshalerEncoder(e *encodeState, v reflect.Value, opts encOpts) {
	if v.Kind() == reflect.Pointer && v.IsNil() {
		e.WriteString("null")
		return
	}
	m, ok := v.Interface().(Marshaler)
	if !ok {
		e.WriteString("null")
		return
	}
	b, err := m.MarshalJSON()
	if err == nil {
		// copy JSON into buffer, checking validity.
		err = compact(&e.Buffer, b, opts.escapeHTML)
	}
	if err != nil {
		e.error(&MarshalerError{v.Type(), err, "MarshalJSON"})
	}
}

marshalerEncoder first calls the type's own MarshalJSON method. For json.RawMessage, MarshalJSON simply returns the underlying bytes as they are (or null for a nil value). marshalerEncoder then passes the returned []byte through compact, which checks that it is valid JSON and strips insignificant whitespace.

So RawMessage ends up as compacted JSON rather than a base64-encoded string because encoding/json routes it through its Marshaler path instead of the byte-slice path. Finally, everything is clear.
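Both effects of compact, the whitespace removal and the validity check, can be observed from the outside. A minimal sketch, with marshalRaw as a helper name made up for this example:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// marshalRaw marshals the given text as a json.RawMessage and
// returns the encoded result as a string.
func marshalRaw(raw string) (string, error) {
	b, err := json.Marshal(json.RawMessage(raw))
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Valid JSON with extra whitespace comes out compacted.
	out, err := marshalRaw(`{ "a" : [1, 2] }`)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // {"a":[1,2]}

	// Invalid JSON is rejected instead of being passed through.
	_, err = marshalRaw(`{not json`)
	fmt.Println(err != nil) // true
}
```

This also means a RawMessage field only round-trips content that is valid JSON to begin with; arbitrary bytes would fail at marshal time.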
