Optional JSON fields in Go

One common kind of data stored in a configuration file is options. In this
post, I’ll talk about some nuances we have to be aware of when storing options
in JSON and unmarshaling them to Go.

Specifically, the most important difference between options and any other data
is that options are often, well… optional. Our program can have a large
number of possible configuration options, but we may want to configure any
particular invocation with only a subset – leaving all the others at their
default values.

Basics – partial unmarshaling, omitempty, and unknown fields

Let’s start with the basics. Consider the following struct that represents
the options for an imaginary program:

type Options struct {
	Id      string `json:"id,omitempty"`
	Verbose bool   `json:"verbose,omitempty"`
	Level   int    `json:"level,omitempty"`
	Power   int    `json:"power,omitempty"`
}

This struct has 4 options, but in real programs there may be dozens.

Suppose we want to specify these options in a JSON configuration file. A full
listing of options may look something like:

{
  "id": "foobar",
  "verbose": false,
  "level": 10,
  "power": 221
}

If all options are always specified in our configuration files, there’s not
much to talk about. Just call json.Unmarshal and all is well.

In reality, things are rarely so simple. We may want to handle a number of
special cases:

1. The JSON configuration can be missing some fields, and we’ll want our Go
   struct to have default values for those.
2. The JSON configuration can have extra fields which our struct doesn’t have.
   Depending on the scenario, we may want to either ignore these or report an
   error.

For (1), Go’s json package will assign values only to fields found in the
JSON; other fields will just keep their Go zero values. For example, if the JSON didn’t
have the level field at all, the Options struct unmarshaled from it
would have 0 for Level. If this behavior is undesirable, check out the next
section.

For (2), the json package is very permissive by default and will ignore
unknown fields. That is, suppose the input JSON is:

{
  "id": "foobar",
  "bug": 42
}

json.Unmarshal will happily parse this into Options, setting Id to
“foobar”, Level and Power to 0 and Verbose to false. It will
ignore the bug key.
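
To make this concrete, here is a small self-contained sketch of the permissive default. The parseLenient helper is a name made up for this illustration; it does nothing beyond calling json.Unmarshal.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Options mirrors the struct defined earlier in the post.
type Options struct {
	Id      string `json:"id,omitempty"`
	Verbose bool   `json:"verbose,omitempty"`
	Level   int    `json:"level,omitempty"`
	Power   int    `json:"power,omitempty"`
}

// parseLenient is a hypothetical helper that wraps json.Unmarshal with its
// default behavior: missing fields stay at zero, unknown keys are skipped.
func parseLenient(data []byte) (Options, error) {
	var opts Options
	err := json.Unmarshal(data, &opts)
	return opts, err
}

func main() {
	opts, err := parseLenient([]byte(`{"id": "foobar", "bug": 42}`))
	if err != nil {
		fmt.Println("Unmarshal error:", err)
		return
	}
	// Prints: {Id:foobar Verbose:false Level:0 Power:0}
	fmt.Printf("%+v\n", opts)
}
```

The "bug" key leaves no trace in the result, and no error is reported.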

This behavior is what you want in some cases, but not in others. Luckily, the
json package makes it configurable: create a json.Decoder and call its
DisallowUnknownFields method before decoding:

dec := json.NewDecoder(bytes.NewReader(jsonText))
dec.DisallowUnknownFields()

var opts Options
if err := dec.Decode(&opts); err != nil {
	fmt.Println("Decode error:", err)
}

Now parsing the aforementioned JSON snippet will result in an error.
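
If strict parsing is needed in more than one place, the snippet can be wrapped in a function that returns the error to the caller. This is only a sketch; parseStrict is a hypothetical name, not something from the json package.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Options mirrors the struct defined earlier in the post.
type Options struct {
	Id      string `json:"id,omitempty"`
	Verbose bool   `json:"verbose,omitempty"`
	Level   int    `json:"level,omitempty"`
	Power   int    `json:"power,omitempty"`
}

// parseStrict is a hypothetical helper that rejects any JSON key which does
// not correspond to a field of Options.
func parseStrict(data []byte) (Options, error) {
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()

	var opts Options
	if err := dec.Decode(&opts); err != nil {
		return Options{}, err
	}
	return opts, nil
}

func main() {
	// The unknown "bug" key makes Decode fail.
	if _, err := parseStrict([]byte(`{"id": "foobar", "bug": 42}`)); err != nil {
		fmt.Println("Decode error:", err)
	}

	// Without unknown keys, decoding succeeds as before.
	opts, err := parseStrict([]byte(`{"id": "foobar"}`))
	fmt.Println(opts.Id, err)
}
```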

Finally, you may have noticed that our Options struct has the omitempty
tag specified for all fields. This means that fields with zero values will not
be emitted to JSON. For example:

opts := Options{
	Id:    "baz",
	Level: 0,
}
out, _ := json.MarshalIndent(opts, "", "  ")
fmt.Println(string(out))

This prints:

{
  "id": "baz"
}

Only id appears, because all the other fields have their zero values. If you
want to always emit all the fields instead, don’t specify omitempty.
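
A quick side-by-side may help here. The two struct types and the encode helper below are invented for this comparison; they differ only in whether the tags carry omitempty.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// WithOmit and WithoutOmit are hypothetical types that differ only in
// their use of omitempty.
type WithOmit struct {
	Id    string `json:"id,omitempty"`
	Level int    `json:"level,omitempty"`
}

type WithoutOmit struct {
	Id    string `json:"id"`
	Level int    `json:"level"`
}

// encode is a hypothetical helper that marshals v to a compact JSON string.
func encode(v any) string {
	out, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(out)
}

func main() {
	fmt.Println(encode(WithOmit{Id: "baz"}))    // {"id":"baz"}
	fmt.Println(encode(WithoutOmit{Id: "baz"})) // {"id":"baz","level":0}
}
```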

Setting default values

In the example above we’ve seen that missing fields in the JSON representation
will be decoded to zero values in Go. This is fine if your options’ default
values are also their zero values, but this isn’t always the case. What if the
default value of Power should be 10, not 0? That is, when the JSON
doesn’t have a “power” field, you want to set Power to 10, but instead
Unmarshal sets it to zero.

You may think – this is easy to solve! I’ll just know to set Power to its
default 10 whenever it’s unmarshaled as 0 from the JSON! Hold on, though. What
happens if the JSON really had “power” specified as 0?
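
The following sketch shows why checking for 0 after the fact cannot work. It uses a cut-down Options with only the Power field, and decodePower is a hypothetical helper written for this demonstration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Options is reduced to the single field relevant here.
type Options struct {
	Power int `json:"power,omitempty"`
}

// decodePower is a hypothetical helper that returns the Power value
// obtained by unmarshaling data into a fresh Options.
func decodePower(data string) int {
	var opts Options
	if err := json.Unmarshal([]byte(data), &opts); err != nil {
		panic(err)
	}
	return opts.Power
}

func main() {
	missing := decodePower(`{}`)            // "power" not present at all
	explicit := decodePower(`{"power": 0}`) // "power" explicitly set to 0

	// Prints: 0 0 true
	fmt.Println(missing, explicit, missing == explicit)
}
```

After unmarshaling, both inputs produce the same struct, so there is nothing left to tell them apart.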

The way to solve it is in reverse, actually. We’ll set the default values
first, and then let json.Unmarshal override fields as needed:

func parseOptions(jsn []byte) Options {
	opts := Options{
		Verbose: false,
		Level:   0,
		Power:   10,
	}
	if err := json.Unmarshal(jsn, &opts); err != nil {
		log.Fatal(err)
	}
	return opts
}

Now instead of calling json.Unmarshal directly for Options, we’ll have
to call parseOptions.

Alternatively, we can cleverly hide this logic in a custom UnmarshalJSON
method for Options:

func (o *Options) UnmarshalJSON(text []byte) error {
	type options Options
	opts := options{
		Power: 10,
	}
	if err := json.Unmarshal(text, &opts); err != nil {
		return err
	}
	*o = Options(opts)
	return nil
}

With this method, any call to json.Unmarshal for the Options type will
populate the default of Power correctly. Note the use of the local options
type. It is not an alias but a new defined type with the same fields as Options
and none of its methods, which prevents infinite recursion in UnmarshalJSON.
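
Here is the method in a complete program, as a sketch that checks both interesting cases: a missing "power" and an explicit zero. The decode function is a hypothetical wrapper added only to keep main short.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Options struct {
	Id      string `json:"id,omitempty"`
	Verbose bool   `json:"verbose,omitempty"`
	Level   int    `json:"level,omitempty"`
	Power   int    `json:"power,omitempty"`
}

// UnmarshalJSON pre-populates the defaults and lets the JSON override them.
func (o *Options) UnmarshalJSON(text []byte) error {
	// options has the same fields as Options but no methods, so the call
	// to json.Unmarshal below does not re-enter this method.
	type options Options
	opts := options{
		Power: 10,
	}
	if err := json.Unmarshal(text, &opts); err != nil {
		return err
	}
	*o = Options(opts)
	return nil
}

// decode is a hypothetical wrapper around json.Unmarshal.
func decode(data string) (Options, error) {
	var opts Options
	err := json.Unmarshal([]byte(data), &opts)
	return opts, err
}

func main() {
	a, _ := decode(`{"id": "foobar"}`)             // "power" missing
	b, _ := decode(`{"id": "foobar", "power": 0}`) // "power" explicitly 0

	// Prints: 10 0
	fmt.Println(a.Power, b.Power)
}
```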

This approach is simple and clean, but it has some downsides. First, it strongly
ties the default values of fields with the parsing logic. It’s conceivable that
we want to let user code down the line set its defaults; right now, the defaults
have to be set before unmarshaling.

The second downside is that it only works in simple cases. If our Options
struct has a slice or map of other structs, we can’t populate defaults this way.
Consider:

type Region struct {
	Name  string `json:"name,omitempty"`
	Power int    `json:"power,omitempty"`
}

type Options struct {
	Id      string `json:"id,omitempty"`
	Verbose bool   `json:"verbose,omitempty"`
	Level   int    `json:"level,omitempty"`
	Power   int    `json:"power,omitempty"`

	Regions []Region `json:"regions,omitempty"`
}

If we want to populate defaults for the Power of each Region, we can’t
just do it on the level of Options. We have to write a custom unmarshal
method for Region. This is difficult to scale for arbitrarily nested structs
– spreading our default logic across multiple UnmarshalJSON methods is
sub-optimal.
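
For illustration, this is roughly what the per-type method looks like for Region. It is a sketch that assumes the default Power of a region is also 10 (the post doesn’t say), uses a trimmed Options, and adds a hypothetical parseRegions helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Region struct {
	Name  string `json:"name,omitempty"`
	Power int    `json:"power,omitempty"`
}

// UnmarshalJSON applies the Region defaults. The json package calls it once
// for every element of the "regions" array.
func (r *Region) UnmarshalJSON(text []byte) error {
	type region Region        // same fields, no methods: avoids recursion
	tmp := region{Power: 10}  // assumed default for this sketch
	if err := json.Unmarshal(text, &tmp); err != nil {
		return err
	}
	*r = Region(tmp)
	return nil
}

// Options is trimmed to the fields needed for this example.
type Options struct {
	Id      string   `json:"id,omitempty"`
	Regions []Region `json:"regions,omitempty"`
}

// parseRegions is a hypothetical helper returning the decoded regions.
func parseRegions(data string) ([]Region, error) {
	var opts Options
	if err := json.Unmarshal([]byte(data), &opts); err != nil {
		return nil, err
	}
	return opts.Regions, nil
}

func main() {
	regions, err := parseRegions(
		`{"regions": [{"name": "north"}, {"name": "south", "power": 0}]}`)
	if err != nil {
		fmt.Println("Unmarshal error:", err)
		return
	}
	for _, r := range regions {
		fmt.Println(r.Name, r.Power) // north 10, then south 0
	}
}
```

It works, but the defaults for Options and Region now live in two separate methods, which is exactly the scaling problem described above.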

An alternative is to use a completely different approach, by pushing default
logic to the users. We can accomplish this with pointer fields.

Default values with pointer fields

We can define our Options as:

type Options struct {
	Id      *string `json:"id,omitempty"`
	Verbose *bool   `json:"verbose,omitempty"`
	Level   *int    `json:"level,omitempty"`
	Power   *int    `json:"power,omitempty"`
}

It’s very similar to the original definition, except that all the fields are now
pointers. Suppose we have the following JSON text:

{
  "id": "foobar",
  "verbose": false,
  "level": 10
}

Note that all fields except “power” are specified. We can Unmarshal this
as usual:

var opts Options
if err := json.Unmarshal(jsonText, &opts); err != nil {
	log.Fatal(err)
}

But now we can actually distinguish between fields that were not specified at
all (these will get unmarshaled to a nil pointer) and fields that were
specified with zero values (these will get unmarshaled to non-nil pointers to
zero values). For example, we can write the following parsing
wrapper to unmarshal Options and set default values as needed:

func parseOptions(jsn []byte) Options {
	var opts Options
	if err := json.Unmarshal(jsn, &opts); err != nil {
		log.Fatal(err)
	}

	if opts.Power == nil {
		var v int = 10
		opts.Power = &v
	}

	return opts
}

Note how we set opts.Power; this is one of the inconveniences of working
with pointers, because there is no syntax in Go to take the address of literals
of built-in types like int. This isn’t too much trouble, though, as some
simple helper functions can make our life more pleasant:

func Bool(v bool) *bool { return &v }
func Int(v int) *int { return &v }
func String(v string) *string { return &v }
// etc…

With these in hand, we could have simply written opts.Power = Int(10).
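
If the project can use generics (Go 1.18 or later), a single helper can stand in for the whole family. This is a sketch; Ptr is a name chosen for this example and not part of the standard library.

```go
package main

import "fmt"

// Ptr is a hypothetical generic helper: it returns a pointer to a copy of v.
func Ptr[T any](v T) *T { return &v }

type Options struct {
	Id      *string `json:"id,omitempty"`
	Verbose *bool   `json:"verbose,omitempty"`
	Level   *int    `json:"level,omitempty"`
	Power   *int    `json:"power,omitempty"`
}

func main() {
	opts := Options{
		Id:      Ptr("foobar"),
		Verbose: Ptr(true),
		Power:   Ptr(10),
	}

	// Prints: foobar true 10 true
	fmt.Println(*opts.Id, *opts.Verbose, *opts.Power, opts.Level == nil)
}
```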

The most useful trait of this approach is that it doesn’t force us to assign
default values at the point where the JSON is parsed. We can pass Options
into user code and let that deal with defaults when nil fields are
encountered.
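
As a sketch of what such user code might look like, the function below picks its own default long after parsing is done. The names intOr, effectivePower and decode are all hypothetical, and the default of 10 is just the running example from this post.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type Options struct {
	Id      *string `json:"id,omitempty"`
	Verbose *bool   `json:"verbose,omitempty"`
	Level   *int    `json:"level,omitempty"`
	Power   *int    `json:"power,omitempty"`
}

// intOr is a hypothetical helper: it returns *p, or def when p is nil.
func intOr(p *int, def int) int {
	if p == nil {
		return def
	}
	return *p
}

// effectivePower stands in for user code that applies its own default.
func effectivePower(opts Options) int {
	return intOr(opts.Power, 10)
}

// decode is a hypothetical wrapper around json.Unmarshal; it sets no defaults.
func decode(data string) (Options, error) {
	var opts Options
	err := json.Unmarshal([]byte(data), &opts)
	return opts, err
}

func main() {
	inputs := []string{`{"id": "foobar"}`, `{"power": 0}`, `{"power": 221}`}
	for _, text := range inputs {
		opts, err := decode(text)
		if err != nil {
			fmt.Println("Unmarshal error:", err)
			continue
		}
		fmt.Println(effectivePower(opts)) // 10, then 0, then 221
	}
}
```

The parsing code stays free of defaults, and different callers are free to choose different ones.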

So are pointers the magic solution to our “distinguish unspecified values from
zero values” problem? Sort of. Pointers are certainly a viable solution that
should work well. The official protobuf package uses the same approach for proto2
Protocol Buffers that distinguish between required and optional fields. So this
method has absolutely been battle tested!

That said, it’s not perfect. First of all, even though Go is really good at
hiding the extra syntactic burden of dealing with pointers most of the time,
in some cases a bit still leaks through (like taking the address of a built-in
literal, as shown above). Another potential issue is performance. Pointers often
mean heap allocation and may cause performance issues in some scenarios, though
when talking about option structs this is unlikely to be a problem.
