Decoding custom formats with Viper

Originally published at https://sagikazarmark.hu.

A frequently requested feature for Viper is support for more value formats and decoders, for example parsing character-separated (dot, comma, semicolon, etc.) strings into slices. What most people don’t know is that Viper can already be extended with custom decoders without adding any more code to the core.

tl;dr Show me the code!

The old way

One of the primary reasons for Viper’s popularity is its simple interface that provides easy access to config values. Of course I’m talking about getters!

s := viper.GetString("key.otherkey")
i := viper.GetInt("server.address.port")

Unfortunately, this simple interface also makes Viper bloated at the same time: in order to support a wide range of use cases Viper needs to have tons of these getters. And people want more!

But adding more and more getters cannot be the solution for supporting more formats. There has to be another way!

Another problem with getters (combined with the global Viper instance) is that they encourage the bad practice of calling Viper from all over the place, not just from the main function. This has numerous disadvantages: it hardwires Viper into the application, makes testing much harder, etc.

Although this is slightly off topic for this post, it’s worth mentioning because the solution for both problems is the same.

Getting complex data structures

Let’s take a look at the “vendor lock-in” (Viper leaving the main function) problem first. How can we avoid Viper leaking into places where it shouldn’t be?

Well, we could start passing around config structs instead of calling Viper. There is only one problem: in case of large and complex structures it takes ages to construct these structs using getters:

config := Config{
    Server: ServerConfig{
        Address: viper.GetString("server.address"),
    },

    Log: LogConfig{
        Level:  viper.GetString("log.level"),
        Format: viper.GetString("log.format"),
    },

    // ...
}

If only there was a way to automatically unmarshal config values (similarly to encoding packages)…

If you are familiar with Viper, you should know the answer by now: of course there is a way! And it’s even called Unmarshal (and UnmarshalKey):

var config Config

// Try to unmarshal everything...
err := viper.Unmarshal(&config)
if err != nil {
    // ...
}

var serverAddress string

// ...or just get a single key
// (plain assignment: err is already declared above)
err = viper.UnmarshalKey("server.address", &serverAddress)
if err != nil {
    // ...
}

Under the hood, both functions use a library called mapstructure. Mapstructure allows you to unmarshal values from map[string]interface{} (which is exactly how Viper stores configuration internally) to your custom types.

The unmarshal functions provide a better experience for getting values out of Viper, because you don’t need to use getters to build complex structs.
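The Config type used in the snippets above is never shown in this post. Here is a minimal sketch of how it might look, assuming the key names used earlier (server.address, log.level, log.format). The Server type and NewServer constructor are hypothetical; they are only there to illustrate a component that depends on a plain struct instead of on Viper:

```go
package main

import (
	"errors"
	"fmt"
)

// Config mirrors the keys used above. The mapstructure tags are optional
// here, because field names are matched case-insensitively by default;
// they are spelled out for clarity.
type Config struct {
	Server ServerConfig `mapstructure:"server"`
	Log    LogConfig    `mapstructure:"log"`
}

type ServerConfig struct {
	Address string `mapstructure:"address"`
}

type LogConfig struct {
	Level  string `mapstructure:"level"`
	Format string `mapstructure:"format"`
}

// Server is a hypothetical component. It only knows about ServerConfig,
// so it can be constructed in a test without touching Viper at all.
type Server struct {
	addr string
}

// NewServer validates the config it is given and builds a Server from it.
func NewServer(c ServerConfig) (*Server, error) {
	if c.Address == "" {
		return nil, errors.New("server address must not be empty")
	}

	return &Server{addr: c.Address}, nil
}

func main() {
	// In a real program this value would be filled by viper.Unmarshal(&config).
	config := Config{Server: ServerConfig{Address: "127.0.0.1:8080"}}

	srv, err := NewServer(config.Server)
	if err != nil {
		panic(err)
	}

	fmt.Println(srv.addr) // 127.0.0.1:8080
}
```

With this shape, Viper only appears in the main function: everything else receives an already populated struct.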

Decode custom formats

Now that we’ve cleared up the “vendor lock-in” problem with Viper, let’s focus on our original topic: decoding custom value formats. I already told you that the solution for both issues is the same, so if mapstructure is the answer to one, it has to be the answer to the other as well.

And of course it is! Or rather a feature of mapstructure, called a decode hook. A decode hook is a function that’s called for every leaf in the config tree as mapstructure iterates over it. The hook can examine the data, the data type and the target type and can return a new (parsed) value if necessary.

Let’s take a look at a general example:

func SomeHookFunc() mapstructure.DecodeHookFuncType {
    // Wrapped in a function call to add optional input
    // parameters (e.g. separator)
    return func(
        f reflect.Type, // data type
        t reflect.Type, // target data type
        data interface{}, // raw data
    ) (interface{}, error) {
        // Check if the data type matches the expected one
        if f.Kind() != reflect.String {
            return data, nil
        }

        // Check if the target type matches the expected one
        if t != reflect.TypeOf(MyType{}) {
            return data, nil
        }

        // Format/decode/parse the data and return the new value.
        // The conversion below is a placeholder: replace it with
        // whatever parsing MyType needs.
        return MyType(data.(string)), nil
    }
}

When chained together, decode hooks are executed sequentially (without stopping propagation), and each hook receives whatever the previous one returned. So each hook needs to check that both the source and the target data type indicate that the data is intended for it. Hooks must also be properly ordered for the exact same reason (i.e. more generic hooks need to come later in the chain).
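To see why ordering matters, here is a simplified, standard-library-only model of a hook chain. Everything in it (hookFunc, compose, genericSlice, dotList, DotList, decode) is a hypothetical stand-in written for this illustration. It is not mapstructure’s implementation, it only mimics the idea that each hook is fed the previous hook’s output:

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// hookFunc has the same shape as a decode hook:
// source type, target type and raw data in, new data out.
type hookFunc func(from, to reflect.Type, data interface{}) (interface{}, error)

// DotList is the custom target type of the specific hook.
type DotList []string

// compose runs the hooks in order. After each hook, the source type
// is updated to the type of the value that hook returned.
func compose(hooks ...hookFunc) hookFunc {
	return func(from, to reflect.Type, data interface{}) (interface{}, error) {
		var err error
		for _, h := range hooks {
			data, err = h(from, to, data)
			if err != nil {
				return nil, err
			}
			from = reflect.TypeOf(data)
		}
		return data, nil
	}
}

// genericSlice splits any string on commas when the target is a slice.
func genericSlice(from, to reflect.Type, data interface{}) (interface{}, error) {
	if from.Kind() != reflect.String || to.Kind() != reflect.Slice {
		return data, nil
	}
	return strings.Split(data.(string), ","), nil
}

// dotList splits a string on dots, but only for the DotList target type.
func dotList(from, to reflect.Type, data interface{}) (interface{}, error) {
	if from.Kind() != reflect.String || to != reflect.TypeOf(DotList{}) {
		return data, nil
	}
	return DotList(strings.Split(data.(string), ".")), nil
}

// decode runs a hook chain against a raw string with DotList as the target.
func decode(h hookFunc, raw string) interface{} {
	out, _ := h(reflect.TypeOf(raw), reflect.TypeOf(DotList{}), raw)
	return out
}

func main() {
	// Specific hook first: the string is split on dots as intended.
	fmt.Printf("%#v\n", decode(compose(dotList, genericSlice), "a.b"))
	// main.DotList{"a", "b"}

	// Generic hook first: it consumes the string, so dotList never sees one.
	fmt.Printf("%#v\n", decode(compose(genericSlice, dotList), "a.b"))
	// []string{"a.b"}
}
```

In the second call the generic hook has already turned the string into a []string, so the specific hook’s source type check fails and it passes the value through untouched.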

In Viper, decode hooks can be passed to the Unmarshal and UnmarshalKey functions:

viper.Unmarshal(&config, viper.DecodeHook(hookFunc))

// OR

viper.UnmarshalKey("key", &config, viper.DecodeHook(hookFunc))

Viper also comes with a set of default hooks. Passing a custom decode hook to one of the above functions replaces these defaults rather than adding to them, so include them in your own chain if you still need them. The defaults are:

mapstructure.ComposeDecodeHookFunc(
    mapstructure.StringToTimeDurationHookFunc(),
    mapstructure.StringToSliceHookFunc(","),
)

Advanced decoding patterns

Let’s see a couple of practical examples using decode hooks.

Complete example with custom type

Let’s say you want to split a dot-separated string into a slice (e.g. coming from an environment variable). The first thing you need is a custom type:

type DotSeparatedStringList []string

Then you need to define a decode hook that splits a string and returns it as the custom type. Based on the generic example above, it should look something like this:

func DotSeparatedStringListHookFunc() mapstructure.DecodeHookFuncType {
    return func(
        f reflect.Type,
        t reflect.Type,
        data interface{},
    ) (interface{}, error) {
        // Check that the data is a string
        if f.Kind() != reflect.String {
            return data, nil
        }

        // Check that the target type is our custom type
        if t != reflect.TypeOf(DotSeparatedStringList{}) {
            return data, nil
        }

        // Return the parsed value
        return DotSeparatedStringList(strings.Split(data.(string), ".")), nil
    }
}

Then all you have to do is use this custom type to unmarshal a dot separated value:

v := viper.New()

// This could come from an environment variable or any other config source
v.Set("key", "foo.bar.baz.bat")

var s DotSeparatedStringList

v.UnmarshalKey("key", &s, viper.DecodeHook(DotSeparatedStringListHookFunc()))

fmt.Printf("Dot separated list (DotSeparatedStringListHookFunc): %#v\n", s)
// Dot separated list (DotSeparatedStringListHookFunc): main.DotSeparatedStringList{"foo", "bar", "baz", "bat"}

TextUnmarshaler example

You don’t necessarily have to write a complete decode hook every time. An alternative is implementing the encoding.TextUnmarshaler interface and using a decode hook that comes with mapstructure.

The first step is again creating a custom type, this time one that implements the above-mentioned interface:

// SemicolonSeparatedStringList is a string list that implements
// encoding.TextUnmarshaler and decodes a semicolon separated string list.
type SemicolonSeparatedStringList []string

func (s *SemicolonSeparatedStringList) UnmarshalText(text []byte) error {
    *s = strings.Split(string(text), ";")

    return nil
}

Then (based on the above example) you can use the custom type and the decode hook:

v := viper.New()

// This could come from an environment variable or any other config source
v.Set("key", "foo;bar;baz;bat")

var s SemicolonSeparatedStringList

v.UnmarshalKey("key", &s, viper.DecodeHook(mapstructure.TextUnmarshallerHookFunc()))

fmt.Printf("Semicolon separated list (TextUnmarshallerHookFunc): %#v\n", s)
// Semicolon separated list (TextUnmarshallerHookFunc): main.SemicolonSeparatedStringList{"foo", "bar", "baz", "bat"}
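A side benefit of this approach is that encoding.TextUnmarshaler is a standard library interface, so the same type also works with other decoders that honor it. As a sketch, encoding/json calls UnmarshalText when it meets a JSON string for such a type. The payload struct and the decodeTags helper are hypothetical names used only for this illustration:

```go
package main

import (
	"encoding"
	"encoding/json"
	"fmt"
	"strings"
)

type SemicolonSeparatedStringList []string

func (s *SemicolonSeparatedStringList) UnmarshalText(text []byte) error {
	*s = strings.Split(string(text), ";")

	return nil
}

// Compile-time check that the pointer type implements the interface.
var _ encoding.TextUnmarshaler = (*SemicolonSeparatedStringList)(nil)

// payload is a hypothetical JSON document shape.
type payload struct {
	Tags SemicolonSeparatedStringList `json:"tags"`
}

// decodeTags parses a JSON document and returns the decoded list.
func decodeTags(doc string) (SemicolonSeparatedStringList, error) {
	var p payload
	if err := json.Unmarshal([]byte(doc), &p); err != nil {
		return nil, err
	}

	return p.Tags, nil
}

func main() {
	tags, err := decodeTags(`{"tags": "foo;bar;baz"}`)
	if err != nil {
		panic(err)
	}

	fmt.Printf("%#v\n", tags)
	// main.SemicolonSeparatedStringList{"foo", "bar", "baz"}
}
```

The compile-time assertion is a cheap way to catch a wrong method signature early, because a type that does not implement the interface is not converted by the hook.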

Builtin comma separated string example

As mentioned earlier, Viper comes with a default set of decode hooks. One of those hooks splits a string by a comma separator when the target type is a slice: mapstructure.StringToSliceHookFunc(",").

Using this hook, you can avoid defining custom types and configuring decode hooks. You can simply unmarshal a comma separated string using a string slice as the target:

v := viper.New()

// This could come from an environment variable or any other config source
v.Set("key", "foo,bar,baz,bat")

var s []string

v.UnmarshalKey("key", &s)

fmt.Printf("Comma separated list (builtin decode hook function): %#v\n", s)
// Comma separated list (builtin decode hook function): []string{"foo", "bar", "baz", "bat"}

Conclusion

Viper’s config decoding mechanism is already powerful enough to serve most needs that may arise for parsing custom value formats. In addition, it saves Viper from becoming more bloated with builtin getters and decoder functions by offloading this task to custom target types and decode hooks.

For more decode hook examples check out the mapstructure documentation and source code.

Software engineer, Open Source enthusiast. Prefer solving architectural problems over coding. Currently hacking Kubernetes.