# Saving and Loading Models

You may wish to save models so that they can be loaded and run in a later session. The easiest way to do this is via BSON.jl.

Save a model:

```
julia> using Flux
julia> model = Chain(Dense(10 => 5, relu), Dense(5 => 2), softmax)
Chain(
  Dense(10 => 5, relu),  # 55 parameters
  Dense(5 => 2),  # 12 parameters
  NNlib.softmax,
)  # Total: 4 arrays, 67 parameters, 524 bytes.
julia> using BSON: @save
julia> @save "mymodel.bson" model
```

Load it again:

```
julia> using Flux # Flux must be loaded before calling @load
julia> using BSON: @load
julia> @load "mymodel.bson" model
julia> model
Chain(
  Dense(10 => 5, relu),  # 55 parameters
  Dense(5 => 2),  # 12 parameters
  NNlib.softmax,
)  # Total: 4 arrays, 67 parameters, 524 bytes.
```

Models are just normal Julia structs, so it's fine to use any Julia storage format for this purpose. BSON.jl is particularly well supported and most likely to be forwards compatible (that is, models saved now will load in future versions of Flux).

If a saved model's parameters are stored on the GPU, the model will not load later on if there is no GPU support available. It's best to move your model to the CPU with `cpu(model)` before saving it.
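As a minimal sketch of that advice, the snippet below moves a model to the CPU right before saving. The file name `mymodel.bson` matches the earlier example; the commented-out `gpu` call is only an assumption about where the model might have been trained.

```julia
using Flux
using BSON: @save

model = Chain(Dense(10 => 5, relu), Dense(5 => 2), softmax)
# model = gpu(model)   # assumption: the model may have been moved to a GPU for training

# `cpu` returns a model whose arrays live in ordinary CPU memory
# (for a model that is already on the CPU it changes nothing).
model = cpu(model)
@save "mymodel.bson" model
```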

Previous versions of Flux suggested saving only the model weights using `@save "mymodel.bson" params(model)`. This is no longer recommended and is strongly discouraged. Saving models this way stores only the trainable parameters, which results in incorrect behavior for layers like `BatchNorm`, whose running statistics are not trainable parameters.

```
julia> using Flux
julia> model = Chain(Dense(10 => 5, relu), Dense(5 => 2), softmax)
Chain(
  Dense(10 => 5, relu),  # 55 parameters
  Dense(5 => 2),  # 12 parameters
  NNlib.softmax,
)  # Total: 4 arrays, 67 parameters, 524 bytes.
julia> weights = Flux.params(model);
```

Loading the model as shown above will return a new model with the stored parameters. But sometimes you already have a model and want to load stored parameters into it. This can be done as follows:

```
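using Flux
using Flux: loadmodel!
using BSON

# some predefined model
model = Chain(Dense(10 => 5, relu), Dense(5 => 2), softmax)

# load one model into another; `@load` assigns variables and returns `nothing`,
# so read the saved model from the dictionary returned by `BSON.load`
model = loadmodel!(model, BSON.load("mymodel.bson")[:model])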
```

This ensures that the model loaded from `"mymodel.bson"` matches the structure of `model`. `Flux.loadmodel!` is also convenient for copying parameters between models in memory.

`Flux.loadmodel!` (function)

`loadmodel!(dst, src)`

Copy all the parameters (trainable and non-trainable) from `src` into `dst`.

Recursively walks `dst` and `src` together using `Functors.children`, calling `copyto!` on parameter arrays or throwing an error when there is a mismatch. Non-array elements (such as activation functions) are not copied and need not match. Zero bias vectors and `bias=false` are considered equivalent (see extended help for more details).

**Examples**

```
julia> dst = Chain(Dense(Flux.ones32(2, 5), Flux.ones32(2), tanh), Dense(2 => 1; bias = [1f0]))
Chain(
  Dense(5 => 2, tanh),  # 12 parameters
  Dense(2 => 1),  # 3 parameters
)  # Total: 4 arrays, 15 parameters, 316 bytes.
julia> dst[1].weight ≈ ones(2, 5) # by construction
true
julia> src = Chain(Dense(5 => 2, relu), Dense(2 => 1, bias=false));
julia> Flux.loadmodel!(dst, src);
julia> dst[1].weight ≈ ones(2, 5) # values changed
false
julia> iszero(dst[2].bias)
true
```

**Extended help**

Throws an error when:

- `dst` and `src` do not share the same fields (at any level)
- the sizes of leaf nodes are mismatched between `dst` and `src`
- copying non-array values to/from an array parameter (except inactive parameters described below)
- `dst` is a "tied" parameter (i.e. refers to another parameter) and loaded into multiple times with mismatched source values

Inactive parameters can be encoded by using the boolean value `false` instead of an array. If `dst == false` and `src` is an all-zero array, no error will be raised (and no values copied); however, attempting to copy a non-zero array to an inactive parameter will throw an error. Likewise, copying a `src` value of `false` to any `dst` array is valid, but copying a `src` value of `true` will error.
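The non-erroring case with an inactive destination can be sketched as follows. This is an illustration of the rules stated above rather than an exhaustive test; the layer sizes are arbitrary.

```julia
using Flux

dst = Dense(3 => 2; bias = false)   # inactive bias, stored as `false`
src = Dense(3 => 2)                 # default bias is an all-zero vector

# Allowed: an all-zero bias in `src` is treated as equivalent to `false` in `dst`.
Flux.loadmodel!(dst, src)

dst.weight == src.weight            # the weights were copied
dst.bias === false                  # the bias stays inactive
```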

## Checkpointing

In longer training runs it's a good idea to periodically save your model, so that you can resume if training is interrupted (for example, if there's a power cut). You can do this by saving the model in the callback provided to `train!`.

```
julia> using Flux
julia> using Flux: throttle
julia> using BSON: @save
julia> model = Chain(Dense(10 => 5, relu), Dense(5 => 2), softmax)
Chain(
  Dense(10 => 5, relu),  # 55 parameters
  Dense(5 => 2),  # 12 parameters
  NNlib.softmax,
)  # Total: 4 arrays, 67 parameters, 524 bytes.
julia> evalcb = throttle(30) do
         # Show loss
         @save "model-checkpoint.bson" model
       end;
```

This will update the `"model-checkpoint.bson"` file every thirty seconds.
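To show where such a callback fits, here is a minimal sketch that calls a throttled checkpoint function from a hand-written loop. The loop body is a placeholder and the name `checkpoint` is chosen for illustration; neither comes from Flux itself.

```julia
using Flux
using Flux: throttle
using BSON: @save

model = Chain(Dense(10 => 5, relu), Dense(5 => 2), softmax)

# Save at most once every 30 seconds, however often the callback is invoked.
checkpoint = throttle(30) do
    @save "model-checkpoint.bson" model
end

for epoch in 1:3
    # ... hypothetical: run one epoch of training on `model` here ...
    checkpoint()   # the first call saves immediately; calls within the next 30 s are skipped
end
```

Because the callback is throttled, it is cheap to call after every epoch or even every batch.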

You can get more advanced by saving a series of models throughout training. For example, `@save "model-$(now()).bson" model` will produce a series of models like `"model-2018-03-06T02:57:10.41.bson"` (`now` comes from the `Dates` standard library). You could also store the current test set loss, so that it's easy to (for example) revert to an older copy of the model if it starts to overfit:

`@save "model-$(now()).bson" model loss = testloss()`
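A minimal sketch of writing a timestamped checkpoint with a loss value and reading it back is shown below. Here `testloss` is a hypothetical stand-in for a real test-set evaluation, and the sketch calls `BSON.bson` and `BSON.load` directly with a dictionary instead of using the macros.

```julia
using Flux, BSON
using Dates: now

model = Chain(Dense(10 => 5, relu), Dense(5 => 2), softmax)

# Hypothetical stand-in for a real test-set evaluation.
testloss() = 0.5f0

# Store the model and the loss under explicit keys in a timestamped file.
path = "model-$(now()).bson"
BSON.bson(path, Dict(:model => model, :loss => testloss()))

# Later (with Flux loaded), inspect a checkpoint without overwriting any variables.
checkpoint = BSON.load(path)
checkpoint[:loss]    # the stored loss value
checkpoint[:model]   # the stored model
```

Comparing the stored loss across several such files is one way to pick which checkpoint to revert to.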

Note that to resume a model's training, you might need to restore other stateful parts of your training loop. Possible examples are stateful optimizers (which usually utilize an `IdDict` to store their state), and the randomness used to partition the original data into the training and validation sets.

You can store the optimizer state alongside the model, to resume training exactly where you left off. BSON is smart enough to cache values and insert links when saving, but only if it knows everything to be saved up front. The model and optimizer must therefore be saved together for the optimizer state to work after restoring.

```
using Dates: now  # `now()` is used in the file name

opt = Adam()
@save "model-$(now()).bson" model opt
```
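The restoring side might look like the sketch below. The fixed file name `checkpoint.bson` is an assumption made for illustration, so that the same path can be used for saving and loading.

```julia
using Flux
using BSON: @save, @load

model = Chain(Dense(10 => 5, relu), Dense(5 => 2), softmax)
opt = Adam()

# Save both objects in one call so that BSON sees them together.
@save "checkpoint.bson" model opt

# In a later session (with Flux loaded), restore both before resuming training.
@load "checkpoint.bson" model opt
```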