`GRU` — function defined in module `Flux`

```
GRU(in => out)
```

Gated Recurrent Unit layer. Behaves like an RNN but generally exhibits a longer memory span over sequences. This implements the variant proposed in v1 of the referenced paper.

The integer arguments `in` and `out` describe the size of the feature vectors passed as input and as output. That is, it accepts a vector of length `in` or a batch of vectors represented as an `in x B` matrix and outputs a vector of length `out` or a batch of vectors of size `out x B`.

This constructor is syntactic sugar for `Recur(GRUCell(a...))`, and so GRUs are stateful. Note that the state shape can change depending on the inputs, so it is good practice to `reset!` the model between inference calls if the batch size changes. See the examples below.
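Because the layer is stateful, repeated calls thread the hidden state through a sequence. A minimal sketch (assuming `Flux` is loaded; the sequence and sizes here are illustrative, not from the docstring):

```
using Flux

g = GRU(3 => 5)

xs = [rand(Float32, 3) for _ in 1:4]   # a sequence of 4 feature vectors
ys = [g(x) for x in xs]                # each call advances g's internal state

Flux.reset!(g)                         # restore state0 before the next sequence
```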

See this article for a good overview of the internals.

```
julia> g = GRU(3 => 5)
Recur(
  GRUCell(3 => 5),                      # 140 parameters
)         # Total: 4 trainable arrays, 140 parameters,
          # plus 1 non-trainable, 5 parameters, summarysize 792 bytes.

julia> g(rand(Float32, 3)) |> size
(5,)

julia> Flux.reset!(g);

julia> g(rand(Float32, 3, 10)) |> size  # batch size of 10
(5, 10)
```

Batch size changes

Failing to call `reset!` when the input batch size changes can lead to unexpected behavior. See the example in `RNN`.
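A minimal sketch of the safe pattern, calling `reset!` before switching batch sizes (assuming `Flux` is loaded; the specific sizes are illustrative):

```
using Flux

g = GRU(3 => 5)

g(rand(Float32, 3, 10))        # batch of 10; the state becomes a 5×10 matrix
Flux.reset!(g)                 # discard it, back to the learnable initial state
y = g(rand(Float32, 3, 2))     # a batch of 2 now works as expected
size(y)                        # (5, 2)
```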

`GRUCell`s can be constructed directly by specifying the non-linear function, the `Wi` and `Wh` internal matrices, a bias vector `b`, and a learnable initial state `state0`. The `Wi` and `Wh` matrices do not need to be the same type. See the example in `RNN`.
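The direct construction described above can be sketched as follows. The field order `(Wi, Wh, b, state0)` and the stacked `3*out` layout of the gate weights are assumptions about Flux's internal `GRUCell` struct rather than documented API:

```
using Flux

n_in, n_out = 3, 5
Wi = randn(Float32, n_out * 3, n_in)   # stacked gate input weights (assumed layout)
Wh = randn(Float32, n_out * 3, n_out)  # stacked hidden-to-hidden weights
b  = zeros(Float32, n_out * 3)         # bias vector
state0 = zeros(Float32, n_out, 1)      # learnable initial state

g = Flux.Recur(Flux.GRUCell(Wi, Wh, b, state0))
y = g(rand(Float32, n_in))             # one forward step
```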

Methods

There is 1 method for `Flux.GRU`:
