`LSTM`

`function` defined in module `Flux`

```
LSTM(in => out)
```

Long Short Term Memory recurrent layer. Behaves like an RNN but generally exhibits a longer memory span over sequences.

The arguments `in` and `out` describe the size of the feature vectors passed as input and as output. That is, it accepts a vector of length `in` or a batch of vectors represented as an `in x B` matrix and outputs a vector of length `out` or a batch of vectors of size `out x B`.
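To make the shape rule concrete, here is a minimal plain-Julia sketch of a single LSTM step. It is not Flux's implementation: the names `sigm` and `lstm_step`, the stacked `4 * out` weight layout, and the gate ordering (input, forget, cell, output) are assumptions made for illustration. It only shows why a length-`in` vector yields a length-`out` vector and an `in x B` matrix yields an `out x B` matrix.

```julia
# Logistic sigmoid, defined here so the sketch needs no packages.
sigm(x) = 1 / (1 + exp(-x))

# One LSTM step (illustrative, not Flux code).
# Wi: (4*nout, nin), Wh: (4*nout, nout), b: length 4*nout,
# (h, c): hidden and cell state, x: a vector or an (nin, B) matrix.
function lstm_step(Wi, Wh, b, (h, c), x)
    n = size(h, 1)
    g = Wi * x .+ Wh * h .+ b                      # all four gates, stacked row-wise
    gate(k) = selectdim(g, 1, ((k - 1) * n + 1):(k * n))
    i, f, o = sigm.(gate(1)), sigm.(gate(2)), sigm.(gate(4))
    c_new = f .* c .+ i .* tanh.(gate(3))          # update the cell state
    h_new = o .* tanh.(c_new)                      # new hidden state, also the output
    return (h_new, c_new), h_new
end

nin, nout, B = 3, 5, 10
Wi, Wh, b = randn(4 * nout, nin), randn(4 * nout, nout), zeros(4 * nout)
state0 = (zeros(nout), zeros(nout))

_, y = lstm_step(Wi, Wh, b, state0, rand(nin))     # single vector input
_, Y = lstm_step(Wi, Wh, b, state0, rand(nin, B))  # batch of B vectors
size(y), size(Y)
```

Because the state is combined with the input by broadcasting, the returned state takes on the batch width of the input in this sketch.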

This constructor is syntactic sugar for `Recur(LSTMCell(a...))`, and so LSTMs are stateful. Note that the state shape can change depending on the inputs, and so it is good to `reset!` the model between inference calls if the batch size changes. See the examples below.

See this article for a good overview of the internals.

```
julia> l = LSTM(3 => 5)
Recur(
  LSTMCell(3 => 5),  # 190 parameters
)         # Total: 5 trainable arrays, 190 parameters,
          # plus 2 non-trainable, 10 parameters, summarysize 1.062 KiB.

julia> l(rand(Float32, 3)) |> size
(5,)

julia> Flux.reset!(l);

julia> l(rand(Float32, 3, 10)) |> size # batch size of 10
(5, 10)
```
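The `190 parameters` reported above can be reproduced by hand. The sketch below assumes the cell keeps its four gates stacked in one input weight matrix, one recurrent weight matrix, and one bias vector, plus a learnable initial hidden and cell state (which would account for the five trainable arrays). `lstm_param_count` is a hypothetical helper written for this illustration, not a Flux function.

```julia
# Hypothetical helper (not part of Flux): parameter count of one LSTM cell,
# assuming the four gates are stacked into single arrays.
function lstm_param_count(nin::Integer, nout::Integer)
    wi     = 4 * nout * nin    # input weights,     size (4*nout, nin)
    wh     = 4 * nout * nout   # recurrent weights, size (4*nout, nout)
    b      = 4 * nout          # bias,              length 4*nout
    state0 = 2 * nout          # initial hidden and cell state, nout each
    return wi + wh + b + state0
end

lstm_param_count(3, 5)  # 60 + 100 + 20 + 10 = 190
```

Under the same assumption, the 10 non-trainable parameters in the summary would correspond to the current hidden and cell state held by the recurrent wrapper, 5 values each.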

Batch size changes

Failing to call `reset!` when the input batch size changes can lead to unexpected behavior. See the example in `RNN`.
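The mechanism behind this warning can be sketched with a toy stateful wrapper in plain Julia. `StatefulSum` and `reset_state!` are hypothetical stand-ins for a recurrent wrapper and for `reset!`; they are not Flux code, and the exact symptom in Flux may differ. The point is only that a stored state which has taken on one batch size either fails or silently produces the wrong shape when the next input has another.

```julia
# Toy stateful layer (illustrative only): the state is a running sum of inputs.
mutable struct StatefulSum
    state0::Matrix{Float64}   # initial state, size (n, 1)
    state::Matrix{Float64}    # current state, becomes (n, B) after a batched call
end
StatefulSum(n::Integer) = (s = zeros(n, 1); StatefulSum(s, s))

function (m::StatefulSum)(x)
    m.state = m.state .+ x    # broadcasting widens an (n, 1) state to (n, B)
    return m.state
end

# Plays the role of reset!: restore the initial state.
reset_state!(m::StatefulSum) = (m.state = m.state0; m)

m = StatefulSum(2)
size(m(ones(2, 3)))   # (2, 3): the state now carries a batch size of 3
size(m(ones(2)))      # (2, 3): a single vector silently inherits the old batch size
# m(ones(2, 4)) would throw a DimensionMismatch here, since (2, 3) and (2, 4) cannot broadcast
reset_state!(m)
size(m(ones(2, 4)))   # (2, 4): works again after resetting
```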

`LSTMCell`s can be constructed directly by specifying the non-linear function, the `Wi` and `Wh` internal matrices, a bias vector `b`, and a learnable initial state `state0`. The `Wi` and `Wh` matrices do not need to be the same type. See the example in `RNN`.

Methods

There is 1 method for `Flux.LSTM`.
