Surrogate Gradients¶

Spiking neurons use a discontinuous activation (a spike is emitted when the membrane voltage crosses a threshold). This discontinuity makes standard backpropagation through time (BPTT) impossible because the gradient of the spike function is zero almost everywhere.

Surrogate gradients solve this by replacing the true gradient with a smooth approximation during the backward pass.

Available Surrogates¶

btorch provides several surrogate gradient functions in btorch.models.surrogate:

Class	Forward	Backward
`Sigmoid`	Heaviside	Sigmoid derivative
`ATan`	Heaviside	Arc-tangent derivative
`Triangle`	Heaviside	Piecewise linear
`Erf`	Heaviside	Gaussian (error function) derivative

Usage¶

Most neuron constructors accept a surrogate_function argument:

from btorch.models.neurons import LIF
from btorch.models.surrogate import ATan

neuron = LIF(
    n_neuron=100,
    surrogate_function=ATan(),
)

If not specified, a sensible default (usually ATan) is used.

Choosing a Surrogate¶

ATan — Smooth, well-behaved gradients; good default for most tasks.
Sigmoid — Stronger gradient far from threshold; can help with very sparse activity.
Triangle — Computationally cheap; bounded support.
Erf — Very smooth; sometimes helps with optimization stability.

See the Neurons API for constructor details.