Surrogate Gradients

Spiking neurons use a discontinuous activation: a spike is emitted when the membrane voltage crosses a threshold. This discontinuity breaks standard backpropagation through time (BPTT), because the derivative of the spike function is zero almost everywhere and undefined at the threshold, so no useful gradient ever reaches the weights.

Surrogate gradients solve this by keeping the hard threshold in the forward pass and replacing the true derivative with a smooth approximation in the backward pass.
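The trick can be sketched in a few lines of plain PyTorch (this is a generic illustration, not btorch's internal implementation; the threshold at 0 and the `alpha` sharpness parameter are assumptions):

```python
import math
import torch

class ATanSpike(torch.autograd.Function):
    """Heaviside step forward, arc-tangent surrogate backward (a sketch)."""

    @staticmethod
    def forward(ctx, v, alpha=2.0):
        ctx.save_for_backward(v)
        ctx.alpha = alpha
        return (v >= 0).to(v)  # hard threshold at 0: spike = Heaviside(v)

    @staticmethod
    def backward(ctx, grad_output):
        (v,) = ctx.saved_tensors
        a = ctx.alpha
        # derivative of the smooth stand-in (1/pi) * atan(pi*a*v/2) + 1/2
        surrogate = a / (2 * (1 + (math.pi * a * v / 2) ** 2))
        return grad_output * surrogate, None  # no gradient for alpha

v = torch.tensor([-0.5, 0.0, 0.5], requires_grad=True)
spikes = ATanSpike.apply(v)   # forward: hard 0/1 spikes
spikes.sum().backward()       # backward: smooth, nonzero everywhere
```

The forward pass still emits binary spikes, but `v.grad` is now finite and nonzero even for sub-threshold voltages, which is what lets BPTT train the network.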

Available Surrogates

btorch provides several surrogate gradient functions in btorch.models.surrogate:

Class      Forward     Backward
--------   ---------   ------------------------------------
Sigmoid    Heaviside   Sigmoid derivative
ATan       Heaviside   Arc-tangent derivative
Triangle   Heaviside   Piecewise linear
Erf        Heaviside   Gaussian (error function) derivative
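The backward shapes differ mainly in how fast the gradient decays away from the threshold. The formulas below are common textbook parameterizations written relative to the threshold (v = 0); btorch's exact scale parameters may differ:

```python
import math

def sigmoid_grad(v, alpha=1.0):
    # derivative of sigmoid(alpha * v): peaks at the threshold, decays exponentially
    s = 1.0 / (1.0 + math.exp(-alpha * v))
    return alpha * s * (1.0 - s)

def atan_grad(v, alpha=2.0):
    # derivative of (1/pi) * atan(pi*alpha*v/2) + 1/2: polynomially decaying tails
    return alpha / (2.0 * (1.0 + (math.pi * alpha * v / 2.0) ** 2))

def triangle_grad(v, width=1.0):
    # piecewise-linear hat: exactly zero more than `width` from the threshold
    return max(0.0, 1.0 - abs(v) / width) / width

def erf_grad(v, sigma=0.5):
    # Gaussian density: the derivative of the error-function step approximation
    return math.exp(-(v / sigma) ** 2 / 2.0) / (sigma * math.sqrt(2.0 * math.pi))
```

All four peak at the threshold and shrink away from it; only Triangle reaches exactly zero at a finite distance.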

Usage

Most neuron constructors accept a surrogate_function argument:

from btorch.models.neurons import LIF
from btorch.models.surrogate import ATan

neuron = LIF(
    n_neuron=100,
    surrogate_function=ATan(),
)

If not specified, a sensible default (usually ATan) is used.

Choosing a Surrogate

  • ATan — Smooth, well-behaved gradients; good default for most tasks.
  • Sigmoid — Stronger gradient far from threshold; can help with very sparse activity.
  • Triangle — Computationally cheap; bounded support (the gradient is exactly zero beyond a fixed distance from threshold).
  • Erf — Very smooth; sometimes helps with optimization stability.
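Whichever surrogate you choose, the mechanics are the same: the forward pass keeps the hard threshold and the backward pass substitutes the smooth derivative. A self-contained sketch in plain PyTorch (not the btorch API; the threshold of 1.0, decay of 0.9, and detached subtractive reset are illustrative assumptions) shows BPTT flowing through an unrolled LIF neuron via a Triangle surrogate:

```python
import torch

class TriangleSpike(torch.autograd.Function):
    """Heaviside forward, piecewise-linear (triangle) surrogate backward."""

    @staticmethod
    def forward(ctx, v):
        ctx.save_for_backward(v)
        return (v >= 1.0).to(v)  # threshold fixed at 1.0 for this sketch

    @staticmethod
    def backward(ctx, grad_out):
        (v,) = ctx.saved_tensors
        # hat centred on the threshold; exactly zero more than 1.0 away from it
        return grad_out * torch.clamp(1.0 - (v - 1.0).abs(), min=0.0)

# One LIF neuron unrolled over 10 time steps; the input weight is learnable
w = torch.tensor(0.8, requires_grad=True)
v = torch.tensor(0.0)
spikes = []
for _ in range(10):
    v = 0.9 * v + w * 1.0           # leaky integration of a constant input
    s = TriangleSpike.apply(v)
    v = v - s.detach()              # subtractive reset (detached, a common choice)
    spikes.append(s)

torch.stack(spikes).sum().backward()  # BPTT succeeds: the surrogate carries the gradient
```

After `backward()`, `w.grad` is finite and nonzero even though every emitted spike was a hard 0 or 1.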

See the Neurons API for constructor details.