Published online by Cambridge University Press: 22 April 2025
We investigate causal computations, which take sequences of inputs to sequences of outputs such that the $n$th output depends only on the first $n$ inputs. We model these in category theory via a construction taking a Cartesian category $\mathbb{C}$ to another category $\mathrm{St}(\mathbb{C})$ with a novel trace-like operation called “delayed trace,” which lacks the yanking and dinaturality axioms of the usual trace. The delayed trace operation provides a feedback mechanism in $\mathrm{St}(\mathbb{C})$ with an implicit guardedness guarantee. When $\mathbb{C}$ is equipped with a Cartesian differential operator, we construct a differential operator for $\mathrm{St}(\mathbb{C})$ using an abstract version of backpropagation through time (BPTT), a technique from machine learning based on unrolling of functions. This yields a range of properties for BPTT, including a chain rule and a Schwarz theorem. Our differential operator can also compute the derivative of a stateful network without requiring the network to be unrolled.
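
As an informal illustration of the causality condition and of feedback that arrives one step late, the following minimal Python sketch models a causal computation as a state-threading step function. It is not the categorical construction $\mathrm{St}(\mathbb{C})$ itself, and the names `Step`, `run`, and `running_sum_step` are hypothetical, introduced here only for the example.

```python
from typing import Callable, List, Tuple

# Hypothetical type for illustration: (state, input) -> (new_state, output).
Step = Callable[[float, float], Tuple[float, float]]


def run(step: Step, init_state: float, inputs: List[float]) -> List[float]:
    """Apply `step` to each input in order, threading the state through."""
    state = init_state
    outputs = []
    for x in inputs:
        state, y = step(state, x)
        outputs.append(y)
    return outputs


def running_sum_step(state: float, x: float) -> Tuple[float, float]:
    """Add the new input to the previous output, which is fed back one step late."""
    total = state + x
    return total, total


xs = [1.0, 2.0, 3.0, 4.0]
ys = run(running_sum_step, 0.0, xs)

# Causality check: changing only the later inputs leaves earlier outputs unchanged.
zs = run(running_sum_step, 0.0, [1.0, 2.0, 99.0, -5.0])
assert ys[:2] == zs[:2]
```

Because each output is fed back only at the following step, the loop never needs its own current output in order to produce it, which is the informal sense of guardedness assumed in this sketch.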