Differentiable causal computations via delayed trace (extended version)

David Sprunger; Shin-ya Katsumata

doi:10.1017/S0960129524000331

Differentiable causal computations via delayed trace (extended version)

Published online by Cambridge University Press: 22 April 2025

David Sprunger and

Shin-ya Katsumata

Show author details

David Sprunger*: Affiliation:
Indiana State University, Terre Haute, IN, USA
Shin-ya Katsumata: Affiliation:
National Institute of Informatics, Chiyoda, Tokyo, Japan
*: Corresponding author: David Sprunger; Email: [email protected]

Article contents

Abstract
References

Get access

Rights & Permissions

Abstract

We investigate causal computations, which take sequences of inputs to sequences of outputs such that the $n$th output depends on the first $n$ inputs only. We model these in category theory via a construction taking a Cartesian category $\mathbb{C}$ to another category $\mathrm{St}(\mathbb{C})$ with a novel trace-like operation called “delayed trace,” which misses yanking and dinaturality axioms of the usual trace. The delayed trace operation provides a feedback mechanism in $\mathrm{St}(\mathbb{C})$ with an implicit guardedness guarantee. When $\mathbb{C}$ is equipped with a Cartesian differential operator, we construct a differential operator for $\mathrm{St}(\mathbb{C})$ using an abstract version of backpropagation through time (BPTT), a technique from machine learning based on unrolling of functions. This obtains a swath of properties for BPTT, including a chain rule and Schwartz theorem. Our differential operator is also able to compute the derivative of a stateful network without requiring the network to be unrolled.

Keywords

Delayed trace operators artesian differential categories recurrent neural networks backpropagation through time signal flow graphs

Type: Special Issue: Differential Structures
Information: Mathematical Structures in Computer Science , Volume 35 , 2025 , e10

DOI: https://doi.org/10.1017/S0960129524000331 [Opens in a new window]
Copyright: © The Author(s), 2025. Published by Cambridge University Press

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Article purchase

Temporarily unavailable

References

Abel, A. and Pientka, B. (2013). Wellfounded recursion with copatterns: A unified approach to termination and productivity. In: Morrisett, G. and Uustalu, T. (eds.), ACM SIGPLAN International Conference on Functional Programming, ICFP’13, Boston, MA, USA - September 25-27, 2013. ACM, 185–196.Google Scholar

Baez, J. C. and Master, J. (2020). Open petri nets. Mathematical Structures in Computer Science 30 (3), 314–341.CrossRef Google Scholar

Basold, H., Bonsangue, M., Hansen, H. and Rutten, J. (2014). (Co)Algebraic Characterizations of Signal Flow Graphs. Springer International Publishing, 124–145.CrossRef Google Scholar

Blute, R., Cockett, J. and Seely, R. (2006). Differential categories. Mathematical Structures in Computer Science 16 (6) 1049–1083.CrossRef Google Scholar

Blute, R., Cockett, J. and Seely, R. (2009). Cartesian differential categories. Theory and Applications of Categories 22 (23) 622–672.Google Scholar

Bonchi, F., Holland, J., Piedeleu, R., Sobociński, P. and Zanasi, F. (2019). Diagrammatic algebra: From linear to concurrent systems. Proceedings of the ACM on Programming Languages 3 (POPL), 128.CrossRef Google Scholar

Bonchi, F., Sobociński, P. and Zanasi, F. (2014). A categorical semantics of signal flow graphs. In Baldan, P. and Gorla, D. (eds.), CONCUR 2014 - Concurrency Theory - 25th International Conference, CONCUR 2014, Rome, Italy, September 2-5, 2014. Proceedings, vol. 8704. Lecture Notes in Computer Science. Springer, 435–450.CrossRef Google Scholar

Bonchi, F., Sobociński, P. and Zanasi, F. (2015). Full abstraction for signal flow graphs. In Rajamani, S. K. and Walker, D. (eds.), Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2015, Mumbai, India, January 15-17, 2015. ACM, 515–526.CrossRef Google Scholar

Carette, T., de Visme, M. and Perdrix, S. (2021). Graphical language with delayed trace: Picturing quantum computing with finite memory. In 36th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2021, Rome, Italy, June 29 - July 2, 2021. IEEE, 1–13.CrossRef Google Scholar

Cockett, J. and Cruttwell, G. (2014). Differential structure, tangent structure, and SDG. Applied Categorical Structures 22 (2) 331–417.CrossRef Google Scholar

Cockett, J., Cruttwell, G. and Gallagher, J. (2011). Differential restriction categories. Theory and Applications of Categories 25 (21) 537–613.Google Scholar

Cockett, R., Cruttwell, G., Gallagher, J., Pacaud Lemay, J.-S., MacAdam, B., Plotkin, G. and Pronk, D. (2020). Reverse derivative categories. In Fernández, M. and Muscholl, A. (eds.), 28th EACSL Annual Conference on Computer Science Logic (CSL 2020), vol. 152. Leibniz International Proceedings in Informatics (LIPIcs), Dagstuhl, Germany. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 18:1–18:16.Google Scholar

Crochiere, R. and Oppenheim, A. (1975). Analysis of linear digital networks. Proceedings of the IEEE 63 (4) 581–595.CrossRef Google Scholar

Cruttwell, G. S. H. (2017). Cartesian differential categories revisited. Mathematical Structures in Computer Science 27 (1) 70–91.CrossRef Google Scholar

Cruttwell, G. S. H., Gavranović, B., Ghani, N., Wilson, P. and Zanasi, F. (2022). Categorical foundations of gradient-based learning. In Sergey, I. (ed.), Programming Languages and Systems. Springer International Publishing, 1–28.CrossRef Google Scholar

Ehresmann, C. (1963). Catégories structurées. Annales scientifiques de l’École Normale Supérieure 80 (4) 349–426.CrossRef Google Scholar

Ehrhard, T. and Regnier, L. (2003). The differential lambda-calculus. Theoretical Computer Science 309 (1–3) 1–41.CrossRef Google Scholar

Ehrhard, T. and Regnier, L. (2005). Differential interaction nets. Theoretical Computer Science 123 35–74.Google Scholar

Elliott, C. (2018). The simple essence of automatic differentiation. PACMPL 2 (ICFP) 70:1–70:29.Google Scholar

Fong, B., Spivak, D. and Tuys, R. (2019). Backprop as functor: A compositional perspective on supervised learning. In 2019 34th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), 1–13. https://doi.org/10.1109/LICS.2019.8785665CrossRef Google Scholar

Gadducci, F. and Montanari, U. (2000). The tile model. In Plotkin, G. D., Stirling, C. and Tofte, M. (eds.), Proof, Language, and Interaction, Essays in Honour of Robin Milner. The MIT Press, 133–166.CrossRef Google Scholar

Ghica, D. and Jung, A. (2016). Categorical semantics of digital circuits. In Proceedings of the 16th Conference on Formal Methods in Computer-Aided Design, FMCAD ’16, FMCAD Inc., 41–48.CrossRef Google Scholar

Ghica, D., Jung, A. and Lopez, A. (2017). Diagrammatic semantics for digital circuits. In Goranko, V. and Dam, M. (eds.), 26th EACSL Annual Conference on Computer Science Logic, CSL 2017, August 20-24, 2017, Stockholm, Sweden, vol. 82. LIPIcs. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 24:1–24:16.Google Scholar

Goncharov, S. and Schröder, L. (2018). Guarded traced categories. In Baier, C. and Lago, U. D. (eds.), Foundations of Software Science and Computation Structures. Springer International Publishing, 313–330.CrossRef Google Scholar

Goodfellow, I., Bengio, Y. and Courville, A. (2016). Deep Learning. MIT Press.Google Scholar

Hansen, H., Kupke, C. and Rutten, J. (2017). Stream differential equations: Specification formats and solution methods. Logical Methods in Computer Science 13 (1) 1–52.Google Scholar

Hoshino, N., Muroya, K., and Hasuo, I. (2014). Memoryful geometry of interaction: From coalgebraic components to algebraic effects. In Proceedings of the Joint Meeting of the Twenty-Third EACSL Annual Conference on Computer Science Logic (CSL) and the Twenty-Ninth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), CSL-LICS ’14, Association for Computing Machinery, 52:1–52:10.Google Scholar

Joyal, A., Street, R. and Verity, D. (1996). Traced monoidal categories. Mathematical Proceedings of the Cambridge Philosophical Society 119 447–468.CrossRef Google Scholar

Katis, P., Sabadini, N. and Walters, R. F. C. (1997). Bicategories of processes. Journal of Pure and Applied Algebra 115 (2), 141–178.CrossRef Google Scholar

Kissinger, A. and Uijlen, S. (2017). A categorical semantics for causal structure. In 32nd Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2017, Reykjavik, Iceland, June 20-23, 2017. IEEE Computer Society, 1–12.CrossRef Google Scholar

Lavore, E. D., de Felice, G. and Román, M. (2022). Monoidal streams for dataflow programming. In Baier, C. and Fisman, D. (eds.), LICS ’22: 37th Annual ACM/IEEE Symposium on Logic in Computer Science, Haifa, Israel, August 2-5, 2022. ACM, 51:1–51:14.Google Scholar

Leiserson, C. E. and Saxe, J. B. (1991). Retiming synchronous circuitry. Algorithmica 6 (1) 5–35.CrossRef Google Scholar

Mac Lane, S. (1998). Categories for the Working Mathematician, vol. 5. Graduate Texts in Mathematics, Springer Science & Business Media.Google Scholar

Milius, S. (2010). A sound and complete calculus for finite stream circuits. In Proceedings of the 25th Annual IEEE Symposium on Logic in Computer Science, LICS 2010, 11-14 July 2010, Edinburgh, United Kingdom. IEEE Computer Society, 421–430.CrossRef Google Scholar

Møgelberg, R. E. (2014). A type theory for productive coprogramming via guarded recursion. In Proceedings of the Joint Meeting of the Twenty-Third EACSL Annual Conference on Computer Science Logic (CSL) and the Twenty-Ninth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), CSL-LICS ’14, Association for Computing Machinery, 71:1–71:10.Google Scholar

Parhi, K. K. and Chen, Y. (2013). Signal Flow Graphs and Data Flow Graphs, Springer New York, 1277–1302.CrossRef Google Scholar

Rutten, J. (2005). A coinductive calculus of streams. Mathematical Structures in Computer Science 15 (1) 93–147.CrossRef Google Scholar

Rutten, J. (2008). Rational streams coalgebraically. Logical Methods in Computer Science 4 (3) 1–22.Google Scholar

Sprunger, D. and Katsumata, S.-y. (2019). Differentiable causal computations via delayed trace. In 2019 34th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), 1–12.CrossRef Google Scholar

Uustalu, T. and Vene, V. (2005). Signals and comonads. Journal of Universal Computer Science 11 (7) 1310–1326.Google Scholar

Werbos, P. J. (1990). Backpropagation through time: What it does and how to do it. Proceedings of the IEEE 78 (10) 1550–1560.CrossRef Google Scholar

Zanasi, F. (2015). Interacting Hopf Algebras – The Theory of Linear Systems. (Interacting Hopf Algebras - la théorie des systèmes linéaires). PhD thesis, École normale supérieure de Lyon, France.Google Scholar

Article contents

Differentiable causal computations via delayed trace (extended version)

Abstract

Keywords

Access options

Article purchase

Temporarily unavailable

References

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests