Hostname: page-component-669899f699-swprf Total loading time: 0 Render date: 2025-04-30T01:13:49.017Z Has data issue: false hasContentIssue false

Differentiable causal computations via delayed trace (extended version)

Published online by Cambridge University Press:  22 April 2025

David Sprunger*
Affiliation:
Indiana State University, Terre Haute, IN, USA
Shin-ya Katsumata
Affiliation:
National Institute of Informatics, Chiyoda, Tokyo, Japan
*
Corresponding author: David Sprunger; Email: [email protected]

Abstract

We investigate causal computations, which take sequences of inputs to sequences of outputs such that the $n$th output depends on the first $n$ inputs only. We model these in category theory via a construction taking a Cartesian category $\mathbb{C}$ to another category $\mathrm{St}(\mathbb{C})$ with a novel trace-like operation called “delayed trace,” which misses yanking and dinaturality axioms of the usual trace. The delayed trace operation provides a feedback mechanism in $\mathrm{St}(\mathbb{C})$ with an implicit guardedness guarantee. When $\mathbb{C}$ is equipped with a Cartesian differential operator, we construct a differential operator for $\mathrm{St}(\mathbb{C})$ using an abstract version of backpropagation through time (BPTT), a technique from machine learning based on unrolling of functions. This obtains a swath of properties for BPTT, including a chain rule and Schwartz theorem. Our differential operator is also able to compute the derivative of a stateful network without requiring the network to be unrolled.

Type
Special Issue: Differential Structures
Copyright
© The Author(s), 2025. Published by Cambridge University Press

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Article purchase

Temporarily unavailable

References

Abel, A. and Pientka, B. (2013). Wellfounded recursion with copatterns: A unified approach to termination and productivity. In: Morrisett, G. and Uustalu, T. (eds.), ACM SIGPLAN International Conference on Functional Programming, ICFP’13, Boston, MA, USA - September 25-27, 2013. ACM, 185–196.Google Scholar
Baez, J. C. and Master, J. (2020). Open petri nets. Mathematical Structures in Computer Science 30 (3), 314341.CrossRefGoogle Scholar
Basold, H., Bonsangue, M., Hansen, H. and Rutten, J. (2014). (Co)Algebraic Characterizations of Signal Flow Graphs. Springer International Publishing, 124–145.CrossRefGoogle Scholar
Blute, R., Cockett, J. and Seely, R. (2006). Differential categories. Mathematical Structures in Computer Science 16 (6) 10491083.CrossRefGoogle Scholar
Blute, R., Cockett, J. and Seely, R. (2009). Cartesian differential categories. Theory and Applications of Categories 22 (23) 622672.Google Scholar
Bonchi, F., Holland, J., Piedeleu, R., Sobociński, P. and Zanasi, F. (2019). Diagrammatic algebra: From linear to concurrent systems. Proceedings of the ACM on Programming Languages 3 (POPL), 128.CrossRefGoogle Scholar
Bonchi, F., Sobociński, P. and Zanasi, F. (2014). A categorical semantics of signal flow graphs. In Baldan, P. and Gorla, D. (eds.), CONCUR 2014 - Concurrency Theory - 25th International Conference, CONCUR 2014, Rome, Italy, September 2-5, 2014. Proceedings, vol. 8704. Lecture Notes in Computer Science. Springer, 435–450.CrossRefGoogle Scholar
Bonchi, F., Sobociński, P. and Zanasi, F. (2015). Full abstraction for signal flow graphs. In Rajamani, S. K. and Walker, D. (eds.), Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2015, Mumbai, India, January 15-17, 2015. ACM, 515–526.CrossRefGoogle Scholar
Carette, T., de Visme, M. and Perdrix, S. (2021). Graphical language with delayed trace: Picturing quantum computing with finite memory. In 36th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2021, Rome, Italy, June 29 - July 2, 2021. IEEE, 1–13.CrossRefGoogle Scholar
Cockett, J. and Cruttwell, G. (2014). Differential structure, tangent structure, and SDG. Applied Categorical Structures 22 (2) 331417.CrossRefGoogle Scholar
Cockett, J., Cruttwell, G. and Gallagher, J. (2011). Differential restriction categories. Theory and Applications of Categories 25 (21) 537613.Google Scholar
Cockett, R., Cruttwell, G., Gallagher, J., Pacaud Lemay, J.-S., MacAdam, B., Plotkin, G. and Pronk, D. (2020). Reverse derivative categories. In Fernández, M. and Muscholl, A. (eds.), 28th EACSL Annual Conference on Computer Science Logic (CSL 2020), vol. 152. Leibniz International Proceedings in Informatics (LIPIcs), Dagstuhl, Germany. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 18:1–18:16.Google Scholar
Crochiere, R. and Oppenheim, A. (1975). Analysis of linear digital networks. Proceedings of the IEEE 63 (4) 581595.CrossRefGoogle Scholar
Cruttwell, G. S. H. (2017). Cartesian differential categories revisited. Mathematical Structures in Computer Science 27 (1) 7091.CrossRefGoogle Scholar
Cruttwell, G. S. H., Gavranović, B., Ghani, N., Wilson, P. and Zanasi, F. (2022). Categorical foundations of gradient-based learning. In Sergey, I. (ed.), Programming Languages and Systems. Springer International Publishing, 1–28.CrossRefGoogle Scholar
Ehresmann, C. (1963). Catégories structurées. Annales scientifiques de l’École Normale Supérieure 80 (4) 349426.CrossRefGoogle Scholar
Ehrhard, T. and Regnier, L. (2003). The differential lambda-calculus. Theoretical Computer Science 309 (1–3) 141.CrossRefGoogle Scholar
Ehrhard, T. and Regnier, L. (2005). Differential interaction nets. Theoretical Computer Science 123 3574.Google Scholar
Elliott, C. (2018). The simple essence of automatic differentiation. PACMPL 2 (ICFP) 70:170:29.Google Scholar
Fong, B., Spivak, D. and Tuys, R. (2019). Backprop as functor: A compositional perspective on supervised learning. In 2019 34th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), 1–13. https://doi.org/10.1109/LICS.2019.8785665CrossRefGoogle Scholar
Gadducci, F. and Montanari, U. (2000). The tile model. In Plotkin, G. D., Stirling, C. and Tofte, M. (eds.), Proof, Language, and Interaction, Essays in Honour of Robin Milner. The MIT Press, 133–166.CrossRefGoogle Scholar
Ghica, D. and Jung, A. (2016). Categorical semantics of digital circuits. In Proceedings of the 16th Conference on Formal Methods in Computer-Aided Design, FMCAD ’16, FMCAD Inc., 4148.CrossRefGoogle Scholar
Ghica, D., Jung, A. and Lopez, A. (2017). Diagrammatic semantics for digital circuits. In Goranko, V. and Dam, M. (eds.), 26th EACSL Annual Conference on Computer Science Logic, CSL 2017, August 20-24, 2017, Stockholm, Sweden, vol. 82. LIPIcs. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 24:1–24:16.Google Scholar
Goncharov, S. and Schröder, L. (2018). Guarded traced categories. In Baier, C. and Lago, U. D. (eds.), Foundations of Software Science and Computation Structures. Springer International Publishing, 313–330.CrossRefGoogle Scholar
Goodfellow, I., Bengio, Y. and Courville, A. (2016). Deep Learning. MIT Press.Google Scholar
Hansen, H., Kupke, C. and Rutten, J. (2017). Stream differential equations: Specification formats and solution methods. Logical Methods in Computer Science 13 (1) 152.Google Scholar
Hoshino, N., Muroya, K., and Hasuo, I. (2014). Memoryful geometry of interaction: From coalgebraic components to algebraic effects. In Proceedings of the Joint Meeting of the Twenty-Third EACSL Annual Conference on Computer Science Logic (CSL) and the Twenty-Ninth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), CSL-LICS ’14, Association for Computing Machinery, 52:1–52:10.Google Scholar
Joyal, A., Street, R. and Verity, D. (1996). Traced monoidal categories. Mathematical Proceedings of the Cambridge Philosophical Society 119 447468.CrossRefGoogle Scholar
Katis, P., Sabadini, N. and Walters, R. F. C. (1997). Bicategories of processes. Journal of Pure and Applied Algebra 115 (2), 141178.CrossRefGoogle Scholar
Kissinger, A. and Uijlen, S. (2017). A categorical semantics for causal structure. In 32nd Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2017, Reykjavik, Iceland, June 20-23, 2017. IEEE Computer Society, 1–12.CrossRefGoogle Scholar
Lavore, E. D., de Felice, G. and Román, M. (2022). Monoidal streams for dataflow programming. In Baier, C. and Fisman, D. (eds.), LICS ’22: 37th Annual ACM/IEEE Symposium on Logic in Computer Science, Haifa, Israel, August 2-5, 2022. ACM, 51:1–51:14.Google Scholar
Leiserson, C. E. and Saxe, J. B. (1991). Retiming synchronous circuitry. Algorithmica 6 (1) 535.CrossRefGoogle Scholar
Mac Lane, S. (1998). Categories for the Working Mathematician, vol. 5. Graduate Texts in Mathematics, Springer Science & Business Media.Google Scholar
Milius, S. (2010). A sound and complete calculus for finite stream circuits. In Proceedings of the 25th Annual IEEE Symposium on Logic in Computer Science, LICS 2010, 11-14 July 2010, Edinburgh, United Kingdom. IEEE Computer Society, 421430.CrossRefGoogle Scholar
Møgelberg, R. E. (2014). A type theory for productive coprogramming via guarded recursion. In Proceedings of the Joint Meeting of the Twenty-Third EACSL Annual Conference on Computer Science Logic (CSL) and the Twenty-Ninth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), CSL-LICS ’14, Association for Computing Machinery, 71:1–71:10.Google Scholar
Parhi, K. K. and Chen, Y. (2013). Signal Flow Graphs and Data Flow Graphs, Springer New York, 1277–1302.CrossRefGoogle Scholar
Rutten, J. (2005). A coinductive calculus of streams. Mathematical Structures in Computer Science 15 (1) 93147.CrossRefGoogle Scholar
Rutten, J. (2008). Rational streams coalgebraically. Logical Methods in Computer Science 4 (3) 122.Google Scholar
Sprunger, D. and Katsumata, S.-y. (2019). Differentiable causal computations via delayed trace. In 2019 34th Annual ACM/IEEE Symposium on Logic in Computer Science (LICS), 1–12.CrossRefGoogle Scholar
Uustalu, T. and Vene, V. (2005). Signals and comonads. Journal of Universal Computer Science 11 (7) 13101326.Google Scholar
Werbos, P. J. (1990). Backpropagation through time: What it does and how to do it. Proceedings of the IEEE 78 (10) 15501560.CrossRefGoogle Scholar
Zanasi, F. (2015). Interacting Hopf Algebras – The Theory of Linear Systems. (Interacting Hopf Algebras - la théorie des systèmes linéaires). PhD thesis, École normale supérieure de Lyon, France.Google Scholar