Co-speech gestures accompany or replace speech in communication. Studies investigating how autistic children understand them are scarce, report inconsistent findings, and often focus on decontextualized, iconic gestures. This study compared 73 three- to twelve-year-old autistic children with 73 neurotypical peers matched on age, non-verbal IQ, and morphosyntax. Specifically, we examined (1) their ability to understand deictic (i.e., pointing), iconic (e.g., gesturing ball), and conventional (e.g., gesturing hello) speechless video-taped gestures following verbal information in a narrative and (2) the impact of linguistic (e.g., vocabulary, morphosyntax) and cognitive factors (i.e., working memory) on their performance, in order to infer the underlying mechanisms. Autistic children displayed good overall performance in gesture comprehension, although a small but significant advantage was observed for neurotypical children. Findings suggest that combining speech and gesture sequentially may be relatively spared in autism and might represent a way to alleviate the demand for simultaneous cross-modal processing.