Introduction
Suicide is a major public health issue and one of the leading causes of preventable death worldwide (World Health Organization, 2024). Pre-pandemic studies consistently showed that both physicians (Dutheil et al., 2019) and nurses (Davis et al., 2021) are at high risk for suicide compared to other employed people (Milner et al., 2013), in part related to high access to lethal means and low willingness to seek help (Harvey et al., 2021). An important risk factor for suicide is suicidal thought and behaviour (STB) (Ribeiro et al., 2016). Studies carried out during the pandemic found high levels of STBs among healthcare workers (HCWs) compared to the pre-pandemic period (Greenberg et al., 2020; Mediavilla et al., 2023; Mortier et al., 2021b; Murata et al., 2021; Sahimi et al., 2021; Xiaoming et al., 2020; Xu et al., 2021; Zhou et al., 2020).
Risk factors identified in these studies span various risk domains and include pre-pandemic lifetime factors, current mental disorders and emotional problems (e.g., burnout, traumatic stress, anxiety and depression), loneliness and social isolation, financial stress, and pandemic-specific factors, such as having been in quarantine, moral injury, and interpersonal and health-related stress (Eyles et al., 2021; García-Iglesias et al., 2022; Mortier et al., 2022, 2021a).
Identifying individuals at the highest risk for future STBs is a significant challenge in the field of mental health research, especially given the relatively low occurrence of STBs. Over the past five decades, traditional statistical approaches have been predominant in predicting STB risk (Nordin et al., 2023). Because these approaches have limited capacity to include a wide range of predictor variables, they often require the researcher to define a priori a limited set of predictors to be included in the models. This approach has been criticized because variable selection is often based on predefined theoretical frameworks that only consider some of the potentially relevant predictors for STBs (Franklin et al., 2017), resulting in relatively simple models with limited predictive accuracy (Boudreaux et al., 2021).
Advanced analytical approaches, such as machine learning (ML) models, have demonstrated higher predictive accuracy for STBs than traditional statistical approaches (e.g., linear regression, generalized linear models or analysis of variance) that include a limited number of variables based on predefined theoretical frameworks (Schafer et al., 2021). ML models handle complex interactions and high-dimensional data effectively by capturing non-linear relationships and efficiently processing and analysing large volumes of data with multiple variables, overcoming the limitations of traditional approaches (Bennett et al., 2022). They also allow a better understanding of the complex patterns and non-evident relationships among a very large set of STB-related variables, including not only commonly considered factors such as mental health and family history, but also contextual aspects such as lifestyles, access to healthcare, adverse childhood experiences and social and economic environments, among others (Favril et al., 2022). Thus, these advanced approaches can provide a more comprehensive and accurate understanding of the factors contributing to the risk of STBs. The key is to integrate traditional approaches with the empirical power of data-driven techniques (Schafer et al., 2021). An increased focus on the prediction of adverse mental health, including identification of the predictors that contribute most to increased prediction accuracy, may lead to new hypotheses about causal associations and, ultimately, a better understanding and prevention of these outcomes, including STBs (Yarkoni and Westfall, 2017).
Despite ML being increasingly used for STB risk prediction, to the best of our knowledge, there is no previous study using ML to develop a prediction model for STBs among HCWs. Such models could help with early identification and intervention for at-risk HCWs and also provide valuable insights into the complex interplay of factors contributing to suicidal ideation in this population. In addition, although there are clear gender differences in both absolute STB risk and the distribution of risk factors (Gradus et al., 2021; Jiang et al., 2021; Miranda-Mendizabal et al., 2019a, 2019b), ML-based studies often do not take these differences into consideration, leading to a lack of gender-specific STB prediction models.
The aim of the current study is to develop an ML-based prediction model for future STBs using data from the MINDCOVID project, a large prospective cohort study of Spanish HCWs (Alonso et al., 2021; MINDCOVID, 2020). The HCW cohort was recruited just after the height of the first wave of the COVID-19 pandemic in Spain and was followed up 4 months later, including a reassessment of STBs. Importantly, predictor variables to develop the prediction model were created using all information included in the baseline survey. Although the baseline survey did not exhaustively cover all factors that the literature relates to STBs among HCWs, it spanned various relevant risk factor domains for adverse mental health and STB, including depression, anxiety and post-traumatic stress disorder (PTSD). Variable selection techniques were employed to avoid manual selection of predictors. In addition, we aim to identify the predictors that contribute most to the model’s predictive accuracy, separately for men and women.
Methods
Recruitment
Data for this study come from the MINDCOVID project (Alonso et al., 2021; MINDCOVID, 2020), a multicentre, prospective, observational cohort study of Spanish HCWs. The study is based on a convenience sample of 18 healthcare institutions (hospitals, primary care and public healthcare centres) from six Autonomous Communities in Spain and included all types of HCWs (medical doctors, nurses, auxiliary nurses, other professions involved in patient care and professions not directly involved in patient care). The cohort was assessed at two time points using web-based self-report surveys. The first assessment (T1) was conducted from 5 May through 7 September 2020, i.e., just after the height of the first wave of the COVID-19 pandemic in Spain. The follow-up assessment (T2) was conducted 4 months (mean = 120.1 days [SD = 22.2]) after the T1 assessment.
Recruitment for the T1 survey was done by healthcare representatives who contacted all employed HCWs in each participating healthcare centre using administrative email distribution lists (i.e., census sampling). A total of 8,996 HCWs participated at T1, representing a mean weighted response rate across healthcare centres (weighted by achieved sample size) of 11.7% (unweighted mean response rate of 12.8%). A total of 4,809 T1 participants also participated at T2 (53.5%). For both surveys, two reminder emails were sent within 2–4 weeks after the initial invitation. For the current study, we included data from the 4,809 HCWs who participated in both the T1 and T2 assessments, as described previously (Alonso et al., 2022; Mortier et al., 2022).
Informed consent was obtained from all participants. The study complies with the principles established by national and international regulations, including the Declaration of Helsinki and the Code of Ethics. The study was approved by the Research Integrity and Good Scientific Practices Committee of IMIM‐Parc de Salut Mar, Barcelona, Spain (2020/9203/I), and by all participating centres’ institutional review boards.
Measures
Primary outcome
The study’s primary outcome was any 30-day STBs at the 4-month follow-up (T2), assessed using a modified version of four selected items from the Columbia Suicide Severity Rating Scale (Posner et al., 2011), each with dichotomous response options (yes or no). The items assess passive suicidal ideation (SI) (‘wish you were dead or would go to sleep and never wake up’), active SI (‘have thoughts of killing yourself’), suicide plans (‘think about how you might kill yourself [e.g. taking pills, shooting yourself] or work out a plan of how to kill yourself’) and suicide attempts in the past 30 days (‘make a suicide attempt [i.e. purposefully hurt yourself with at least some intent to die]’). Following previous studies, the primary outcome labelled as ‘any STB’ was created as a dichotomous variable indicating the presence of any of the four STB outcomes (Mortier et al., 2021a; Nock et al., 2014).
Baseline predictor variables
The baseline survey (T1) contains 207 items that were used to create the 219 predictor variables for STBs (items with non-ordinal categorical answers were converted into as many dummy variables as the number of categories minus one). The items were organized into eight different sections based on their contents (see Supplementary Table 1 for the list of items). Due to space constraints, we provide here a short description of predictor variables and corresponding T1 survey sections: (1) eight sociodemographic variables (e.g., age, gender and marital status); (2) 14 variables related to COVID-19 exposure, infection status and perceived risk for COVID-19 infection (e.g., having received a positive COVID-19 test and having been hospitalized for COVID-19); (3) 55 items related to mental disorders, including a checklist for pre-pandemic lifetime mental disorders and screening scale items spanning five common current mental disorders, i.e., major depressive disorder (PHQ-8; Kroenke et al., 2009), generalized anxiety disorder (Generalized Anxiety Disorder 7-item scale; Spitzer et al., 2006), 30-day panic attacks (item adapted from the Composite International Diagnostic Interview (CIDI) screening scale; Kessler et al., 2013), 30-day traumatic stress symptoms (four-item abbreviated form of the PTSD Checklist, PCL-5; Zuromski et al., 2019) and substance use disorder (four-item version of the CAGE Adapted to Include Drugs (CAGE-AID); Hinkin et al., 2001). In addition, any 30-day STB was assessed (Posner et al., 2011), as well as burnout (six-item personal burnout subscale of the Copenhagen Burnout Inventory; Kristensen et al., 2005), 30-day psychotic symptoms (items taken from the prodromal questionnaire; Loewy et al., 2011) and 30-day obsessive compulsive disorder symptoms (the three-item obsessing subscale of the obsessive compulsive inventory revised; Foa et al., 2002); (4) 30 items assessing treatment use, including healthcare service and psychotropic medication use for emotional or substance use problems, as well as barriers for treatment use; (5) 27 items assessing relevant work-related variables (e.g., type of HCWs, type of workplace, income, perceived risk for COVID-19 at work, perceived lack of healthcare centre preparedness and moral injury); (6) three items about isolation, quarantine and confinement due to COVID-19; (7) 35 items assessing 12-month serious stressful events, perceived stress (adapted PERI life events scale; Dohrenwend et al., 1978), resilience (Connor–Davidson resilience scale; Connor and Davidson, 2003) and healthy habits; and (8) 35 items assessing social support (Oslo social support scale; Husain et al., 2016), loneliness (UCLA three-item loneliness scale; Hughes et al., 2004), use of social media, family functioning (Brief Assessment of Family Functioning Scale; Mansfield et al., 2019), parental stress (items taken from the Parental Stress Scale; Berry and Jones, 1995), quality of life (five-level version of EQ-5D; Herdman et al., 2011), somatic comorbidity (self-administered comorbidity questionnaire; Sangha et al., 2003) and role impairment (Sheehan disability scales; Sheehan et al., 1996).
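The conversion of a non-ordinal categorical item into "number of categories minus one" dummy variables can be sketched as follows. This is a minimal illustration only: the function name, the variable names and the example categories are hypothetical, and the choice of the first category as the reference level is an assumption rather than something stated in the text.

```python
def dummy_encode(values, categories):
    """Encode a non-ordinal item as len(categories) - 1 indicator columns.

    The first category is treated as the reference level (an assumption)
    and therefore gets no column of its own.
    """
    others = categories[1:]
    columns = {f"is_{cat}": [] for cat in others}
    for value in values:
        if value not in categories:
            raise ValueError(f"unknown category: {value!r}")
        for cat in others:
            columns[f"is_{cat}"].append(1 if value == cat else 0)
    return columns


# Hypothetical item with three unordered answer categories.
marital_status = ["single", "married", "other", "married"]
encoded = dummy_encode(marital_status, ["single", "married", "other"])
```

A three-category item therefore yields two predictor variables, which is why the number of predictor variables (219) exceeds the number of survey items (207).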
Statistical analysis
The percentage of missing values across all variables analysed was moderate, with a mean missing rate of 6.5% and a median value of less than 1% (see Supplementary Table 2 for the percentage of missing values for each variable). Multiple imputation by chained equations with 10 iterations per imputation and 12 imputed datasets was used to impute missing item-level data (Van Buuren, 2018) using R’s mice package (Buuren and Groothuis-Oudshoorn, 2011). The choice of 12 imputations provided a reasonable trade-off between statistical accuracy and computational efficiency. It follows recommendations that 5–20 imputations are generally sufficient under moderate missingness (Van Buuren, 2018; White et al., 2011) and that the number of imputations should be at least equal to 100 times the fraction of missing information (White et al., 2011). In our study, this fraction was below 0.1 for key performance measures such as AUC, which implies that 10 imputations would already meet the criterion.
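The rule of thumb relating the number of imputations to the fraction of missing information (FMI) reduces to simple arithmetic. The helper below is a hypothetical sketch of that rule, not part of the mice package; the rounding step is an added safeguard against floating-point noise.

```python
import math


def min_imputations(fraction_missing_information):
    """Smallest whole number of imputations m with m >= 100 * FMI."""
    if not 0 <= fraction_missing_information <= 1:
        raise ValueError("FMI must lie between 0 and 1")
    # Round before taking the ceiling so that values such as
    # 100 * 0.07 = 7.000000000000001 are not pushed up to 8.
    return math.ceil(round(100 * fraction_missing_information, 9))


# An FMI of 0.1 is the upper bound reported in the text.
required = min_imputations(0.1)
```

With an FMI below 0.1, the 12 imputed datasets used here exceed the minimum suggested by this rule.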
The regularization path of a linear Support Vector Classifier (SVC) with L1 penalty (Dai and Zhao, 2020) was implemented to select the most critical predictor variables out of the 219 candidates by forcing some coefficients to be exactly zero, aiming to improve the accuracy and efficiency of predictive models (Montesinos López and Crossa, 2022). SVC-based variable selection was applied to each of the 12 imputed datasets. Variables that were selected in at least 7 of the 12 imputed datasets were included in the final prediction model. This decision follows Zhao and Long (2017), who propose performing variable selection separately in each imputed dataset and then including variables that are selected with a frequency above a defined threshold. In this study, the threshold chosen was 7 out of 12 imputations to ensure consistency across more than 50% of the imputations. According to Wood et al. (2008), this strategy improves the robustness of variable selection and coefficient estimation in regression models.
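The frequency-threshold rule can be illustrated with a short sketch. It assumes that the per-dataset selection step (the L1-penalized linear SVC) has already been run and has returned one set of variable names per imputed dataset; the function and variable names are hypothetical.

```python
from collections import Counter


def stable_variables(selected_per_imputation, min_count):
    """Keep variables selected in at least `min_count` imputed datasets."""
    counts = Counter()
    for selected in selected_per_imputation:
        counts.update(set(selected))  # count each variable once per dataset
    return sorted(name for name, n in counts.items() if n >= min_count)


# Hypothetical selections across 12 imputed datasets: "si_days" is chosen
# in all 12, "loneliness" in 7 and "age" in only 6.
selections = [
    {"si_days"}
    | ({"loneliness"} if i < 7 else set())
    | ({"age"} if i < 6 else set())
    for i in range(12)
]
kept = stable_variables(selections, min_count=7)
```

In this toy example, a variable selected in exactly half of the datasets is dropped, which reflects the "more than 50%" criterion.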
To address imbalanced data (i.e., 7.9% prevalence of STBs at the 4-month follow-up), which can lead to poor minority class classification (Rezvani and Wang, 2023), two different techniques were compared: (1) Synthetic Minority Oversampling Technique (SMOTE) (Chawla et al., 2002) and (2) majority class undersampling (Devi et al., 2021).
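As a rough illustration of the second technique, the sketch below randomly discards majority-class observations until both classes are equally frequent. The 1:1 target ratio, the function name and the toy data are assumptions made for the example; the text does not specify the ratio or the implementation used.

```python
import random


def undersample_majority(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes are the same size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep every minority row plus an equally sized random majority subset.
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in kept], [labels[i] for i in kept]


# Toy data with 8% positives, close to the prevalence reported above.
toy_labels = [1] * 8 + [0] * 92
toy_rows = list(range(100))
balanced_rows, balanced_labels = undersample_majority(toy_rows, toy_labels)
```

Undersampling discards information from the majority class, whereas SMOTE synthesizes new minority-class observations instead.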
A random forest (RF) classifier was used to develop prediction models for STBs at the 4-month follow-up (Hammelrath et al., 2023; Navarro et al., 2021). Different values of the predefined hyperparameters specifying the number of decision trees to be included in the RF (n_estimators: [20, 25, 50, 75, 100]) and the maximum depth allowed for each decision tree (max_depth: [7, 8, 9, 10, 11]) were tested using a grid combination and 5-fold cross-validation. The grid combination was tested on a single stacked dataset of all 12 imputed datasets for both balancing methods (Seki et al., 2021). The model with the selected hyperparameters was then independently trained and tested for each of the 12 imputed datasets. To decrease the risk of overfitting, 5-fold cross-validation was used in each imputed dataset. We aggregated the predicted probabilities from each of the imputations into a single dataset to obtain performance metrics. The RandomForestClassifier class of the scikit-learn (sklearn) library in Python version 3.8 was used (Pedregosa et al., 2012).
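The hyperparameter search described above can be sketched with scikit-learn as follows. The grid values are taken from the text; everything else is an assumption for illustration, including the synthetic data, the use of GridSearchCV, the stratified folds, the random seeds and AUROC as the selection criterion.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic, imbalanced stand-in for the stacked imputed datasets.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=4,
    weights=[0.9, 0.1], random_state=0,
)

# Grid reported in the text.
param_grid = {
    "n_estimators": [20, 25, 50, 75, 100],
    "max_depth": [7, 8, 9, 10, 11],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="roc_auc",  # assumed selection metric
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
)
search.fit(X, y)
best = search.best_params_
```

The selected combination in `best` would then be reused to train and test the model separately within each imputed dataset.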
Model performance was assessed on the test dataset through the area under the Receiver Operating Characteristic (AUROC) curve and the area under the precision (positive predictive value [PPV])–recall (sensitivity) curve. The precision–recall curve is particularly useful for imbalanced datasets (Saito and Rehmsmeier, 2015). These curves allowed us to evaluate recall, specificity and precision for different cut-off points. The model was applied to each of the imputed datasets and predictions from each dataset were aggregated to obtain the overall metrics values (Seki et al., 2021). All metrics were also obtained separately for men and women.
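To make these metrics concrete, the sketch below computes the AUROC and the recall, specificity and precision at one cut-off from pooled predictions. It is a plain-Python illustration under stated assumptions: the function names and the toy predictions are hypothetical, and the AUROC is computed through its rank interpretation rather than with the library routines used in the study.

```python
def auroc(y_true, scores):
    """Probability that a random positive scores above a random negative
    (ties count as one half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def metrics_at_cutoff(y_true, scores, cutoff):
    """Recall, specificity and precision when scores >= cutoff are flagged."""
    pairs = list(zip(y_true, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= cutoff)
    fn = sum(1 for y, s in pairs if y == 1 and s < cutoff)
    fp = sum(1 for y, s in pairs if y == 0 and s >= cutoff)
    tn = sum(1 for y, s in pairs if y == 0 and s < cutoff)
    return {
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp) if tp + fp else float("nan"),
    }


# Hypothetical predictions pooled across imputed datasets.
y_pooled = [1, 0, 0, 1, 0, 0]
p_pooled = [0.8, 0.3, 0.6, 0.4, 0.1, 0.2]
area = auroc(y_pooled, p_pooled)
at_half = metrics_at_cutoff(y_pooled, p_pooled, cutoff=0.5)
```

Moving the cut-off along the score range traces out the ROC and precision–recall curves point by point.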
To quantify variable importance, the Shapley Additive Explanations (SHAP) (Lundberg and Lee, 2017) method was used. SHAP values represent the contribution that each variable had in the final model prediction. As SHAP values can display variability across imputations, they were obtained separately for each of the 12 imputed datasets and a combined representation of the contributions of each variable was obtained as the mean of these values (Seki et al., 2021). Although the model is the same for both genders, the SHAP values were obtained separately for the subsamples of men and women. SHAP summary plots are provided. In these plots, variables are ordered according to their influence on the predictions of the model. Each dot represents an individual’s SHAP value, plotted along the horizontal axis. The dots are coloured based on the variable’s value, ranging from low (blue) to high (pink). If pink dots appear on the right side and blue dots on the left, it indicates that the risk increases as the value of the variable rises. Analyses were conducted with the SHAP library in Python version 3.8.
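The pooling of SHAP values across imputations can be sketched as an element-wise mean of the per-imputation SHAP matrices. The example is a simplified illustration: the function names, variable names and numbers are hypothetical, and ranking variables by mean absolute SHAP value is assumed here as the ordering criterion.

```python
def mean_shap_across_imputations(shap_by_imputation):
    """Element-wise mean of SHAP matrices (rows = individuals,
    columns = variables), one matrix per imputed dataset."""
    m = len(shap_by_imputation)
    n_rows = len(shap_by_imputation[0])
    n_cols = len(shap_by_imputation[0][0])
    return [
        [sum(mat[i][j] for mat in shap_by_imputation) / m for j in range(n_cols)]
        for i in range(n_rows)
    ]


def rank_variables(mean_shap, names):
    """Order variables by mean absolute SHAP value (assumed criterion)."""
    importance = {
        name: sum(abs(row[j]) for row in mean_shap) / len(mean_shap)
        for j, name in enumerate(names)
    }
    return sorted(importance, key=importance.get, reverse=True)


# Two hypothetical imputations, two individuals, two variables.
imp1 = [[0.25, -0.125], [-0.5, 0.0]]
imp2 = [[0.75, 0.125], [-0.25, 0.25]]
pooled = mean_shap_across_imputations([imp1, imp2])
ranking = rank_variables(pooled, ["si_days", "financial_stress"])
```

Restricting the rows to men or to women before ranking gives the gender-specific importance orderings.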
Results
Sample characteristics
Table 1 shows the descriptive characteristics of the study sample by gender at T1. Ages ranged from 18 to 71 years, with a mean age of 45.8 (SD = 11.0). The country of birth was Spain for 95.3% of the sample. Healthcare professionals were mostly women (81.1%). About one-third (34.3%) were physicians, 29.2% nurses, 8.1% auxiliary nurses, 11.5% other professions involved in patient care and 16.9% other professions not involved in patient care. A total of 19.5% had a positive test or medical diagnosis of COVID-19 (women = 19.3%, men = 20.2%). A total of 381 subjects (7.9%; women = 7.8%, men = 8.2%) reported having had 30-day STBs at T2.
Table 1. Sociodemographic and work characteristics of Spanish healthcare workers during the COVID-19 pandemic assessed at T1 (N = 4,809)

SE: standard error; SD: standard deviation.
Of the 7.9% of subjects with any STB at the 4-month follow-up, 0.9% reported a suicide attempt, 26.3% a suicide plan without attempt, 10.7% active ideation without plan or attempt and 62.1% passive ideation without active ideation, plan or attempt.
Variable selection
Out of the initial 219 candidate predictor variables, 34 variables were selected by the linear SVC (Supplementary Figure 1). Although the gender variable was not selected, it was included because there are significant differences in STB risk factors between men and women (Miranda-Mendizabal et al., 2019b; Schrijvers et al., 2012), and to be able to assess relevant risk factors within each gender. This leads to a total of 35 variables being included in the model. The inclusion of gender in the RF model is also justified by the ability of the RF model to account for complex interactions between variables (Auret and Aldrich, 2012).
Random forest
The selected hyperparameters specified 50 decision trees to be included in the RF (n_estimators = 50) and a maximum depth of 9 allowed for each decision tree (max_depth = 9).
Figure 1 presents AUROC and precision–recall curves resulting from the RF fitted on the test sample using the 35 baseline selected variables. The AUROC with and without balancing techniques is higher than 0.80. In the total sample, the best result was obtained with the model without data balancing (AUROC = 0.87).

Figure 1. (a) The receiver operating characteristics curve and the area under the receiver operating characteristic curve (AUROC) for suicidal thoughts and behaviour prediction. Results are shown for the different balancing techniques (left). (b) The precision–recall curve and the area under the precision–recall curve of the models. Results are shown for the different balancing techniques (right).
Regarding the area under the precision–recall curve, large differences are observed between balancing methods; the model without data balancing obtained the best result (area under the precision–recall curve = 0.52).
Figure 2 shows that when the goodness of fit of the model without data balancing is assessed separately by gender, the good metric properties are maintained. The AUROC curve is 0.84 and 0.86 for men and women, respectively. In the case of the area under the precision–recall curve, the values obtained are 0.45 for men and 0.54 for women.

Figure 2. (a) The receiver operating characteristics curve and the area under the receiver operating characteristic (AUROC) curve of the models for men and women. (b) The precision-recall curve and the area under the precision-recall curve of the models for men and women.
SHAP values
In the summary plot of the SHAP values (Figures 3 and 4), the variables are sorted according to their importance in the prediction model. The colour of each point on the graph represents the value of the corresponding variable: pink indicates high values and blue indicates low values. The horizontal axis (x-axis) represents the SHAP value: values above 0 indicate that the variable increases the predicted risk of future suicidal ideation for that observation. Figure 3 shows the most important variables in the prediction of STBs. The ranking is headed by the number of days in the past 30 days with suicidal ideation (passive or active), followed by passive suicidal ideation, the number of days in the past 30 days with binge eating episodes and having intrusive thoughts (i.e., nasty thoughts and having difficulty in getting rid of them). Another relevant factor identified is concentration problems. Other factors identified include those related to COVID-19 infection (having been in isolation or quarantine, and fear of personal or loved ones’ infection) and work-related factors and experiences during the initial pandemic outbreak, such as perceived lack of supervision at work, not getting along with co-workers, stress related to having to prioritize care among patients and work-related role impairment. Financial stress also appears as a risk factor for STBs.
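To make the meaning of a SHAP value concrete, the sketch below computes exact Shapley values by brute force for a three-variable toy model. It illustrates the underlying definition only: it is not the algorithm used by SHAP software for tree ensembles, and the toy model, the instance and the all-zero baseline are assumptions introduced for the example.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating every coalition of the other
    features (feasible only when there are few features)."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley average.
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi

# Hypothetical toy model with an interaction between the last two variables.
def model(x):
    return 2.0 * x[0] + x[1] * x[2]

INSTANCE = (1.0, 1.0, 1.0)   # observation being explained
BASELINE = (0.0, 0.0, 0.0)   # reference values for "absent" variables

def coalition_value(coalition):
    # Variables in the coalition take the instance value, the rest the baseline.
    x = [INSTANCE[j] if j in coalition else BASELINE[j] for j in range(3)]
    return model(x)

phi = shapley_values(coalition_value, 3)
print([round(p, 3) for p in phi])
```

The three attributions (2.0, 0.5 and 0.5) sum to the difference between the prediction for the instance and the prediction for the baseline, and the interaction term is split equally between the two variables involved. A positive value therefore means that the variable moved this prediction above the reference level.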

Figure 3. Shapley additive explanation (SHAP) summary graph. Each point on the graph is a SHAP value for one variable. The color represents the value of the variable from low (blue) to high (pink).

Figure 4. Shapley additive explanation (SHAP) summary graph for men (a) and women (b).
Among men (Figure 4a) the most important variables for predicting STBs at T2 included the number of days in the past 30 days with suicidal ideation (passive or active), passive suicidal ideation and frequency of intrusive thoughts at the T1 assessment.
Among women (Figure 4b), the most important baseline variables for predicting STBs (at T2) included the number of days in the past 30 days with suicidal ideation (passive or active), passive suicidal ideation and number of panic attacks.
Discussion
In this study, we developed and validated a predictive model for STBs within a 4-month period using survey data collected from HCWs during the COVID-19 pandemic in Spain. The ML-based model showed robust predictive performance for STBs and identified, out of a total of 219 variables, the 35 key predictive variables associated with STBs. Our model showed very good metric characteristics with an AUROC of 0.86 (0.86 in women and 0.84 in men) and an area under the precision–recall curve of 0.52 (0.54 and 0.45 in women and men, respectively). The results align with previous studies, as shown in the systematic review by Somé et al. (Reference Somé, Noormohammadpour and Lange2024), which found a mean AUROC of 0.81 in 84 studies, a mean recall of 0.68 in 64 studies and a mean precision of 0.41 in 46 studies. Our results improve on the precision metric (Nock et al., Reference Nock, Millner, Ross, Kennedy, Al-Suwaidi, Barak-Corren, Castro, Castro-Ramirez, Lauricella, Murman, Petukhova, Bird, Reis, Smoller and Kessler2022), which is challenging due to the low prevalence of STBs. ML models developed to predict STBs in previous studies have been criticized for having low precision (often below 1%) and thus producing too many false positives (Nock et al., Reference Nock, Millner, Ross, Kennedy, Al-Suwaidi, Barak-Corren, Castro, Castro-Ramirez, Lauricella, Murman, Petukhova, Bird, Reis, Smoller and Kessler2022). Our model achieved a precision of 50% with a recall of 60%. At a recall of 80%, the precision remains greater than 20%. Because the model’s cut-off point is not predetermined, it can be selected according to the required recall and precision, depending on the objective or application of the predictive model. As the data were unbalanced (92.1% of the subjects in one category), balancing techniques were tested, but these techniques did not improve the results of the models.
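The selection of a cut-off point for a required recall can be sketched as follows, assuming scikit-learn's `precision_recall_curve`. The helper `cutoff_for_recall` and the toy data are hypothetical and are not part of the study's analysis.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

def cutoff_for_recall(y_true, y_score, target_recall):
    """Return (threshold, precision, recall) for the highest cut-off whose
    recall is at least target_recall."""
    precision, recall, thresholds = precision_recall_curve(y_true, y_score)
    # precision and recall carry one extra trailing point (recall = 0)
    # that has no associated threshold, so it is excluded here.
    eligible = np.where(recall[:-1] >= target_recall)[0]
    if eligible.size == 0:
        raise ValueError("No cut-off reaches the requested recall.")
    best = eligible[-1]  # thresholds are sorted ascending, so take the last one
    return thresholds[best], precision[best], recall[best]

# Toy labels and predicted risks; illustrative values only.
y_true = [0, 0, 1, 1]
y_score = [0.10, 0.40, 0.35, 0.80]
for target in (0.5, 1.0):
    thr, prec, rec = cutoff_for_recall(y_true, y_score, target)
    print(f"target recall {target:.1f}: cut-off {thr:.2f}, "
          f"precision {prec:.2f}, recall {rec:.2f}")
```

Choosing the highest eligible cut-off keeps as few false positives as the recall requirement permits. A screening application might set a high target recall and accept lower precision, whereas an application that triggers a costly intervention might do the opposite.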
STB at T1 was identified as a strong predictor at T2, consistent with previous literature (Ribeiro et al., Reference Ribeiro, Franklin, Fox, Bentley, Kleiman, Chang and Nock2016). While the association between mental disorders and suicidal ideation is well established (Franklin et al., Reference Franklin, Ribeiro, Fox, Bentley, Kleiman, Huang, Musacchio, Jaroszewski, Chang and Nock2017), a key contribution of our study is the identification of specific mental disorder symptoms as independent predictors of STBs among HCWs active during the COVID-19 pandemic, including binge eating, panic attacks, intrusive thoughts and concentration problems. These results align with studies in non-HCW populations linking STBs to eating disorders (Brown et al., Reference Brown, LaRose and Mezuk2018; Sohn et al., Reference Sohn, Dimitropoulos, Ramirez, McPherson, Anderson, Munir, Patten, McGirr and Devoe2023), panic disorder (Zhang et al., Reference Zhang, Wang, Xiong, Jian, Zhang, Xiang, Zhou and Zou2022), obsessive-compulsive disorder (Pellegrini et al., Reference Pellegrini, Maietti, Rucci, Burato, Menchetti, Berardi, Maina, Fineberg and Albert2021) and concentration difficulties (Lo et al., Reference Lo, Chan, Yip, Chui, Fung, Wong, Chu, So, Chan, Chung, Lee, Cheng, Law, Chan and Chang2023). These findings highlight the critical need for early identification and screening of mental health symptoms in HCWs, as well as the challenge of ensuring access to timely, evidence-based mental health care (Jain et al., Reference Jain, Sarfraz, Karlapati, Kazmi, Nasir, Atiq, Ansari, Shah, Aamir, Zaidi, Shakil Zubair and Jyotsana2024). 
Notably, our study also found that practical barriers to seeking treatment and disruptions in psychiatric or psychological care due to the COVID-19 pandemic were significant predictors of future STBs, a particularly concerning issue given the low treatment utilization among HCWs (Braquehais et al., Reference Braquehais, Gómez-Duran, Nieva, Valero, Ramos-Quiroga and Bruguera2022; Dellazizzo et al., Reference Dellazizzo, Léveillé, Landry and Dumais2021; Mortier et al., Reference Mortier, Vilagut, García-Mieres, Alayo, Ferrer, Amigo, Aragonès, Aragón-Peña, Asúnsolo Del Barco, Campos, Espuga, González-Pinto, Haro, López Fresneña, Martínez de Salázar, Molina, Ortí-Lucas, Parellada, Pelayo-Terán, Pérez-Gómez, Pérez-Zapata, Pijoan, Plana, Polentinos-Castro, Portillo-Van Diest, Puig, Rius, Sanz, Serra, Urreta-Barallobre, Kessler, Bruffaerts, Vieta, Pérez-Solá and Alonso2024; Rogoža et al., Reference Rogoža, Strumila, Klivickaitė, Diržius and Čėnaitė2021).
Consistent with prior research (Du et al., Reference Du, Jia, Hu, Ge, Cheng, Qu and Chen2023; Kavukcu and Akdeniz, Reference Kavukcu and Akdeniz2021), our study found that COVID-19-related experiences, including isolation or quarantine and fear of personal or familial infection, predicted STBs at follow-up, likely due to their traumatic and stressful nature (Portillo-Van Diest et al., Reference Portillo-Van Diest, Vilagut, Alayo, Ferrer, Amigo, Amann, Aragón-Peña, Aragonès, Asúnsolo Del Barco, Campos, Del Cura-González, Espuga, González-Pinto, Haro, Larrauri, López-Fresneña, Martínez de Salázar, Molina, Ortí-Lucas, Parellada, Pelayo-Terán, Pérez-Zapata, Pijoan, Plana, Puig, Rius, Rodríguez-Blázquez, Sanz, Serra, Urreta-Barallobre, Kessler, Bruffaerts, Vieta, Pérez-Solá, Alonso and Mortier2023). Additionally, work-related disruptions during the initial pandemic outbreak, such as perceived lack of supervision, interpersonal conflicts with co-workers, stress from prioritizing patient care and role impairment, emerged as significant predictors of later suicidal ideation. These findings underscore the need for systemic workplace reforms, including improved healthcare centre preparedness for viral outbreaks through enhanced equipment, staffing, training and protocols. Moreover, fostering supportive work environments, encouraging the reporting of interpersonal conflicts (Alshammari and Dayrit, Reference Alshammari and Dayrit2017) and implementing effective communication and conflict resolution strategies (Jerng et al., Reference Jerng, Huang, Liang, Chen, Lin, Huang, Hsieh and Sun2017) are essential. 
Future research should focus on delineating causal pathways underlying STBs among HCWs to inform targeted prevention efforts, addressing the critical gap in evidence-based interventions for mental health issues in this population at both individual and organizational levels (Petrie et al., Reference Petrie, Crawford, Baker, Dean, Robinson, Veness, Randall, McGorry, Christensen and Harvey2019).
Gender was not selected as a relevant predictor when the Support Vector Machine (SVM) model was applied. This result was unexpected as gender has been shown in the mental health and STB literature to be a key variable in identifying significant risk factors. The ability of the RF model to capture complex interactions, identify non-linear dependencies and consider multivariate relationships between variables (Auret and Aldrich, Reference Auret and Aldrich2012), together with the recognized clinical and theoretical relevance of gender in mental disorders and STB – given that risk factors differ significantly between men and women (Miranda‐Mendizabal et al., Reference Miranda‐Mendizabal, Castellví, Alayo, Vilagut, Blasco, Torrent, Ballester, Almenara, Lagares, Roca, Sesé, Piqueras, Soto‐Sanz, Rodríguez‐Marín, Echeburúa, Gabilondo, Cebrià, Bruffaerts, Auerbach, Mortier, Kessler and Alonso2019; Schrijvers et al., Reference Schrijvers, Bollen and Sabbe2012) – justifies the inclusion of the gender variable in the model.
Considering the evident gender differences in both the absolute risk of STBs and the prevalence of associated risk factors (Gradus et al., Reference Gradus, Rosellini, Horváth-Puhó, Jiang, Street, Galatzer-Levy, Lash and Sørensen2021; Jiang et al., Reference Jiang, Rosellini, Horváth-Puhó, Shiner, Street, Lash, Sørensen and Gradus2021; Miranda‐Mendizabal et al., Reference Miranda‐Mendizabal, Castellví, Alayo, Vilagut, Blasco, Torrent, Ballester, Almenara, Lagares, Roca, Sesé, Piqueras, Soto‐Sanz, Rodríguez‐Marín, Echeburúa, Gabilondo, Cebrià, Bruffaerts, Auerbach, Mortier, Kessler and Alonso2019a, Reference Miranda-Mendizabal, Castellví, Parés-Badell, Alayo, Almenara, Alonso, Blasco, Cebrià, Gabilondo, Gili, Lagares, Piqueras, Rodríguez-Jiménez, Rodríguez-Marín, Roca, Soto-Sanz, Vilagut and Alonso2019b), the accuracy of the model was assessed separately for men and women, and the model proved to be a good fit for both genders. For both genders, the most important factors were the number of suicidal thoughts (passive or active) and passive suicidal thoughts in the last 30 days. For men, the second most important factor was the frequency of unpleasant thoughts and the difficulty in getting rid of them, while for women it was the number of panic attacks. This is an important finding suggesting that there are gender differences in the relative importance of risk factors for STBs. Previous research has shown significant interactions between gender and certain risk factors for STBs (Miranda‐Mendizabal et al., Reference Miranda‐Mendizabal, Castellví, Alayo, Vilagut, Blasco, Torrent, Ballester, Almenara, Lagares, Roca, Sesé, Piqueras, Soto‐Sanz, Rodríguez‐Marín, Echeburúa, Gabilondo, Cebrià, Bruffaerts, Auerbach, Mortier, Kessler and Alonso2019a). 
These gender differences in risk factors have been linked to variations in the prevalence of internalizing and externalizing disorders between genders, as well as differences in coping strategies, including the frequency of help-seeking. These differences may be attributed to gender socialization.
Strengths and limitations
This study has some limitations. First, due to low numbers of suicide plans and suicide attempts at the 4-month follow-up, we operationalized the study outcome as any STB (i.e., having passive or active suicidal ideation with or without plan or attempt), as all four separate outcomes indicate the presence of at least passive suicidal ideation. This is in line with previous work by our group (Mortier et al., Reference Mortier, Vilagut, Ferrer, Alayo, Bruffaerts, Cristóbal-Narváez, Del Cura-González, Domènech-Abella, Félez-Nobrega, Olaya, Pijoan, Vieta, Pérez-Solà, Kessler, Haro, Alonso, Alonso, Álvarez-Villalba, Amann, Amigo, Anmella, Aragón, Aragonès, Aragonès, Arizón, Asunsolo, Ayora, Ballester, Barbas, Basora, Bereciartua, Bravo, Bolíbar, Bonfill, Cotillas-Rodero, Cuartero, De Paz, Del Yerro, De Vocht, Díaz, Domingo, Emparanza, Espallargues, Espuga, Estevan-Burdeus, Fernández, Fernández, Ferreres, Fico, Forjaz, García-Barranco, García-Ribera, García-Torrecillas, Garrido-Barral, Gil, Giola-Insigna, Gómez, Gómez, González-Pinto, Hernando, Iriberri, Jansen, Jiménez, Jiménez, Larrauri, León-Vázquez, López-Atanes, López-Fresneña, López-Rodríguez, López-Rodríguez, López-Cortacans, Marcos, Martín, Martín, Martínez-Cortés, Martínez-Martínez, De Salázar, Martínez, Marzola, Mata, Molina, Molina, Molinero, Muñoz-Ruipérez, Murru, Navarro, Olmedo-Galindo, Ortí-Lucas, Padrós, Pallejà, Parra, Pascual, Pelayo-Terán, Pla, Plana, Pérez-Aznar, Pérez-Gómez, Pérez-Zapata, Polentinos-Castro, Puértolas, Puig, Quílez, Quintana, Quiroga, Rentero, Rey, Rius, Rodríguez-Blázquez, Rojas-Giraldo, Romero-Barzola, Rubio, Ruiz, Rumayor, Sáenz, Sánchez, Sánchez-Arcilla, Sanz, Serra, Serra-Sutton, Serrano, Solà, Solera, Soto, Tarragó, Tolosa, Vázquez, Viciola, Voorspoels, Yago-González, Yáñez-Sánchez, Zapico, Zorita, Zorrilla and Zurbano2021a) and others (Benjet et al., Reference Benjet, Borges, Miah, Albor, Gutiérrez‐García, Zavala Berbena, Guzmán, Vargas‐Contreras, Hermosillo de la Torre, Hernández Uribe, Quevedo, Covarrubias Díaz, Martínez Ruiz, Valdés‐García, Martínez Jerez and Mortier2022; Nock et al., Reference Nock, Millner, Ross, Kennedy, Al-Suwaidi, Barak-Corren, Castro, Castro-Ramirez, Lauricella, Murman, Petukhova, Bird, Reis, Smoller and Kessler2022). Second, a convenience sample was used and results need to be validated using external samples. This limitation is partially addressed by obtaining a random and heterogeneous census sample of HCWs from 18 healthcare institutions in six Autonomous Communities. Third, STB is complex and difficult to predict; therefore, psychosocial and environmental factors cannot be easily excluded (Ati et al., Reference Ati, Paraswati and Windarwati2021). However, the number of predictor variables was very large: all 219 variables collected in the survey were used. Fourth, although an SVM model with L1 regularization was used for a broad and objective selection of variables, the analysis is based on a predefined survey with a specific number of items. This implies that, although the model allows for a greater inclusion of predictor variables, there is still the limitation of not including all possible variables relevant in the context of STBs. Fifth, we have exclusively used the RF algorithm to predict STBs. Although the predictive capacity of this algorithm was effective in our study, other models may also provide meaningful and complementary results that could improve the accuracy of our results.
Conclusion
In this study of Spanish HCWs during the COVID-19 pandemic, we have developed a predictive model of the risk of STBs. Our results show that the RF ML algorithm has high predictive performance for STBs (AUROC = 0.86; 0.86 in women and 0.84 in men). Importantly, our study improves the precision compared to previous research. The results generated by the proposed model help to identify and explain risk factors for STBs and contribute to the development of a first comprehensive conceptual framework for understanding STB occurrence in major epidemics and other disasters with high impact on essential workers. The most important predictors contributing to the prediction of suicidal ideation in healthcare professionals were ideation frequency in the last 30 days, passive suicidal ideation and the number of days with binge eating episodes in the last 30 days.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/S2045796025000198.
Availability of data and materials
The de-identified participant data as well as the study protocol, statistical analysis plan and data dictionaries used for this study are available from publication, upon reasonable request to the corresponding author (P.M.; [email protected]), as long as the main objective of the data-sharing request is replicating the analysis and findings as reported in this paper.
Acknowledgements
The authors would like to sincerely thank all HCWs participating in the study. They also thank Puri Barbas for the management of the project.
Author Contributions
I.A., G.V., P.M. and O.P. reviewed the literature. I.A., J.A., G.V., P.M., O.P., M.F., E.A., V.P.S., J.M.H. and R.B. conceived and designed the study. E.A., J.D.M., N.L.F., T.P., J.M.P.-T., J.I.P., M.E., M.N.P., A.G.-P., C.R., E.A., N.N.A., M.C., A.P.-Z., E.V., C.S. and V.P.-S. acquired the data. G.V., I.A., F.A. and P.M. cleaned and analysed the data. I.A., G.V., O.P. and P.M. drafted the initial version of the manuscript. All authors reviewed the initial draft, made a critical contribution to the interpretation of the data and approved the manuscript. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.
Financial support
This work was supported by grants from the Instituto de Salud Carlos III (ISCIII)/Ministerio de Ciencia e Innovación/FEDER COV20/00711 (J.A.); ISCIII-FEDER (J.A., grant number PI17/00521); Miguel Servet grant (P.M., CP21/00078) co-financed by the ISCIII and co-funded by the European Union; ISCIII Sara Borrell CD18/00049 (P.M.); ISCIII co-financed by the European Union through the European Social Fund Plus PFIS grant FI23/00004 (A.P.-V.D.); the Secretaria d’Universitats i Recerca del Departament d’Economia i Coneixement of the Generalitat de Catalunya AGAUR 2021 SGR 00624 (J.A.); PERIS, Departament de Salut SLT017/20/000009 (I.A.); and CIBER of Epidemiology & Public Health, ISCIII CB06/02/0046. Additional partial funding was received from the Gerencia Regional de Salud de Castilla y León (SACYL) GRS COVID 32/A/20 (J.M.P.-T.).
Competing interests
E.A. reports personal fees from Lundbeck, Esteve and Boehringer-Ingelheim, outside the submitted work. E.V. has received grants and served as a consultant, advisor or Continuing Medical Education (CME) speaker for the following entities: AB-Biotics, AbbVie, Adamed, Alcediag, Angelini, Biogen, Beckley-Psytech, Biohaven, Boehringer-Ingelheim, Celon Pharma, Compass, Dainippon Sumitomo Pharma, Ethypharm, Ferrer, Gedeon Richter, GH Research, Glaxo-Smith Kline, HMNC, Idorsia, Johnson & Johnson, Lundbeck, Luye Pharma, Medincell, Merck, Newron, Novartis, Orion Corporation, Organon, Otsuka, Roche, Rovi, Sage, Sanofi-Aventis, Sunovion, Takeda, Teva and Viatris, outside the submitted work. A.G.-P. has received grants and served as a consultant, advisor or CME speaker for the following entities: Janssen-Cilag, Lundbeck, Otsuka, Alter, Angelini, Novartis, Rovi, Takeda, the Spanish Ministry of Science and Innovation (CIBERSAM), the Ministry of Science (Carlos III Institute), the Basque Government and the European Framework Program of Research. J.M.H. has served as a consultant, advisor or CME speaker for the following entities: Boehringer-Ingelheim, Eli Lilly and Co. and Lundbeck, outside the submitted work. J.M.P.-T. has served as a consultant, advisor or CME speaker for the following entities: Angelini, Boehringer-Ingelheim, Eli Lilly and Co., Johnson & Johnson, Lundbeck, Otsuka and Rovi, outside the submitted work. All other authors reported no conflict of interest. R.B. reports fees from Janssen-Cilag, outside the submitted work.
Ethical standards
The authors assert that all procedures contributing to this study comply with the ethical standards of the relevant national and institutional committees on human experimentation and with the Helsinki Declaration of 1975, as revised in 2000.