This methodological synthesis surveys study and instrument quality in L2 pronunciation research by scrutinizing methodological practices in designing and employing scales and rubrics that measure accentedness, comprehensibility, and intelligibility. A comprehensive coding scheme was developed, and searches were conducted in several databases. A total of 380 articles (409 samples) that employed 576 target instruments and appeared in peer-reviewed journals from 1977 to 2023 were synthesized. Results demonstrated, among other findings, strengths in reporting several listener and speaker characteristics. Areas in need of improvement include (a) more thorough evaluation and reporting of interrater reliability and instrument validity and (b) greater adherence to methodological transparency and open science practices. We conclude by discussing the implications of these findings for researchers and researcher trainers; by raising awareness of methodological and ethical challenges in psychometric research on L2 speech perception; and by providing recommendations for advancing the quality of instruments in this domain.
One of the most significant challenges in nutritional epidemiology research is achieving dietary data of sufficient accuracy and validity to establish an adequate link between dietary exposure and health outcomes. Recently, the emergence of artificial intelligence (AI) in various fields has begun to fill this gap with advanced statistical models and techniques for nutrient and food analysis. We aimed to systematically review the available evidence on the validity and accuracy of AI-based dietary intake assessment methods (AI-DIA). In accordance with PRISMA guidelines, an exhaustive search of the EMBASE, PubMed, Scopus and Web of Science databases was conducted to identify relevant publications from inception to 1 December 2024. Thirteen studies met the inclusion criteria and were included in this analysis. Of the studies identified, 61.5% were conducted in preclinical settings; 46.2% used AI techniques based on deep learning and 15.3% on machine learning. Six articles reported correlation coefficients above 0.7 between AI-based and traditional assessment methods for calorie estimation; six studies reported correlations above 0.7 for macronutrients, and four achieved the same for micronutrients. A moderate risk of bias was observed in 61.5% (n = 8) of the articles analysed, with confounding bias the most frequently observed. AI-DIA methods are promising, reliable and valid alternatives for nutrient and food estimation. However, more research comparing different populations, with larger sample sizes, is needed to ensure the validity of the experimental designs.
This study aimed to design and validate a measurement tool in Turkish to assess the challenges perceived by individuals involved in the disaster response process, such as volunteers, health care personnel, firefighters, and members of nongovernmental organizations (NGOs).
Methods:
This methodological study was conducted from November 2023 through March 2024. The scale development process comprised item development, expert review, and language control, followed by creation of a draft survey, pilot testing, administration of the final scale, and statistical analyses. All stages, including the validity and reliability analyses, were conducted in Turkish. Reliability analysis used Cronbach’s alpha, item-total correlations, intraclass correlation coefficients, test-retest reliability, Tukey’s test of additivity, and Hotelling’s T-squared test; validity analysis comprised Exploratory and Confirmatory Factor Analyses (EFA/CFA). AMOS 22.0 and SPSS 22.0 were used to perform the statistical analyses.
Results:
Findings indicated six dimensions with 23 items, with factor loadings ranging from 0.478 to 0.881. The CFA demonstrated acceptable fit indices. Test-retest analysis showed a robust positive correlation (r = 0.962) between the measurements. The scale’s total Cronbach’s alpha coefficient was 0.913. Sub-dimension reliability scores were calculated as follows: 0.865 for environmental and health, 0.802 for communication and information, 0.738 for organizational, 0.728 for logistical, 0.725 for individual, and 0.809 for other factors.
Conclusions:
This study showed that the Perceived Challenges in Disaster Response Scale (PCDRS), developed and validated in Turkish, is a reliable and valid measurement tool. It offers a foundation for understanding the challenges faced by disaster response teams and for formulating improvement strategies.
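Two of the reliability statistics reported for the PCDRS, Cronbach's alpha and the test-retest correlation, are straightforward to compute directly. As a minimal illustrative sketch (not the authors' code), assuming item scores are held as plain Python lists, one list per item with respondents aligned by index:

```python
def _var(xs):
    """Population variance; the n vs n-1 choice cancels in the ratios below."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

def cronbach_alpha(items):
    """Cronbach's alpha for k item columns scored on the same n respondents."""
    k = len(items)
    n = len(items[0])
    totals = [sum(col[i] for col in items) for i in range(n)]
    return (k / (k - 1)) * (1 - sum(_var(col) for col in items) / _var(totals))

def pearson_r(x, y):
    """Pearson correlation, e.g. between test and retest total scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / (n * _var(x) * n * _var(y)) ** 0.5
```

Alpha approaches 1 as items covary more strongly relative to their individual variances, which is why highly redundant item sets inflate it.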
Trust in the validity of published work is of fundamental importance to scientists. Confirmation of validity is more readily attained than addressing the question of whether fraud was involved. Suggestions are made for key stakeholders (institutions and companies, journals, and funders) as to how they might enhance trust in science, both by accelerating the assessment of data validity and by segregating that effort from the investigation of allegations of fraud.
A prominent paradigm demonstrates that many White Americans respond negatively to information about their declining population share. But this paradigm considers this “racial shift” in a single hierarchy-challenging context that produces similar status-threat responses across conceptually distinct outcomes, undercutting the ability both to explain the causes of Whites’ social and political responses and to advance theorizing about native majorities’ responses to demographic change. We test whether evidence for Whites’ responses to demographic change varies across three distinct hierarchy-challenging contexts: society at large, culture, and politics. We find little evidence that any racial-shift information instills status threat or otherwise changes attitudes or behavioral intentions, and we do not replicate evidence for reactions diverging by left- versus right-wing political attachments. We conclude with what our well-powered (n = 2,100) results suggest about a paradigm and intervention used prominently, with results cited frequently, to understand native majorities’ responses to demographic change and potential challenges to multiracial democracy.
Systematic searches of published literature are a vital component of systematic reviews. When search strings are not “sensitive,” they may miss many relevant studies, limiting, or even biasing, the range of evidence available for synthesis. Concerningly, conducting and reporting evaluations (validations) of the sensitivity of the search strings used is rare, according to our survey of published systematic reviews and protocols. Potential reasons include a lack of familiarity with, or the inaccessibility of, complex sensitivity evaluation approaches. We first clarify the main concepts and principles of search string evaluation. We then present a simple procedure for estimating the relative recall of a search string, based on a pre-defined set of “benchmark” publications. The relative recall, that is, the sensitivity of the search string, is the retrieval overlap between the evaluated search string and a search string that captures only the benchmark publications. If there is little overlap (i.e., low recall or sensitivity), the evaluated search string should be improved to ensure that most of the relevant literature can be captured. The presented benchmarking approach can be applied to one or more online databases or search platforms, and it is illustrated by five accessible, hands-on tutorials for commonly used online literature sources. Overall, our work provides an assessment of the current state of search string evaluations in published systematic reviews and protocols, and it paves the way to improved evaluation and reporting practices that make evidence synthesis more transparent and robust.
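The relative-recall estimate described here reduces to a small set computation once search results are exported. A minimal sketch, assuming each publication is identified by a unique ID such as a DOI (the function and variable names are illustrative):

```python
def relative_recall(retrieved_ids, benchmark_ids):
    """Share of the pre-defined benchmark publications that the evaluated
    search string retrieved -- its estimated sensitivity."""
    benchmark = set(benchmark_ids)
    if not benchmark:
        raise ValueError("benchmark set must not be empty")
    return len(set(retrieved_ids) & benchmark) / len(benchmark)
```

If the value is low, the evaluated search string is missing known-relevant studies and should be broadened and re-run before the review proceeds.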
The philosophy of science suggests that, on a fundamental level, a scientific theory is only a good theory to the extent that it fulfils a set of basic criteria of adequacy. The study of the predictive mind should thus benefit from an examination and evaluation of the extent to which theories of prediction adhere to these ground rules. This chapter elucidates six reasonable criteria that are useful for assessing the merit of a theory. These criteria are far from perfect benchmarks but, considered as a whole, provide a useful guideline for evaluating theories of prediction. The six criteria, applied to theories of prediction in the remainder of the book, are: parsimony and simplicity, theoretical precision and mechanistic specificity, testability and predictive power, falsifiability, test of time, and utility. The credibility of a scientific theory is also intrinsically connected to the credibility of the experimental evidence supporting it. This book uses three criteria that provide good benchmarks: the reliability, generalizability, and validity of the experimental evidence that has been collected.
While national rules regarding the scope, availability and issuance of utility models vary from country to country, most utility model regimes offer protection for tangible products, with many, but not all, jurisdictions excluding processes, biological materials and computer software from the scope of protection. The duration of utility model protection ranges from five to fifteen years, with most countries offering ten years of protection. In most countries, utility model applications are not formally examined and must simply disclose the product in question. Given the lack of examination, obtaining utility models is generally viewed as faster and cheaper than obtaining patents. This combination of speed and cost, in theory, makes utility models potentially attractive to small and medium enterprises (SMEs) that cannot afford to obtain full patent protection. Similar considerations have also been raised as advantageous to innovators in low-income countries.
The global landscape for existing utility model rights is a helpful starting point to the discussion on utility model innovation policy at the country-level as well as firm strategy. WIPO data indicates that approximately 3.0 million utility model applications were filed globally in 2022, a growth rate of 2.9% from the previous year and close to the global total of 3.5 million applications for standard patents. Only about one-half of the world’s countries provide for utility model systems, yet companies from around the world acquire these rights. Utility models are important players in the IP environment, and the unique qualities of the system and differential representation require specific analysis. In this chapter, we review existing empirical data and present additional data regarding UM filings and litigation worldwide. Our purpose is to provide background and context for the more detailed discussion in the remaining chapters in this book.
Psychopathology assessed across the lifespan can often be summarized with a few broad dimensions: internalizing, externalizing, and psychosis/thought disorder. Extensive overlap between internalizing and externalizing symptoms has garnered interest in bifactor models comprising a general co-occurring factor and specific internalizing and externalizing factors. We focus on internalizing and externalizing symptoms and compare a bifactor model to a correlated two-factor model of psychopathology at three timepoints in a large adolescent community sample (N = 387; 55% female; 83% Caucasian; mean age = 12.1 at wave 1) using self- and parent-reports. Each model was tested within each timepoint with 25–28 validators. The bifactor models demonstrated better fit to the data. Child report had stronger invariance across time. Parent report had stronger reliability over time. Cross-informant correlations between the factors at each wave indicated that the bifactor model had slightly poorer convergent validity but stronger discriminant validity than the two-factor model. With notable exceptions, this pattern of results replicated across informants and waves. The overlap between internalizing and externalizing pathology is systematically and, sometimes, non-linearly related to risk factors and maladaptive outcomes. Strengths and weaknesses of modeling psychopathology as two or three factors, along with clinical and developmental design implications, are discussed.
Spiritual care is essential for the health and well-being of patients and their families, so nursing and midwifery students should have professional competency in this field.
Objectives
The present study aimed to translate the Spiritual Care Competency Self-Assessment Tool for nursing and midwifery students into Persian and evaluate its psychometric properties.
Methods
This study used a methodological design.
Measures
The present study was conducted from July 4 to November 19, 2023, at the Faculty of Nursing and Midwifery in western Iran. The tool was translated into Persian using the forward-backward translation method. Construct validity was examined using exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) with a total of 536 nursing and midwifery students. Internal consistency was assessed using Cronbach’s alpha coefficient, and the reliability of the tool was further evaluated using the test–retest method. SPSS version 26 and LISREL version 8 were used for the analyses.
Results
Face and content validity were confirmed quantitatively and qualitatively. The results of EFA and CFA supported a tool with 4 factors and 28 items. CFA results indicated a well-fitting model (comparative fit index [CFI] = .97, non-normed fit index [NNFI] = .92, goodness-of-fit index [GFI] = .91, root mean square error of approximation [RMSEA] = .05, standardized root mean square residual [SRMR] = .046). Pearson’s correlation coefficients confirmed significant relationships among the items, subscales, and the total scale. Cronbach’s alpha (.968) and test–retest reliability (.867) confirmed the reliability of the Persian version of the tool.
Conclusion
The present study showed that the Persian version of the EPICC Spiritual Care tool, with 4 factors and 28 items, has acceptable psychometric properties according to COSMIN criteria. Overall, the results indicate that the Persian version is a valid and reliable instrument that students, preceptors, and educators can use in clinical settings as a practical way of discussing and evaluating spiritual care competency in Iran.
The main principles underpinning measurement for healthcare improvement are outlined in this Element. Although there is no single formula for achieving optimal measurement to support improvement, a fundamental principle is the importance of using multiple measures and approaches to gathering data. Using a single measure falls short in capturing the multifaceted aspects of care across diverse patient populations, as well as all the intended and unintended consequences of improvement interventions within various quality domains. Even within a single domain, improvement efforts can succeed in several ways and go wrong in others. Therefore, a family of measures is usually necessary. Clearly communicating a plausible theory outlining how an intervention will lead to desired outcomes informs decisions about the scope and types of measurement used. Improvement teams must tread carefully to avoid imposing undue burdens on patients, clinicians, or organisations. This title is also available as Open Access on Cambridge Core.
This study aims to validate the Palliative and Complex Chronic Pediatric Patients QoL Inventory (PACOPED QL), a new quality-of-life (QoL) assessment tool for pediatric palliative patients with complex chronic conditions. The goal is to create a comprehensive and inclusive instrument tailored to this unique population, addressing the gap in existing tools that do not meet these specific needs.
Methods
The validation process included a literature review and consultations with experts. A pilot study refined the items, followed by a cross-sectional study involving pediatric palliative patients and their caregivers. Statistical analyses, such as Cronbach’s alpha for internal consistency and exploratory factor analysis for structural validity, were utilized.
Results
The PACOPED QL, comprising 50 items across 8 domains and 6 subdomains, demonstrated strong reliability with Cronbach’s alpha and Guttman split-half reliability both exceeding .9. Validity assessments confirmed its suitability for children with complex illnesses. The tool was refined through expert consultations and pilot testing, reducing items from an initial 85 to a final 50, ensuring relevance and clarity.
Significance of results
The PACOPED QL shows strong reliability and validity in assessing QoL in pediatric palliative patients. Its comprehensive structure makes it a promising tool for clinical practice and research, addressing a critical need for a tailored assessment in this population. The instrument’s robust psychometric properties indicate its potential utility in improving the QoL assessment and care for children with life-threatening illnesses. Further studies are encouraged to confirm its effectiveness across various settings.
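The Guttman split-half coefficient reported for the PACOPED QL can be computed directly from the two half-scale totals. A minimal illustrative sketch (not the authors' code), assuming each argument holds one half's total scores for the same respondents:

```python
def guttman_split_half(half_a, half_b):
    """Guttman's split-half reliability (lambda-4) from two lists of
    half-scale total scores for the same respondents."""
    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    total = [a + b for a, b in zip(half_a, half_b)]
    # Reliability is high when the halves' joint variance dominates
    # the sum of their separate variances (i.e., the halves covary).
    return 2 * (1 - (var(half_a) + var(half_b)) / var(total))
```

Because the coefficient depends on how the items are split, researchers often report it alongside alpha, as done here, rather than in place of it.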
A recent debate on implicit measures of racial attitudes has focused on the relative roles of the person, the situation, and their interaction in determining the measurement outcomes. The chapter describes process models for assessing the roles of the situation and the person-situation interaction on the one hand and stable person-related components on the other hand in implicit measures. Latent state-trait models allow one to assess to what extent the measure is a reliable measure of the person and/or the situation and the person-situation interaction (Steyer, Geiser, & Fiege, 2012). Moreover, trait factor scores as well as situation-specific residual factor scores can be computed and related to third variables, thereby allowing one to assess to what extent the implicit measure is a valid measure of the person and/or the situation and the person-situation interaction. These methods are particularly helpful when combined with a process decomposition of implicit-measure data such as a diffusion-model analysis of the IAT (Klauer, Voss, Schmitz, & Teige-Mocigemba, 2007).
P. M. Bentler has shown that Rao's canonical factor analysis is in effect a psychometric analysis, leading to factors that are maximally assessable from the data. He contrasts this with Kaiser and Caffrey's alpha factor analysis, which leads to factors that maximally represent the true factors in the content domain. Noting the problems associated with factors that may be highly assessable but not very representative, or vice versa, Bentler suggests the need for a technique that would, insofar as possible, be optimal with respect to both criteria. Such a technique is presented here and is shown to resolve into a traditional scaling method, which in turn acquires a richer psychometric interpretation.
Adequate measurement of psychological phenomena is a fundamental aspect of theory construction and validation. Forming composite scales from individual items has a long and honored tradition, although, for predictive purposes, the power of using individual items should be considered. We outline several fundamental steps in the scale construction process, including (1) choosing between prediction and explanation; (2) specifying the construct(s) to measure; (3) choosing items thought to measure these constructs; (4) administering the items; (5) examining the structure and properties of composites of items (scales); (6) forming, scoring, and examining the scales; and (7) validating the resulting scales.
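Step (5), examining the properties of item composites, commonly rests on corrected item-total correlations: each item correlated with the sum of the remaining items, so the item does not inflate its own criterion. A minimal sketch in plain Python (the function names are illustrative):

```python
def pearson_r(x, y):
    """Pearson correlation between two aligned score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def corrected_item_total(items):
    """For each item column, its Pearson r with the sum of all *other* items."""
    n = len(items[0])
    rs = []
    for j, col in enumerate(items):
        rest = [sum(other[i] for k, other in enumerate(items) if k != j)
                for i in range(n)]
        rs.append(pearson_r(col, rest))
    return rs
```

Items with low corrected item-total correlations are candidates for removal at step (6), before the scales are scored and validated.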
This chapter focuses on experimental designs, in which one or more factors are randomly assigned and manipulated. The first topic is statistical power or the likelihood of obtaining a significant result, which depends on several aspects of design. Second, the chapter examines the factors (independent variables) in a design, including the selection of levels of a factor and their treatment as fixed or random, and then dependent variables, including the selection of items, stimuli, or other aspects of a measure. Finally, artifacts and confounds that can affect the validity of results are addressed, as well as special designs for studying mediation. A concluding section raises the possibility that traditional conceptualizations of design – generally focusing on a single study and on the question of whether a manipulation has an effect – may be inadequate in the current world where multiple-study research programs are the more meaningful unit of evidence, and mediational questions are often of primary interest.
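The power of the simplest such design, two randomly assigned groups compared on one dependent variable, can be approximated in closed form. A minimal sketch using the standard normal approximation (the negligible opposite-tail term is dropped); the function name and parameterization by Cohen's d with equal group sizes are illustrative assumptions:

```python
from statistics import NormalDist

def approx_power_two_group(d, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group mean comparison:
    Phi(d * sqrt(n/2) - z_{1 - alpha/2}), via the normal approximation."""
    nd = NormalDist()
    z_crit = nd.inv_cdf(1 - alpha / 2)           # critical value
    noncentrality = d * (n_per_group / 2) ** 0.5  # expected test statistic
    return nd.cdf(noncentrality - z_crit)
```

Under these assumptions, detecting a medium effect (d = 0.5) with 64 participants per group yields power near the conventional .80 benchmark, which illustrates why the chapter treats power as a design decision rather than an afterthought.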
The procedure for a preliminary ruling is central in the ‘complete system of remedies’ offered by the Union to its citizens. Since Article 263 TFEU grants only a very reduced standing to ‘non-privileged applicants’, Article 267 TFEU became the main gate for individuals to bring their claims against the EU before the European Court of Justice. Yet, claims for breaches of fundamental rights by the Union are not at all common in the procedure for a preliminary ruling. This chapter investigates the (real) use and (realistic) potential of Article 267 TFEU as a means for the protection of fundamental rights against breaches by the EU institutions. The chapter maps all instances in which individuals used the procedure for a preliminary ruling to bring a claim against the Union for breaches of their fundamental rights since the coming into force of the Treaty of Lisbon. Using this mapping exercise, the chapter identifies how individuals raise this type of claims in the procedure, discusses the accessibility of the procedure for individual applicants, and assesses the shortcomings of the procedure as a means to redress breaches of fundamental rights by the Union. It argues that these shortcomings have to do with the structure and design of the procedure itself.
Parrots are popular companion animals but show prevalent and at times severe welfare issues. Nonetheless, there are no scientific tools available to assess parrot welfare. The aim of this systematic review was to identify valid and feasible outcome measures that could be used as welfare indicators for companion parrots. From 1,848 peer-reviewed studies retrieved, 98 met our inclusion and exclusion criteria (e.g. experimental studies, captive parrots). For each outcome collected, validity was assessed based on the statistical significance reported by the authors, as other validity parameters were rarely provided for evaluation. Feasibility was assigned by considering the need for specific instruments, veterinary-level expertise or handling the parrot. A total of 1,512 outcomes were evaluated, of which 572 had a significant P-value and were considered feasible. These included changes in behaviour (e.g. activity level, social interactions, exploration), body measurements (e.g. body weight, plumage condition) and abnormal behaviours, amongst others. Many physical and physiological parameters were identified that either require experimental validation, or veterinary-level skills and expertise, limiting their potential use by parrot owners themselves. Moreover, a high risk of bias undermined the internal validity of these outcomes, while a strong taxonomic bias, a predominance of studies on parrots in laboratories, and an underrepresentation of companion parrots jeopardised their external validity. These results provide a promising starting point for validating a set of welfare indicators in parrots.
This chapter is written for conversation analysts and is methodological. It discusses, in a step-by-step fashion, how to code practices of action (e.g., particles, gaze orientation) and/or social actions (e.g., inviting, information seeking) for purposes of their statistical association in ways that respect conversation-analytic (CA) principles (e.g., the prioritization of social action, the importance of sequential position, order at all points, the relevance of codes to participants). As such, this chapter focuses on coding as part of engaging in basic CA and advancing its findings, for example as a tool of both discovery and proof (e.g., regarding action formation and sequential implicature). While not its main focus, this chapter should also be useful to analysts seeking to associate interactional variables with demographic, social-psychological, and/or institutional-outcome variables. The chapter’s advice is grounded in case studies of published CA research utilizing coding and statistics (e.g., those of Gail Jefferson, Charles Goodwin, and the present author). These case studies are elaborated by discussions of cautions when creating code categories, inter-rater reliability, the maintenance of a codebook, and the validity of statistical association itself. Both misperceptions and limitations of coding are addressed.
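Inter-rater reliability for coded action categories, discussed above, is commonly summarized with Cohen's kappa, which corrects raw agreement for chance. A minimal sketch for two raters (illustrative, not tied to any particular CA study):

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters'
    category labels for the same sequence of coded segments."""
    n = len(rater_a)
    # Observed agreement: proportion of segments coded identically.
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement from each rater's marginal label rates.
    cats = set(rater_a) | set(rater_b)
    p_exp = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in cats)
    return (p_obs - p_exp) / (1 - p_exp)
```

Kappa of 1 indicates perfect agreement and 0 indicates chance-level agreement; codebooks are typically revised and raters retrained until kappa is acceptably high before the codes enter any statistical association.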