In experimental economics, where subjects participate in different sessions, observations on subjects within a given session might exhibit more correlation than observations on subjects in different sessions. The main goal of this paper is to clarify what session effects are: what can cause them, what forms they can take, and what problems they pose. It will be shown that standard solutions are at times inadequate, and that their properties are sometimes misunderstood.
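A minimal sketch of the underlying issue, in Python with NumPy, is given below; the variance components, session sizes, and the use of session means for the cluster-aware standard error are illustrative assumptions, not details taken from the paper. It simulates a shock shared within each session and contrasts a naive standard error, which treats all subjects as independent, with one based on the session means.

```python
import numpy as np

rng = np.random.default_rng(0)

n_sessions, subjects_per_session = 10, 20
sigma_session, sigma_subject = 1.0, 1.0   # assumed variance components

# Each subject's observation = a shock shared by the whole session + individual noise.
session_effect = rng.normal(0.0, sigma_session, n_sessions)
session_id = np.repeat(np.arange(n_sessions), subjects_per_session)
y = session_effect[session_id] + rng.normal(0.0, sigma_subject, n_sessions * subjects_per_session)

# Intra-session correlation implied by the variance components.
icc = sigma_session**2 / (sigma_session**2 + sigma_subject**2)

# Naive standard error of the mean (treats all subjects as independent) versus a
# session-level one that treats the session means as the independent observations.
naive_se = y.std(ddof=1) / np.sqrt(y.size)
session_means = np.array([y[session_id == s].mean() for s in range(n_sessions)])
cluster_se = session_means.std(ddof=1) / np.sqrt(n_sessions)

print(f"intra-session correlation: {icc:.2f}")
print(f"naive SE of the mean:      {naive_se:.3f}")
print(f"session-based SE:          {cluster_se:.3f}")
```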
The data volumes generated by the Widefield ASKAP L-band Legacy All-sky Blind surveY atomic hydrogen (Hi) survey using the Australian Square Kilometre Array Pathfinder (ASKAP) necessitate greater and more reliable automation in the task of source finding and cataloguing. To this end, we introduce and explore a novel deep learning framework for detecting low signal-to-noise ratio (SNR) Hi sources in an automated fashion. Specifically, our proposed method provides an automated process for separating true Hi detections from false positives when used in combination with the candidate catalogues output by the source-finding application. Leveraging the spatial and depth capabilities of 3D convolutional neural networks, our method is specifically designed to recognize patterns and features in three-dimensional space, making it uniquely suited for rejecting false-positive sources generated by conventional linear methods in low-SNR scenarios. As a result, our approach is significantly more accurate in source detection and results in considerably fewer false detections compared to previous linear statistics-based source-finding algorithms. Performance tests using mock galaxies injected into real ASKAP data cubes reveal our method’s capability to achieve near-100% completeness and reliability at a relatively low integrated SNR $\sim3-5$. An at-scale version of this tool will help maximise the science output from the upcoming widefield Hi surveys.
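As a rough illustration of the kind of classifier described, and not the authors' network, the sketch below defines a small 3D convolutional network in PyTorch that scores a candidate cubelet as a true detection or a false positive; the layer sizes and the 32×32×32 cubelet shape are assumptions made only for this example.

```python
import torch
import torch.nn as nn

class CandidateClassifier3D(nn.Module):
    """Toy 3D CNN scoring an HI candidate cubelet as real source vs. false positive.
    Layer sizes and the input cubelet shape are illustrative assumptions only."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 8 * 8 * 8, 64), nn.ReLU(),
            nn.Linear(64, 1),  # logit for P(true detection)
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# One candidate cubelet of shape (channels, velocity, y, x) = (1, 32, 32, 32).
cubelet = torch.randn(1, 1, 32, 32, 32)
model = CandidateClassifier3D()
prob_real = torch.sigmoid(model(cubelet))
print(prob_real.item())
```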
Correspondence analysis can be described as a technique which decomposes the departure from independence in a two-way contingency table. In this paper a form of correspondence analysis is proposed which decomposes the departure from the quasi-independence model. This form seems to be a good alternative to ordinary correspondence analysis in cases where the use of the latter is either impossible or not recommended, for example, in the case of missing data or structural zeros. It is shown that Nora's reconstitution of order zero, a procedure well-known in the French literature, is formally identical to our correspondence analysis of incomplete tables. Therefore, reconstitution of order zero can also be interpreted as providing a decomposition of the residuals from the quasi-independence model. Furthermore, correspondence analysis of incomplete tables can be performed using existing programs for ordinary correspondence analysis.
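A minimal NumPy sketch of the idea is shown below, using an invented 4×4 table with structural zeros on the diagonal: the quasi-independence model is fitted by iterative proportional fitting on the non-structural cells, and the standardized residuals are then decomposed by a singular value decomposition. The table, the fixed iteration count, and the Pearson-type residuals are illustrative choices, not details taken from the paper.

```python
import numpy as np

# Invented 4x4 table with structural zeros on the diagonal (mask == 0 there).
N = np.array([[ 0., 12.,  5.,  3.],
              [ 8.,  0.,  9.,  4.],
              [ 6.,  7.,  0., 10.],
              [ 2.,  5.,  8.,  0.]])
mask = 1.0 - np.eye(4)

# Quasi-independence: expected counts m_ij = a_i * b_j on non-structural cells only,
# fitted by iterative proportional fitting to the observed margins.
m = mask.copy()
for _ in range(200):
    m *= N.sum(axis=1, keepdims=True) / m.sum(axis=1, keepdims=True)
    m *= N.sum(axis=0, keepdims=True) / m.sum(axis=0, keepdims=True)

# Decompose the departure from quasi-independence via Pearson-type residuals.
resid = np.where(mask > 0, (N - m) / np.sqrt(np.where(m > 0, m, 1.0)), 0.0)
U, s, Vt = np.linalg.svd(resid)
print("singular values of the residual matrix:", np.round(s, 3))
```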
Various models for sets of congeneric tests are considered, including models appropriate for the analysis of multitrait-multimethod data. All models are illustrated with real data. The special cases when two or more tests within a set are tau-equivalent or parallel are also considered. All data analyses are done within the framework of a general model by Jöreskog [1970].
In van der Heijden and de Leeuw (1985) it was proposed to use loglinear analysis to detect interactions in a multiway contingency table, and to explore the form of these interactions with correspondence analysis. Here we show how the results found in this exploratory phase of the analysis can subsequently be used for confirmation.
A computer program can be a means of communicating the structure of an algorithm as well as a tool for data analysis. From this perspective high-level matrix-oriented languages like PROC MATRIX in the SAS system are especially useful because of their readability and compactness. An algorithm for the joint analysis of dissimilarity and preference data using maximum likelihood estimation is presented in PROC MATRIX code.
Loglinear analysis and correspondence analysis provide us with two different methods for the decomposition of contingency tables. In this paper we will show that there are cases in which these two techniques can be used complementary to each other. More specifically, we will show that often correspondence analysis can be viewed as providing a decomposition of the difference between two matrices, each following a specific loglinear model. Therefore, in these cases the correspondence analysis solution can be interpreted in terms of the difference between these loglinear models. A generalization of correspondence analysis, recently proposed by Escofier, will also be discussed. With this decomposition, which includes classical correspondence analysis as a special case, it is possible to use correspondence analysis complementary to loglinear analysis in more instances than those described for classical correspondence analysis. In this context correspondence analysis is used for the decomposition of the residuals of specific restricted loglinear models.
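The simplest instance of this complementary use, classical correspondence analysis viewed as a decomposition of the difference between the observed table and its fit under the independence loglinear model, can be sketched in a few lines of NumPy; the table below is invented for illustration.

```python
import numpy as np

# Observed two-way table (toy data).
N = np.array([[20., 10.,  5.],
              [10., 25., 15.],
              [ 5., 15., 30.]])

P = N / N.sum()                       # correspondence matrix
r, c = P.sum(axis=1), P.sum(axis=0)   # row and column margins

# Matrix expected under the independence loglinear model.
E = np.outer(r, c)

# Classical CA: SVD of the margin-weighted difference P - E (standardized residuals).
S = (P - E) / np.sqrt(np.outer(r, c))
U, sv, Vt = np.linalg.svd(S)

row_coords = (U[:, :2] * sv[:2]) / np.sqrt(r)[:, None]   # principal row coordinates
print("inertia per dimension:", np.round(sv**2, 4))
print("row coordinates:\n", np.round(row_coords, 3))
```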
This paper is concerned with the development of a measure of the precision of a multidimensional Euclidean structure. The measure is a precision index for each point in the structure, assuming that all the other points are precisely located. The measure is defined and two numerical methods are presented for its calculation. A small Monte Carlo study of the measure's behavior is performed and the findings are discussed.
A method is discussed which extends principal components analysis to the situation where the variables may be measured at a variety of scale levels (nominal, ordinal or interval), and where they may be either continuous or discrete. There are no restrictions on the mix of measurement characteristics and there may be any pattern of missing observations. The method scales the observations on each variable within the restrictions imposed by the variable's measurement characteristics, so that the deviation from the principal components model for a specified number of components is minimized in the least squares sense. An alternating least squares algorithm is discussed. An illustrative example is given.
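A heavily simplified sketch of the alternating least squares idea, restricted to numerical data with missing observations and omitting the optimal-scaling step that handles nominal and ordinal variables, might look as follows in Python with NumPy; the data are simulated and the fixed iteration count is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data matrix with missing entries (NaN): 20 observations, 5 variables.
X = rng.normal(size=(20, 5))
X[rng.random(X.shape) < 0.15] = np.nan
observed = ~np.isnan(X)

n_components = 2
Xf = np.where(observed, X, 0.0)          # start by filling missing cells with 0

# Alternating least squares: alternate between (a) a low-rank fit via SVD and
# (b) replacing only the missing cells by their current model reconstruction.
# (The published method additionally rescales each variable within its
#  measurement restrictions; that optimal-scaling step is omitted here.)
for _ in range(100):
    mean = Xf.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xf - mean, full_matrices=False)
    low_rank = (U[:, :n_components] * s[:n_components]) @ Vt[:n_components] + mean
    Xf = np.where(observed, X, low_rank)

loss = ((X - low_rank)[observed] ** 2).sum()
print(f"least-squares loss on observed cells: {loss:.3f}")
```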
Multidimensional scaling has recently been enhanced so that data defined at only the nominal level of measurement can be analyzed. The efficacy of ALSCAL, an individual differences multidimensional scaling program which can analyze data defined at the nominal, ordinal, interval and ratio levels of measurement, is the subject of this paper. A Monte Carlo study is presented which indicates that (a) if we know the correct level of measurement then ALSCAL can be used to recover the metric information presumed to underlie the data; and that (b) if we do not know the correct level of measurement then ALSCAL can be used to determine the correct level and to recover the underlying metric structure. This study also indicates, however, that with nominal data ALSCAL is quite likely to obtain solutions which are not globally optimal, and that in these cases the recovery of metric structure is quite poor. A second study is presented which isolates the potential cause of these problems and forms the basis for a suggested modification of the ALSCAL algorithm which should reduce the frequency of locally optimal solutions.
A method is developed to investigate the additive structure of data that (a) may be measured at the nominal, ordinal or cardinal levels, (b) may be obtained from either a discrete or continuous source, (c) may have known degrees of imprecision, or (d) may be obtained in unbalanced designs. The method also permits experimental variables to be measured at the ordinal level. It is shown that the method is convergent, and includes several previously proposed methods as special cases. Both Monte Carlo and empirical evaluations indicate that the method is robust.
A new procedure is discussed which fits either the weighted or simple Euclidean model to data that may (a) be defined at the nominal, ordinal, interval or ratio levels of measurement; (b) have missing observations; (c) be symmetric or asymmetric; (d) be conditional or unconditional; (e) be replicated or unreplicated; and (f) be continuous or discrete. Various special cases of the procedure include the most commonly used individual differences multidimensional scaling models, the familiar nonmetric multidimensional scaling model, and several other previously undiscussed variants.
The procedure optimizes the fit of the model directly to the data (not to scalar products determined from the data) by an alternating least squares procedure which is convergent, very quick, and relatively free from local minimum problems.
The procedure is evaluated via both Monte Carlo and empirical data. It is found to be robust in the face of measurement error, capable of recovering the true underlying configuration in the Monte Carlo situation, and capable of obtaining structures equivalent to those obtained by other less general procedures in the empirical situation.
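For orientation only, the following NumPy sketch minimizes a squared-distance (SSTRESS-like) loss for a simple Euclidean configuration by plain gradient descent on simulated dissimilarities; it is not the alternating least squares procedure itself and omits optimal scaling, weights, and individual differences.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated data: pairwise dissimilarities among 6 objects that happen to be
# exactly Euclidean in two dimensions.
n, p = 6, 2
truth = rng.normal(size=(n, p))
delta = np.linalg.norm(truth[:, None] - truth[None, :], axis=-1)

# Minimize SSTRESS = sum_{i<j} (d_ij^2 - delta_ij^2)^2 over configurations X by
# plain gradient descent (ALSCAL itself uses alternating least squares and also
# optimally rescales the data; neither refinement is reproduced here).
X = rng.normal(size=(n, p))
lr = 2e-4
for _ in range(5000):
    diff = X[:, None] - X[None, :]          # pairwise coordinate differences
    d2 = (diff ** 2).sum(-1)                # squared configuration distances
    resid = d2 - delta ** 2
    X -= lr * 4.0 * (resid[:, :, None] * diff).sum(axis=1)

d2 = ((X[:, None] - X[None, :]) ** 2).sum(-1)
print(f"final SSTRESS: {(np.triu(d2 - delta ** 2, 1) ** 2).sum():.4f}")
```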
It is reported that (1) a new coordinate estimation routine is superior to that originally proposed for ALSCAL; (2) an oversight in the interval measurement level case has been found and corrected; and (3) a new initial configuration routine is superior to the original.
Many documents are produced over the years of managing assets, particularly those with long lifespans. However, during this time, the assets may deviate from their original as-designed or as-built state. This presents a significant challenge for tasks that occur in later life phases but require precise knowledge of the asset, such as retrofit, where the asset is equipped with new components. For a third party who is neither the original manufacturer nor the operator, obtaining a comprehensive understanding of the asset can be a tedious process, as it requires going through all available but often fragmented information and documents. While common knowledge regarding the domain or general type of asset can be helpful, it is often based on the experience of engineers and is, therefore, only implicitly available. This article presents a graph-based information management system that complements traditional product lifecycle management (PLM) systems and helps connect these fragments by utilizing generic information about assets. To achieve this, techniques from systems engineering and data science are used. The overarching management platform also includes geometric analyses and operations that can be performed with the geometric and product information extracted from STEP files. While the management approach is first described generically, it is later applied to cabin retrofit in aviation. A mock-up of an Airbus A320 serves as a case study to demonstrate how the platform can benefit the retrofit of such long-living assets.
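A toy sketch of such a graph-based store, in Python with networkx, is given below; every node and edge label is invented for illustration and does not reflect the authors' data model. It links components extracted from STEP files, documents, and generic domain knowledge, then queries everything reachable from one component, the kind of fragment-connecting lookup a retrofit engineer would need.

```python
import networkx as nx

# Toy asset-information graph: components, documents, and generic domain knowledge
# as nodes, with edges recording how the fragments relate.  Every label here is
# invented for illustration and does not reflect the authors' data model.
G = nx.MultiDiGraph()

G.add_node("cabin_panel_42", kind="component", source="STEP file")
G.add_node("seat_rail_L3", kind="component", source="STEP file")
G.add_node("maintenance_report_2019_07", kind="document")
G.add_node("generic:A320_cabin_layout", kind="domain_knowledge")

G.add_edge("maintenance_report_2019_07", "cabin_panel_42", relation="mentions")
G.add_edge("cabin_panel_42", "seat_rail_L3", relation="attached_to")
G.add_edge("generic:A320_cabin_layout", "cabin_panel_42", relation="typical_location")

# Example lookup: everything reachable from one component, i.e. the fragments an
# engineer would want to see about "cabin_panel_42" before a retrofit.
reachable = nx.single_source_shortest_path_length(G.to_undirected(), "cabin_panel_42")
print(sorted(reachable))
```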
Confounding refers to a mixing or muddling of effects that can occur when the relationship we are interested in is confused by the effect of something else. It arises when the groups we are comparing are not completely exchangeable and so differ with respect to factors other than their exposure status. If one (or more) of these other factors is a cause of both the exposure and the outcome, then some or all of an observed association between the exposure and outcome may be due to that factor.
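A small simulation (Python with NumPy; the variable names and probabilities are invented) makes this concrete: a confounder causes both the exposure and the outcome, so the crude exposure–outcome association is non-zero even though the exposure has no effect, while stratifying on the confounder recovers the null.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

# Confounder C (e.g., age group) causes both the exposure and the outcome;
# the exposure itself has NO effect on the outcome in this simulation.
C = rng.binomial(1, 0.5, n)
E = rng.binomial(1, 0.2 + 0.5 * C)          # exposure more likely when C = 1
Y = rng.binomial(1, 0.1 + 0.3 * C)          # outcome more likely when C = 1

crude = Y[E == 1].mean() - Y[E == 0].mean()
adjusted = np.mean([Y[(E == 1) & (C == c)].mean() - Y[(E == 0) & (C == c)].mean()
                    for c in (0, 1)])
print(f"crude risk difference:      {crude:.3f}")     # spuriously non-zero
print(f"C-adjusted risk difference: {adjusted:.3f}")  # close to the true null of 0
```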
While there are cases where it is straightforward and unambiguous to define a network from given data, a researcher must often make choices about how to define the network, and those choices, which precede most of the work of analyzing the network, have outsized consequences for the subsequent analysis. Sitting between gathering the data and studying the network is the upstream task: how to define the network from the underlying or original data. Defining the network precedes all subsequent or downstream tasks, tasks we will focus on in later chapters. Often those tasks are the focus of network scientists, who take the network as a given and concentrate their efforts on methods using those data. Envision the upstream task by asking: what are the nodes? and what are the links? The network follows from those definitions. You will find these questions a useful guiding star as you work, and you can learn new insights by reevaluating their answers from time to time.
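As a toy illustration of how the answers to "what are the nodes?" and "what are the links?" shape the result, the same raw records (an invented attendance list) can be turned into either a bipartite person–event network or a projected person–person network; the sketch below uses Python with networkx.

```python
import networkx as nx
from networkx.algorithms import bipartite

# Raw data: who attended which event (an invented example).
attendance = [("Ana", "workshop"), ("Ben", "workshop"),
              ("Ana", "seminar"), ("Cat", "seminar"), ("Ben", "retreat")]

# Definition 1: keep both people and events as nodes (a bipartite network).
B = nx.Graph()
B.add_nodes_from({p for p, _ in attendance}, bipartite=0)
B.add_nodes_from({e for _, e in attendance}, bipartite=1)
B.add_edges_from(attendance)

# Definition 2: nodes are people only; a link means "attended an event together".
people = {p for p, _ in attendance}
P = bipartite.weighted_projected_graph(B, people)

print("bipartite:", B.number_of_nodes(), "nodes,", B.number_of_edges(), "edges")
print("projected:", P.number_of_nodes(), "nodes,", P.number_of_edges(), "edges")
```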
Drawing examples from real-world networks, this essential book traces the methods behind network analysis and explains how network data is first gathered, then processed and interpreted. The text will equip you with a toolbox of diverse methods and data modelling approaches, allowing you to quickly start making your own calculations on a huge variety of networked systems. This book sets you up to succeed, addressing the questions of what you need to know and what to do with it, when beginning to work with network data. The hands-on approach adopted throughout means that beginners quickly become capable practitioners, guided by a wealth of interesting examples that demonstrate key concepts. Exercises using real-world data extend and deepen your understanding, and develop effective working patterns in network calculations and analysis. Suitable for both graduate students and researchers across a range of disciplines, this novel text provides a fast-track to network data expertise.
Aging ships and offshore structures face harsh environmental and operational conditions in remote areas, leading to age-related damage such as corrosion wastage, fatigue cracking, and mechanical denting. These deteriorations, if left unattended, can escalate into catastrophic failures, causing casualties, property damage, and marine pollution. Hence, ensuring the safety and integrity of aging ships and offshore structures is paramount and achievable through innovative healthcare schemes. One such paradigm, digital healthcare engineering (DHE), initially introduced by the final coauthor, aims to provide lifetime healthcare for engineered structures, infrastructure, and individuals (e.g., seafarers) by harnessing advancements in digitalization and communication technologies. The DHE framework comprises five interconnected modules: on-site health parameter monitoring; data transmission to analytics centers; data analytics, simulation, and visualization via digital twins; artificial intelligence-driven diagnosis and remedial planning using machine and deep learning; and predictive health condition analysis for future maintenance. This article surveys recent technological advancements pertinent to each DHE module, with a focus on its application to aging ships and offshore structures. The primary objectives include identifying cost-effective and accurate techniques to establish a DHE system for lifetime healthcare of aging ships and offshore structures, a project currently in progress by the authors.
To better understand and prevent research errors, we conducted a first-of-its-kind scoping review of clinical and translational research articles that were retracted because of problems in data capture, management, and/or analysis.
Methods:
The scoping review followed a preregistered protocol and used retraction notices from the Retraction Watch Database in relevant subject areas, excluding gross misconduct. Abstracts of original articles published between January 1, 2011 and January 31, 2020 were reviewed to determine if articles were related to clinical and translational research. We reviewed retraction notices and associated full texts to obtain information on who retracted the article, types of errors, authors, data types, study design, software, and data availability.
Results:
After reviewing 1,266 abstracts, we reviewed 884 associated retraction notices and 786 full-text articles. Authors initiated the retraction over half the time (58%). Nearly half of retraction notices (42%) described problems generating or acquiring data, and 28% described problems with preparing or analyzing data. Among the full texts that we reviewed: 77% were human research; 29% were animal research; and 6% were systematic reviews or meta-analyses. Most articles collected data de novo (77%), but only 5% described the methods used for data capture and management, and only 11% described data availability. Over one-third of articles (38%) did not specify the statistical software used.
Conclusions:
Authors may improve scientific research by reporting methods for data capture and statistical software. Journals, editors, and reviewers should advocate for this documentation. Journals may help the scientific record self-correct by requiring detailed, transparent retraction notices.