9 – Cognitive Factors in Reading Medical Images: Thinking Processes in Image Interpretation







David Manning



9.1 Clinical Relevance


A range of human factors influence the accuracy of diagnostic decision making in medical imaging. Interpreting images, identifying disease status, and communicating findings are subject to cognitive errors and biases. Such biases can have a significant impact on subsequent patient management and on the economics of healthcare. Full knowledge of the interpretation task, the nature and likelihood of bias errors, and the conditions or activities that reduce their effect are likely to enhance the quality of diagnostic judgments and decisions.



Science is a way of trying not to fool yourself. The first principle is that you must not fool yourself, and you are the easiest person to fool (Richard Feynman).



9.2 Introduction


We are very familiar with the task of observing visual scenes. We make judgments about what the scenes mean to us, and we use those judgments to make decisions on actions. We are confident in the efficiency of vision; we feel we have complete control over it and that our judgments are fair and true to the information provided. But the brain does not always tell us the full story of what we have looked at (Elkins, 1996), and we are seldom aware when our impressions are illusions (Kahneman, 2011). So, it is not surprising that studies of radiological performance over half a century have shown there is significant disagreement in the diagnostic decisions made by different radiologists given the same visual data; and this difference is remarkably resistant to change despite continuous technical improvements in the quality of medical images (Samei, 2006). This is to say that observers make errors that are independent of the images. It has been estimated (Graber et al., 2005) that cognitive factors contribute to diagnostic error in 74% of known radiological mistakes, so it is right that we should make every effort to understand the problem and reduce the incidence.


Medical image interpretation has to do with recognizing an abnormality when it is present and normality when one is not. But because we cannot recognize things we do not already know, this recognition must start from experience. We are not simple object detectors, as Frith (2007) points out, and although computers can now beat world champions in chess tournaments, humans are still much better than machines at recognizing complex objects. Information theory and developments in computer science recognize that, although our brains have apparently solved such perceptual and cognitive problems, they are very difficult problems to solve in other realms; and consequently, we have an imperfect understanding of how cognition works in radiology.


However, we know that visual information processing and decision making are key, and that experts perform better than nonexperts. We have a body of knowledge of the principles of what it takes to achieve expert performance, and we know something of how it is developed in radiology. This knowledge is relevant to clinical outcomes and to how computer aids to diagnosis can enhance human performance. Studies in these areas give some ideas on the mental factors involved in the image-reading process.


Analysis of the visual task is central and identifies why some operators perform better than others, although other variables in physics and psychophysics also make their contribution (see Beutel et al. (2000) for a comprehensive treatment of this area of medical imaging).



9.3 Plan of the Chapter


Image interpretation is a problem-solving activity using visual reasoning, and this chapter aims to help understand how this is carried out and to highlight potential cognitive errors that reduce performance.




  • The first section treats the image as a source of information. The image reader must be clear about the precise nature of this information before deciding on its diagnostic meaning.



  • Image features are used in the interpretation process. Contrast and location illustrate perceptual characteristics needed to classify objects. These factors give an understanding of the image which is then communicated to the clinician for action.



  • A dual-process theory of reasoning suggests that readers recognize image features and make decisions on their meaning either very rapidly and without conscious thought or more slowly with deliberation and analysis. Probabilistic reasoning, including how we update our decisions based on new information, is discussed as a thinking strategy used by experts.



  • Image interpretation activities are subject to cognitive errors or biases relevant to clinical care, radiology training, and systems of working. These can be categorized and perhaps offset by a better understanding of their origins.



9.4 Overview of the Interpretation Task


Radiologists and pathologists do not usually have the benefit of direct observation of their patients. They rely instead on the interpretation of images that represent structural or functional features of their patients’ bodies. From these visual representations, they draw inferences about states of health or disease. Representation is a key word because the visual task is interpretation of the information recorded by the imaging method. This information is not the anatomy itself but an abstraction of its features. The shadow picture of an X-ray image is created from the spatial distribution of linear attenuation coefficients of an X-ray beam transmitted through a body, and a cytology slide is a scene of light transmission that may be augmented with dyes that select structures or features. Originally there was a limited range to the physical meanings behind the image features technology produced, but image information can now be highly diverse and as new imaging modalities come along so the diversity increases. Technical developments over the last few decades have added many new imaging methods using signals with entirely different information content concerning the human body and disease processes (e.g., thermography, optical coherence tomography).
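The statement that an X-ray image is formed from the distribution of linear attenuation coefficients can be illustrated with a short calculation. The sketch below assumes a monoenergetic beam and the standard exponential attenuation law, in which the transmitted fraction is exp(-Σ μᵢxᵢ) along a ray. The function name and the attenuation values are illustrative assumptions, not figures taken from this chapter.

```python
import math

def transmitted_fraction(layers):
    """Fraction of a monoenergetic beam transmitted along one ray.

    layers: iterable of (mu_per_cm, thickness_cm) pairs, where mu_per_cm is
    the linear attenuation coefficient of the material in that layer.
    """
    total_attenuation = sum(mu * thickness for mu, thickness in layers)
    return math.exp(-total_attenuation)

# Assumed, rounded coefficients chosen only for illustration.
MU_SOFT_TISSUE = 0.2  # per cm (hypothetical value)
MU_BONE = 0.5         # per cm (hypothetical value)

# Two neighboring rays through a 10 cm body part: one passes through
# soft tissue only, the other crosses 2 cm of bone on the way.
ray_soft_only = [(MU_SOFT_TISSUE, 10.0)]
ray_with_bone = [(MU_SOFT_TISSUE, 8.0), (MU_BONE, 2.0)]

through_soft = transmitted_fraction(ray_soft_only)
through_bone = transmitted_fraction(ray_with_bone)

print(f"soft tissue only: {through_soft:.4f}")
print(f"with bone:        {through_bone:.4f}")
```

The ray that crosses bone delivers less signal to the receptor, which is why the recorded "shadow" is a record of attenuation along each ray rather than a picture of the anatomy itself.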


Before interpretations of the image can be made, the observer needs to be clear regarding what the image features are physically representing. Medical images are not self-explanatory and require effort in their interpretation. Newcomers to radiology or pathology must first establish a knowledge of how the image was formed and then use that knowledge as a mental engine running in the background as the diagnostic problem is negotiated. Figure 9.1 illustrates an example of how the same structures can be targeted in an imaging examination (magnetic resonance imaging in this case) but displayed in three forms (T2-weighted, fluid-attenuated inversion recovery, T1-weighted gadolinium) to demonstrate different tissue characteristics.





Figure 9.1 Magnetic resonance imaging brain image at three acquisitions giving different contrast profiles of the same tissues. T2W, T2-weighted; FLAIR, fluid-attenuated inversion recovery; T1W Gd, T1-weighted gadolinium.


For skilled and experienced readers, the range of imaging methods is a positive benefit. It is common for several imaging methods to be used in the investigation of the same diagnostic problem, and the additive effect of combining two or more imaging methods outstrips the performance of any one used alone. This advantage is due to enhanced decision making and interpretation of the problem when more diverse data are made available to the reader.


Image features are representations of physical objects and not the objects themselves, so the decision maker must understand the nature of the representation. The informed observer can then go on to use the data in a problem-solving activity.


The image is not the only patient data to which the reader has access. The clinical history of the patient will also accompany the images and this may be read in conjunction with the image to clarify and guide the process. Equally informative are previous examinations. These are particularly important to the decision-making process when the imaging examination is a repeat of a previous study or is a follow-up on the progress of disease after an intervention. Increasingly, with access to electronic health records, radiologists and pathologists can include many other sources of information in their decision making, such as lab results, medications taken, and cardiac test results.


Available clinical information and previous examinations are assumed to be helpful, but there may be some circumstances where this is not the case. Swenson et al. (1985), for example, showed that there were some benefits in scrutinizing medical images without any prior knowledge of the possible clinical condition. The problem has been discussed in terms of the perceptual interference of extra data on the visual search process rather than decision making. Curtailing search activity when an expected lesion has been identified can have the effect of missing additional lesions. This has been noted as a distracting effect of clinical information and is reported to be like the satisfaction of search effect, which is discussed later in this chapter with respect to cognitive bias.


No doubt prior information guides search and influences decisions, but for most practitioners these are positive effects and are considered more important than potential pitfalls.



9.5 Interpretation Process



9.5.1 Features in Medical Images


Consider some of the characteristics of radiological features, how they vary, and how they are organized into classes that can be agreed on by observers. Medical images display structure and they reflect function. The structural information is recognized and understood because of a knowledge base that the observer has compiled from studies of anatomy and function. Recognition takes place when the observer translates image objects into the “language” of anatomy learnt in a different visual form (e.g., textbooks or models). But there are perceptual and cognitive obstacles to the translation that make medical image interpretation subject to variance. This variance is based on readers’ knowledge, experience, and biases, and two common imaging methods illustrate some of the simple obstacles that contribute to interpretation difficulties for newcomers to the task.




  1. Conventional radiography images. Take the characteristic of shape. X-ray images record features of a three-dimensional (3D) object onto a 2D receptor, and because depth cannot be represented in this process there is overlap and geometric distortion of many structures. A principle used to help recognition is to acquire more than one view of the region, typically at right angles to each other; these are termed orthogonal projections. Views from different perspectives demonstrate structures that can be recognized as a shape mentally reconstructed from the views. A powerful support to the shape-based decision is the location of the structure (Figure 9.2). Radiology uses this principle so all images of a region of interest include contextual structures or neighbors that act as frames of reference in a field of view (Herring, 2007).





    Figure 9.2 Top: Fracture is present but its location and context are uncertain because no neighboring structure is visible. This has limited value for the clinician. Bottom: Fractures are seen in context of nearby joints and allow a complete diagnostic evaluation.



    Another characteristic of objects in medical images is color. The color in radiology images is mostly black, white, and shades of gray; but these still have importance for feature recognition. The gray tone value of a feature tells the observer a great deal about its make-up and its relationship with its surroundings. This quality, known as contrast, influences the interpretation of its meaning. It is regarded with such importance that image technologies and contrast agents have continuously sought new ways of exploiting this means of demonstrating radiological features.



  2. Sectional images, such as computed tomography, magnetic resonance, and digital breast tomosynthesis images, are an effective solution to the 2D/3D problem described above. The thickness of each section is between 1 and 7 mm and so some multislice examinations have a thousand or more images. The feature recognition problem for observers in these types of exams has some similarities with conventional images but there are important differences that relate to location. Slices remove the problem of structural overlap, but they provide less information about the neighborhood relating to a feature in any single section. The observer must inspect several contiguous slices to fully appreciate 3D relations, adding to the cognitive demand of the task and the duration of the reading per case. The activity also demands much more effort from the reader in scrolling between slices and gaze tracking a selected structure through x, y, and z coordinates, as shown in Figure 9.3. It also requires a different orientation of anatomy, but the benefit is that it presents fewer perceptual difficulties because the shapes of structures are made invariant by the process.





Figure 9.3 Viewing stacked brain sections. Left: All visual fixations recorded for investigation of the stack are shown, and those associated with the currently displayed image are highlighted. Right: Three-dimensional visualization of the slices and how the gaze path moves up and down the stack, in addition to moving over an individual image.


(Reproduced from Philips (2010).)
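The contrast quality described for conventional radiographs earlier in this section can be made concrete with a small calculation. The chapter does not give a formula, so the sketch below adopts one common convention as an assumption: the difference between the gray value of a feature and that of its background, divided by the background value (often called Weber contrast). The function name and gray values are hypothetical.

```python
def weber_contrast(feature_gray, background_gray):
    """Relative difference between a feature and its immediate surroundings.

    Both arguments are mean gray values on the same display scale.
    A result of 0 means the feature is indistinguishable from its background.
    """
    if background_gray == 0:
        raise ValueError("background gray value must be non-zero")
    return (feature_gray - background_gray) / background_gray

# Hypothetical gray values for a conspicuous and a subtle feature
# seen against the same background.
conspicuous = weber_contrast(feature_gray=160, background_gray=100)
subtle = weber_contrast(feature_gray=104, background_gray=100)

print(f"conspicuous feature: {conspicuous:.2f}")
print(f"subtle feature:      {subtle:.2f}")
```

The same feature value yields a different contrast against a different background, which reflects the point that gray tone informs the observer about a feature's relationship with its surroundings and not only about the feature itself.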


9.5.2 Communicating Findings: Agreement on Meanings


When medical images are read, what is learned about the diagnostic problem is translated into a judgment or decision and passed on to the clinician requesting the test. Diagnostic errors in radiology, as in laboratory medicine, may result from faults in test requests (before image interpretation takes place), errors in interpretation, and/or errors in the clinical use of the results after the image report. Errors in radiology can be grouped as: (1) perceptual and cognitive errors related to failures in detection or interpretation, which are the main interest of this chapter; and (2) system errors where poor communication of results or failure to suggest an appropriate follow-up test are the identifying features (Pinto and Brunese, 2010). It has been pointed out by Graber et al. (2005) that diagnostic errors are often a result of an interaction between these two groups and a third component, system practices (e.g., viewing conditions, fatigue, reading workload). Together they have a serious impact on the likelihood of cognitive diagnostic errors in radiology (Lee et al., 2013), so errors of communication must feature in any treatment of cognitive failures in diagnostic imaging.


Reading medical images serves two purposes related to diagnosis:




  1. A decision is made based on the visual information. The greater the precision of the judgment through a shared taxonomy of possibilities, the greater will be the diagnostic value of the test.



  2. Diagnostic decisions articulated to clinicians can be converted into actions for treatment based on the new information. An agreed understanding in communicating findings is necessary to convey meaning and significance.


The clinician’s diagnostic decision and action following an imaging test depend on interpretation of the report, and although most doctors review the images themselves, what strongly influences their assessment is what the radiologist says. However, most radiology reports are made in imperfect conditions of certainty because medical images often contain subtle diagnostic signals that compete with noise. Ambiguities that mimic signals can add to this problem and the result is decisions that are less than clear-cut. The situation is typical of any meaningful decision making: it is a judgment made in uncertainty.


Radiological reports tend, therefore, to be verbal expressions of probability rather than absolute certainty. If there is a mismatch between the ascribed probability given by the radiologist and that from the clinician there is potential for misdiagnosis or inappropriate treatment. Such a misunderstanding could spring from the language of the report rather than the visual interpretation or understanding of the image features.


Hobby et al. (2000) investigated the reliability of communication of radiological reports and found large interobserver and intraobserver differences in the meanings of expressions used. Phrases such as “unlikely,” “probable,” “no evidence of,” “appears to be present,” and “unable to exclude” are all in common use in radiology reports and they attempt to convey the concept of doubt or confidence the radiologist holds for the visual decision. But in Hobby et al.’s study, terms like these were scored with very wide disagreement between report readers who were asked to give them probability values. Pathologists likewise often have difficulty communicating uncertainty in their reports (Bracamonte et al., 2016).


Visual analogue scales have been used to see if they reduce differences in interpretation (Bryant and Norman, 1980; Maxwell, 1978; McCormack et al., 1988; Robinson, 1989), but although numerical expressions of probability improve agreement between observers, they do not achieve complete congruence and have not been widely adopted. Figure 9.4 illustrates the interrater differences with some terms such as “probably present.” In simple experiments one can demonstrate how, even with these analogue scales, there are still wide differences between users in the interpretation of the level of probability of disease when such terms are used.
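A simple experiment of the kind described above could be analyzed along the following lines. The sketch assumes that several report readers each assign a probability of disease (0 to 100) to the same verbal expression, and it summarizes the spread for each phrase. The scores are invented purely to show the calculation; they are not data from Hobby et al. or any other cited study.

```python
from statistics import mean

# Hypothetical scores (percent probability of disease) assigned by four
# report readers to each phrase. These numbers are made up for illustration.
ratings = {
    "unlikely": [5, 10, 20, 30],
    "probable": [55, 70, 80, 90],
    "unable to exclude": [5, 25, 40, 60],
}

def summarize(scores):
    """Return the mean score and the range (max minus min) across readers."""
    return {"mean": mean(scores), "range": max(scores) - min(scores)}

summary = {phrase: summarize(scores) for phrase, scores in ratings.items()}

# The phrase with the widest range is the one readers agree on least.
widest = max(summary, key=lambda phrase: summary[phrase]["range"])

for phrase, stats in summary.items():
    print(f"{phrase!r}: mean={stats['mean']:.1f}, range={stats['range']}")
print("least agreed phrase:", widest)
```

A wide range for a phrase indicates that the same words in a report could lead different clinicians to quite different estimates of the likelihood of disease.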





Figure 9.4 Visual analogue scale.



9.5.3 Mental Models


Recognizing the relevance of objects perceived is the important primary skill in reading medical images, and can be represented by mental models or schemata. These have meanings that can be expressed verbally. In studies of cognition in the interpretation of radiological information, verbal protocols have been used to access the thought activities of observers (Lesgold et al., 1988). Readers are asked to tell what thought processes they are using, how their decisions are being formed, and what significant factors they are considering based on the image information and external prompts. The prompts are those generated by the reader in drawing on similar past experiences. Comparable methods have been used to study cognition in interpreting graphs (Ratwani et al., 2008), where processes required to extract specific information and to integrate information were examined by collecting verbal protocol and eye movement data in combination.


An approach, first taken in a medical imaging context by Nodine and Kundel (1987) and others subsequently, has used recordings of visual search without verbalization. These perceptual experiments gain a different insight into the interpretation process. They contribute to our understanding because many aspects of visual search are unconscious to the observer who cannot give valid verbal data on the activity without disrupting its pattern. But both cognitive and perceptual approaches rely on mental schemata in their frameworks and they are the model most widely used in explaining recognition of diagnostic clues in images.



9.6 Cognitive Approaches



9.6.1 Outlook


The dual-process theory of reasoning has become the dominant model for understanding cognitive processing during human decision making in practical settings (Weaver et al., 2010). It has two operating systems, which do not have anatomical locations and must be thought of as methods of operation in thinking. The fast system is termed system 1. It is intuitive, operates automatically without conscious control, and arrives at solutions without effort. It is the origin of impressions that are the main source of the deliberate choices made by system 2. System 2 is deliberative and problem solving. It is a selecting process from a set of experiences or personally held options. The observer builds a suitable new hypothesis, or activates an existing one as a possible solution, and tests it against the available information. It is the slower of the two processes but, unlike system 1, it can construct thoughts in an orderly series of steps (Kahneman, 2011).


In this section the dual process is fitted to theories of recognition, experiments in visual search, and reader performance in medical image perception.



9.6.2 Recognition: Perception and Cognition


In searching a visual scene Nodine and Mello-Thoms (2000) concluded from eye-tracking experiments that a mental schema must be generated very early in the viewing process. In a very short interval (on the order of just a few hundred milliseconds) the observer’s visual attention is captured. The following stage of attention selection uses knowledge of the visual world to prompt the creation of a preferred perception. During this stage, a goal-directed mechanism plays an important role based on the observer’s expectations or intention, and it creates a preferred perception made up of definable objects (Figure 9.5). Conspicuous, well-known abnormalities are recognized rapidly and the observer finds those quickly, often without eye movements.





Figure 9.5 The global-focal model of perception in radiology.


(After Kundel and Nodine (1983).)

As Kundel (2000) points out, the preferred perception is subject to change in response to added information about the implied meaning of the image. When the preferred perception is reached, a conscious decision can be made about the significance of the scene.


So-called flash experiments have investigated how much information can be gained from a single glance of short duration (~200 ms) and seem to confirm the operation of system 1 activity. When chest radiographs are presented in this way, experienced radiologists can recognize up to 70% of the abnormalities that are subsequently picked up in free search. This suggests a rapid development of appropriate schemata or at least access to existing ones at the global phase of visual search and detection.


Early work by Kundel and Nodine (1975) used these flash experiments to develop such a model, and Evans et al. (2016) have extended this recently to show that reliable decisions can be made at this stage. Figure 9.6 shows examples typical of those used to flash normal chest images and abnormal images for 200 ms; participants were asked to discriminate normal from abnormal. Performance is regularly reported as better than chance and in some cases responses included location information for the abnormalities.





Figure 9.6 Left, normal image typical of those used in 200-ms flash experiments. Centre, an example typical of the 200-ms flash images of a chest with pathology. High levels of discrimination between normal and abnormal are usually reported, although the precise number of lesions detected is often inaccurate. Right, the fracture site is a subtle crack in the radius bone in the wrist but the foveal fixations are limited to the circles at the bottom right of the image, where the reader is attending to the decision keys for his response. An accurate decision was made entirely with peripheral vision.


(Reproduced from Donovan (2005).)

It is remarkable that some of these decisions are based on scant information which is reported in some work as focal and in other studies as general. In any event, the global initial response from the whole retina, including peripheral vision, can provide an impression of abnormality at various levels to the expert and can be acted on. It can satisfy the decision process during which the observer uses system 1 thinking and by a “rule of thumb” or heuristic makes a judgment on the disease status of the case. A characteristic of some of these very fast decisions is that the observers find it difficult to fully explain how they happened or why they felt so confident, but this tends to support the involuntary way that system 1 works.


Donovan et al. (2005) reported examples where expert radiologists performed a simple task of fracture detection whilst being eye-tracked. On some occasions, they detected the fracture and reported it with confidence without any focal fixation on the fracture site. The example in Figure 9.6 shows the distance of the fracture from the foveal fixations which are shown to be on the decision keys on the display. The fracture was not a global feature and apparently received only peripheral attention.


It is likely that the fast-global impression also goes on to guide focal search (Kundel and Nodine, 1983). This search inspects image perturbations that, presumably, have been “flagged” at the global phase, and tests them against candidate objects for diagnosis. This is now a system 2 operation and through a deliberative process, the observer, with effort, arrives at a conclusion, as shown in Figure 9.5.


It might be thought that rapid scene recognition can be exploited in some way to improve search and accelerate expertise in learners. Recent work (Litchfield and Donovan, 2016) has shown, however, that although flash scene previews guide search and provide benefit in improving the general performance of groups with mixed expertise in radiology, expert performance was not contingent on seeing the scene preview, and scene preview impaired novice diagnostic performance. It seems the role of subsecond and peripheral perception still has a great deal to tell us in relation to feature recognition and radiology expertise.


Identifying and discriminating specific objects in medical images can be thought of as a special form of object recognition. Recognition of radiological objects is only one part of a diagnostic problem-solving activity, however, because there must also be a sense of relevance and applicability to the clinical question. But it is a key stage in the process. The observer attempts to determine the meaning behind objects perceived in the context of the medical problem that initiated the acquisition of the image (Kundel, 2006).


We recognize objects in the everyday world over a range of different scales and orientations and generally the brain’s object recognition system can cope. Exceptions to this are illustrated by puzzles that are sometimes constructed to make familiar objects difficult to recognize by presenting them in an unusual context or from a strange perspective. Such puzzles are like the images of anatomical structures presented in radiology. However, Nodine and Krupinski (1998) and Smoker et al. (1984), amongst others, have used visual puzzles to investigate any special perceptual skills or spatial abilities in radiology experts compared with lay people. Studies like these have met with less success than one might expect in demonstrating any differences and, so far, it seems the special expertise of radiologists is highly specific to their professional task.


Object recognition is a matching process of the feature of interest compared with a mental template that represents it. When knowledge regarding the object is comprehensive, comparisons or matches are made easily and recognition takes place quickly. In some cases, we can identify an object from only a general outline of its shape, but for this we probably need a deep familiarity with the item (Ellis and Young, 1996). However, Kahneman (2011) makes the point that, although we marvel at examples of expert intuition and they seem almost magical to us, everyone performs some feat of intuitive expertise many times a day. An example is our perfect and rapid recognition of anger in the first word of a telephone call. Everyday intuitive abilities are so common we regard them with less awe than those from a special area of activity.


Cognition and perception are an interactive flow of information that leads to recognition. In medical image interpretation, this is taken a stage further to the point of skilled decision making involving the implications of findings. The special skill set required for this recognition has been the main attraction for researchers in expertise development (Nodine and Mello-Thoms, 2000).



9.6.3 Recognizing Objects in Medical Images


Object recognition was analyzed most extensively by Marr (1982), who took as his starting point the premise that vision is a computational process. Representations are stored in symbolic form in the visual memory where they await retrieval given the right stimulus, as in Figure 9.7. At the schematic level the object is recognized with sufficient depth to allow connection with other schemata of known objects. It gives the opportunity to develop concepts of meaning around the object and not just its name.





Figure 9.7 A model for object recognition based on the framework from Marr (1982).


It is unlikely that all levels of recognition can be achieved with the same degree of ease. Certain levels are typical of everyday use and, because they are accessed frequently, they are recognized without conscious effort by the activity of system 1. When immediate decisions are made in a brief glance at a medical image, the meaning of the visual scene is included in a very rapid unconscious process based on heuristics. These “rules of thumb” are sometimes wrong because they use a principle of “what you see is all there is.” They do not (cannot) draw on any information other than that in the image and are very sure of themselves. But they work very well in most cases and are very economical, hence their establishment in thinking processes.


The semantic system operates at a different level where the properties of the object are worked on and recognition extends into areas of significance, relevance, and meaning. This is the type of system 2 operation that is applied to some ambiguous features in images that require time and effort in their recognition.


Kahneman (2011) suggests that heuristics do not explain all intuitive judgments. In particular, some accurate intuitions of experts may be better explained by the effects of their extended practice which provides them with a unique judgmental skill in their specialty. This approach seems to fit the speed and accuracy seen in the reading performance of expert radiologists.



9.6.4 Recognition and Radiology Performance


Levels of recognition have a clear connection with learning, expertise, and, consequently, some types of cognitive error. This might be particularly relevant in performance differences between image readers with different levels of experience. Studies of the diagnostic performance of radiologists carrying out the same task have shown how the caseload experience of individuals influences expertise (outcome). Mammography screening presents an opportunity to compare radiologists from different centers or in different countries and ask whether the number of mammograms read per radiologist (reader volume) drives both sensitivity and specificity. Esserman et al. (2002) have shown that reader volume is an important determinant of mammogram sensitivity and specificity. Higher cancer detection rates can be achieved with high specificity (low false-positive rate) in high-volume centers, suggesting the potential for optimizing mammography screening through careful attention to this effect. More recently, Suleiman et al. (2014) supported the expertise effect of volume in demonstrating that the number of cases read per year is a strong predictor of individual reader sensitivity in mammography screening even across populations with very different profiles of disease.
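The performance measures used in these reader-volume studies can be computed from a reader's decision counts. The sketch below uses the standard definitions: sensitivity is the proportion of cancers correctly recalled, specificity is the proportion of normal cases correctly passed, and the false-positive rate is one minus specificity. The function name and the counts are hypothetical and do not come from Esserman et al. or Suleiman et al.

```python
def screening_metrics(true_pos, false_neg, true_neg, false_pos):
    """Sensitivity, specificity and false-positive rate from decision counts."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_positive_rate": 1.0 - specificity,
    }

# Hypothetical annual counts for one reader: 100 cancers among
# 10,000 screening mammograms.
reader = screening_metrics(true_pos=90, false_neg=10,
                           true_neg=9405, false_pos=495)

for name, value in reader.items():
    print(f"{name}: {value:.3f}")
```

Comparing these figures between readers or centers only makes sense when the counts come from verified outcomes, which connects to the role of feedback discussed later in this section.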


But it is important that the experience is specific. It has been emphasized by Gunderman et al. (2001) that expertise in one domain in radiology does not necessarily transfer to another. It is greater relevant experience in experts that allows their knowledge to be better organized and directed to the task. A practical outcome of this is that experts have a more advanced threshold between easy and hard recognition problems and a better sense of relevance. Radiology experts have gained their state through a large volume of examples and have firmly established mental templates for normal appearances, their variations, and for a range of abnormal features. Their familiarity with a greater number of image objects has shifted more of them into the “everyday” category for recognition purposes. Expert decisions are therefore more certain and a greater number of them are likely to be reached through system 1 thinking.


Robinson (1997) argues that the importance of radiology lies in the ability of its practitioners to make valuable decisions on “stress cases.” These are the difficult cases in the relationship between degrees of abnormality, doubt, uncertainty, error, and variation, as shown in Figure 9.8. The important cognitive feature of this framework is the way it characterizes “easy” and “hard” cases.





Figure 9.8 The areas of object recognition and semantic system refer to the equivalent regions outlined in the Marr model in Figure 9.7.


(After Robinson (1997).)

The cases in A are recognized easily because they are gross examples of typical or everyday cases where the reader has a stereotypical mental model of the abnormality. In B a minor abnormality is recognized with similar confidence because of its typical and clear appearance. But in C, despite the severity of the disease feature, the observer is uncertain because of doubt over the significance of the finding. The reader is operating very much in the deliberative system 2 mode of thinking, carefully sifting his or her recognition schemata. Finally, in D the observer finds the case difficult because the disease feature mimics the reader’s stereotype of normality and, working from his or her own knowledge base, the reader is in doubt about the meaning of the appearance.


There are other factors that have a powerful effect on whether a case is considered hard or easy. We only recognize what we know, and anything that affects the caseload experience of the observer will influence the level of recognition for the feature. An example of this effect might be the prevalence of the disease and its appearance in day-to-day clinical experience. Where many examples of the feature are seen at a high level of frequency, the ease and speed with which they will be recognized will be equally high. But this must be qualified by the understanding that the necessary learning leading to correct, easy, rapid recognition depends on feedback. Feedback on performance in decision making is an important feature in developing the knowledge base for identification and recognition because without it there is no validation stage in the process of forming accurate mental models.


Gunderman et al. (2001) have given some insight into this through studies of differences between experts and novices. Simply showing learners many examples of different lesions eventually produces some results, but recognition that is both effective and relevant develops faster when the correctness of each decision, and the alternatives that could have been considered, are actively confirmed. An experienced observer who has benefited from feedback brings more to the problem of organizing the information into meaningful patterns and will suggest solutions beyond the perception of the learner.


One caveat to the argument that extensive exposure and feedback result in better radiological performance is that they may also prime the observer to “recognize” features that are not there. This may be due partly to a bias that uses a simplicity and likelihood principle (Van Der Helm, 2000), and a reader should be mindful of this possibility. Given a certain image feature that looks like a common disease sign, it will be simpler to decide to call it something that is familiar and more accessible to recall. A cognitive process in this also asks how likely it is that the feature represents disease. If recent experience has exposed the reader to a high prevalence of the disease in routine caseload, or a recent error in missing such a feature is memorable, an image that suggests the sign will be more likely, in the view of the primed decision maker, to be a genuine presentation of the disease.
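The likelihood component of this priming effect can be pictured with Bayes' rule. This is only an illustrative analogy, an assumption made here and not a model proposed by Van Der Helm (2000) or by this chapter. In the sketch below the function name and all numerical values are hypothetical: the same equivocal image feature yields a much higher estimated probability of disease when the reader's assumed prevalence (the prior) has been raised by recent experience.

```python
def posterior_disease(prior, sensitivity, false_positive_rate):
    """Probability of disease given that a suspicious feature is seen (Bayes' rule).

    prior: the reader's assumed prevalence of the disease
    sensitivity: P(feature seen | disease)
    false_positive_rate: P(feature seen | no disease)
    """
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence


# Hypothetical values: the image evidence is identical, only the prior differs.
unprimed = posterior_disease(prior=0.01, sensitivity=0.9, false_positive_rate=0.1)
primed = posterior_disease(prior=0.10, sensitivity=0.9, false_positive_rate=0.1)
print(f"unprimed reader: {unprimed:.3f}, primed reader: {primed:.3f}")
```

Under these assumed numbers the primed reader arrives at a far higher probability of disease from exactly the same feature, which is the sense in which a raised expectation can lead to “recognizing” disease that is not there.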



9.6.5 Recognition by Chunking Visual Information


Perturbations have been described (Kundel, 2000; Kundel and Nodine, 1983; Nodine and Mello-Thoms, 2000) as those parts of the image that radiologists find “odd.” They are recognized at the rapid first glance of an image and are unexpected anomalies generated from the global impression. The term perturbation reflects the difference between the “not-odd” and the “odd”: it indicates the disturbance these features create in the observer’s model of a “normal” or “not-odd” image. The level of disturbance is related to some physical characteristics of the object. The conspicuity or salience of the object includes characteristics of size, contrast, and degree of edge sharpness as well as the nature of its surroundings. A conspicuous object is easier to detect and probably influences the preattentive phase of vision by targeting attention for subsequent search (system 2 thinking). But there is less evidence to show that a conspicuous object is always recognized more accurately. This tends to confirm the importance of the role that cognition plays beyond simple detection in image interpretation.


Detection is only a starting point in the decision-making process of whether an object is significant and worth reporting. Eye-tracking studies have shown that the measured conspicuity of small lung nodules in chest radiology has a bearing on their likelihood of being fixated but not on their identification as pathology (Manning et al., 2004). Detection and scrutiny of these perturbations do not always lead to a clear decision on their meaning. This decision feature of image interpretation is a difficult one for computer aids to detection and diagnosis to access. Many unreported lesions are detected by the eye–brain system but rejected at a level of recognition, so it may be that computer aids to detection offer less significant gains in performance to many experienced observers. Computer aids to improve radiological performance may find their greatest success through improving the decision-making component of the task.


Image perturbations may be at a local level for small objects, or they could be much more global in origin, as reported in the case for a nonspecific and unidentified feature in the contralateral breast of cases with a diagnosed mammogram lesion (Evans et al., 2016). Sometimes they may be a combination, where several small perturbations form a larger pattern. Perturbations provide a pop-out source of information for an immediate, intuitive decision, or they aid the plan that guides search. The speed at which perturbations are recognized at the global impression stage suggests a gestalt, whole-problem approach. Some of these aspects of visual processing are used in art and in puzzles as conflicts and “surprise” features. Interest arises when the gestalt suggests one visual decision but the subsequent focal attention suggests something else (Figure 9.9).





Figure 9.9 A global impression gives a fast response decision of a woman’s face. But there is more to the scene, and it could be that the face is a mere coincidence of the objects. However, our facial recognition systems are so strong we find the face image most compelling. Cognitive and perceptual explanations for the puzzle are possible here, among which are the proximity of the image components, allowing them to be actively grouped by our semantic systems.


This can occur in a medical image where the perturbation captured in the pop-out phase is a large low spatial frequency object and gives the observer a convincing match to a well-known pattern for a lesion. However, during search and focal attention, less obvious, smaller features contribute to the diagnostic decision and the original “flash” response is then modified (Figure 9.10).





Figure 9.10 Radiologists will often arrive at a diagnostic decision very quickly through what is thought to be a chunking of the key features demonstrated in an image (www.bcm.edu). In this chest film there are: (1) mild cardiac enlargement with (2) pulmonary venous congestion, (3) fluid within the horizontal fissure, and (4) general lymphatic engorgement. Collectively these signs indicate congestive cardiac failure to the experienced radiologist. Fast arrival at this decision probably involves little or no attention to irrelevant features, may be associated with priors, and can be thought of as an “aha!” moment (Ramachandran, 2003).


Another aspect of the grouping phenomenon illustrated by Figure 9.11 is the way that it is not always necessary to take all the available information from an image to arrive at a conclusion. What Ramachandran (2003) has called “aha!” moments occur when relevant “chunks” of information are merged to give a conclusive, confident decision even though some details may be absent from the scene. In these, the large perturbation in the image may be incomplete but is made up of smaller components that give a strong enough global message for a decision to be made – often very quickly.





Figure 9.11 The mammogram views show a large mass lesion at the global stage of viewing. Closer focal attention reveals more information in the form of adjacent architectural distortion along with scattered suspicious microcalcifications to modify and refine this diagnosis.


When radiologists are used as test bank readers in experimental work, their performance often contains many examples of system 1 operations: fast and without effort. The total read duration for a single image, covering inspection followed by the full diagnostic decision, is typically only a few seconds. This suggests that intuition and “chunking” of information play an important part in the expert interpretation process (Manning et al., 2005).



9.6.6 What’s the Object Called?


Naming the object is important in the communication of decisions in image reading, so it should be closely allied to any cognitive theories we might have on object recognition. We need to have a concept of what the object means to us before we can retrieve its name from our lexicon. In radiology, pathological features are sometimes given names that are derived entirely from what the feature looks like. These can be highly logical: for example, spiral fractures (Figure 9.12) have a precise geometric appearance. Other names compare the feature with well-known objects (Figure 9.13).


