Alan Tonnies Moore[1]

San Francisco State University


Eric Schwitzgebel

University of California, Riverside





What do people consciously experience when they read? There has been almost no rigorous research on this question, and opinions diverge radically among both philosophers and psychologists. We describe three studies of the phenomenology of reading and its relationship to memory of textual detail and general cognitive abilities. We find three main results. First, there is substantial variability in reports about reading experience, both within and between participants. Second, reported reading experience varies with passage type: passages with dialogue prompted increased reports of inner speech, while passages with vivid visual detail prompted increased reports of visual imagery. Third, reports of visual imagery experiences, inner speech experiences, and experiences of conscious visual perception of the words on the page were at best weakly related to general cognitive abilities and memory of visual and auditory details.


Keywords: reading, inner speech, visual imagery, introspection, experimental aesthetics, experience sampling



1. Introduction

What sorts of conscious experiences do you have while reading? You are, in fact, reading at this very moment. So think, what are you experiencing right now?

Systematic studies that explicitly focus on people’s self-reported conscious experiences while reading are rare. In a way, this is surprising. Academics spend much of their lives reading. People studying the aesthetics of fiction and poetry are interested in the reading experience (Carroll, 2001; Fish, 1970; Holland, 1975; Kivy, 2008; Lamarque & Olson, 2004; Miall & Kuiken, 2002; Phelan, 2007; Robinson, 2005). However, almost all existing explicit claims about people’s conscious experiences (or phenomenology) while reading are based on unsystematic armchair introspection by the scholar in question or, in some cases, on the retrospective reports of casual readers. This has led to a bewildering array of assertions about the phenomenology of reading by psychologists, philosophers, and literary theorists.

For example, some scholars assert that, at least for them, reading a narrative, such as a story or a novel, normally involves experiences of visual imagery (e.g., Ahsen, 1984; Dennett, 1991, p. 366; Nannicelli, 2013; Wittgenstein, 1946-1948/1975, p. 44). Others express skepticism about the frequency of conscious visual imagery or its empirical relation to textual details (e.g., Berkeley, 1710/2009, Intro, Section 20; Burke, 1757/1990, p. 152; Kurby & Zacks, 2013). Similarly, some scholars assert that the phenomenology of reading normally involves inner speech, inner hearing, or some sort of voice in the head (Baars, 2003; James, 1890, p. 361; Kivy, 2008; Morin, 2009; Perrone-Bertolotti et al., 2014; Velmans, 2009), while others deny that reading must involve such a voice (Brouwers et al., 2017; Faw, 2009; Reed, 1916; Woodworth, 1906, p. 704). Scholars also disagree about whether people normally have conscious visual experience of the words on the page when they read. Julian Jaynes (1976, p. 26), for example, says that once readers are absorbed in the text, they have no visual experience at all of the words on the page, that one normally experiences only the meaning of the words and has no conscious experience of the letters as they appear on the page. Russell T. Hurlburt (Hurlburt & Schwitzgebel, 2007, p. 50) seems to agree that this is often the case, while Charles Siewert finds this “just about as obviously false a remark as one could make about visual experience” (1998, p. 249).

The experimental literature on the cognitive architecture involved in reading, while large, is mostly silent on the question of readers’ conscious experiences. Some of the best-understood cognitive processes involved in reading are those that process phonological (sound) information. While the body of research on phonological coding (e.g., Leinenger, 2014; Seidenberg & McClelland, 1989; Van Orden, Johnston, & Hale, 1988) and the phonological loop (Baddeley, 2010; Coltheart, Rastle, Perry, Langdon, & Ziegler, 2001; Frost, 1988; Lauro, Reis, Cohen, Cecchetto, & Papagno, 2010; Paap & Noel, 1991) shows that the sound of a word plays a foundational role in processing textual information, this process could be, and likely often is, unconscious. Many of the cognitive processes involved in reading are largely unconscious, such as visuospatial memory (Baddeley, 2000, 2007; Pham & Hasson, 2014) and the use of situational models in narrative comprehension (Kurby & Zacks, 2013; Zwaan, 2016; Zwaan & Radvansky, 1998). Our eyes move multiple times a second while reading, but this does not entail that we are conscious of each saccade, and similarly, attunement to the sound of the words while reading does not entail an auditory experience of any sort. Unfortunately, the experimental literature often fails to adequately highlight the important distinction between possibly non-conscious cognitive processes involving phonological information and the conscious experience of inner speech (e.g., Kurby, Magliano, & Rapp, 2009; Leinenger, 2014). The same general trend is at work in the research on the visuospatial sketchpad and situational models of narrative comprehension. This leaves us in a rather odd position: the experimental literature on reading has told us a lot about the cognitive architecture recruited for reading but very little about the conscious experiences that people have while reading.

We see two possible explanations for the broad disagreement in reports about the experience of reading. One explanation is that people have radically different types of experience while reading, and individuals or researchers tend to overgeneralize from their own case, or from one text type or one type of reading situation, to others. Another explanation is that people are often radically mistaken about even this seemingly obvious feature of their stream of experience. These explanations are not incompatible, and versions of both are endorsed by Russell T. Hurlburt, who along with his collaborators has begun some systematic work using self-reports of sampled experience while reading long passages of text (Brouwers et al., 2017; Caracciolo & Hurlburt, 2016). Hurlburt and collaborators’ primary conclusions are that people vary considerably in their experience while reading and that inner speech is much less frequent than is commonly assumed.

Hurlburt’s work depends on his time-intensive Descriptive Experience Sampling (DES) method (Hurlburt, 2011). DES involves extensive personal interviews about individual moments of sampled experience, collected using a beeper. DES interviews are conducted by expert interviewers who can, if things go well, help participants “bracket presuppositions” in order to access their “pristine” experiences. Although we believe DES is a valuable method, it has several shortcomings for the present purpose: (1) Since it is time intensive, DES studies are always limited to small samples. (2) Since DES requires expert interviewers who follow no set script, it is difficult to replicate and it can be difficult to assess the extent to which “experimenter effects”, such as interviewer bias, are influencing the results. (3) To focus exclusively on the sampled experiences, DES normally does not include other measures, such as reading comprehension measures or other types of subjective report, which might better illuminate the cognitive processes at issue. (For extensive discussion of the methodological pros and cons of Descriptive Experience Sampling, see Hurlburt & Schwitzgebel, 2007.)

Shirley A. Long, Mark Sadoski, and their collaborators have also explored introspectively-reported imagery experience while reading. Studying fifth-graders’ (age 10 to 11 years) responses to poetry, narrative, and expository writing, Long, Winograd, and Bridge (1989) found that their respondents reported visual imagery about 60% of the time when stopped at selected points in the passages, and that the levels of imagery reports were similar for the different passage types. However, their power was limited by having only 26 participants in their design. Goetz, Sadoski, Stowe, and Fetsco (1993) similarly collected introspective imagery and emotion reports from undergraduate students as they read a full-length story. They found correlations between the presence of reported imagery and emotion, and in a paragraph-by-paragraph analysis, found that 7%-25% of participants reported visual imagery for the paragraphs in question. Unfortunately, Goetz and collaborators do not report individual differences between participants, and with only 40 participants, they had limited power to detect such differences. Furthermore, by asking about imagery and emotion after almost every paragraph, Goetz and collaborators may have created experimenter demand effects toward reporting imagery after at least some paragraphs. (For similar research, see also Goetz, Sadoski, & Olivarez, 1991; Goetz, Sadoski, Olivarez, Lee, & Roberts, 1990; Krasny & Sadoski, 2008.) Sadoski and Quast (1990) also report a relationship between introspectively rated imagery and recall of passage details. There is also a literature on the effect of imagery instructions in improving reading comprehension (e.g., Cohen & Johnson, 2012; Gambrell & Bales, 1986; Johnson, Cushman, Borden, & McCune, 2013; Sadoski, 2005) and a literature on “narrative transportation” based on Green and Brock’s (2000) influential measure, which tends to find that higher “transport” reflects higher levels of reader engagement and motivation. Although these research paradigms are related to the present research question, in neither literature are introspective reports of specific imagery experiences systematically collected.

Below we present three studies of reading experience using medium-to-large samples of participants and several Likert-scaled and yes/no questions about the sampled experience. In addition to asking directly about participants’ experiences, we also include measures intended to reveal possible differences in the cognitive processes of readers who report different types of experience. In addition, we varied the types of texts to which participants were exposed, which might be expected to influence the cognitive processes employed (Kurby & Zacks, 2013; Long, Winograd, & Bridge, 1989; Nijhof & Willems, 2015) and possibly therefore also the conscious experiences of the reader. We aimed to test five hypotheses.

Hypothesis 1: Between subjects, people report very different types of experience while reading. For example, some people report frequent visual imagery while others report no visual imagery.

Hypothesis 2: Within subjects, people report variable types of experience while reading. For example, people report experiencing inner speech some of the time but not all of the time while reading.

Hypothesis 3: Different types of texts tend to evoke reports of different types of reading experience. For example, a passage describing rich visual detail will evoke more reports of visual imagery than a passage of dialogue with little explicit visual detail.

Hypothesis 4: Reports of different types of reading experience are correlated with differences in memory of the corresponding types of textual detail. For example, people reporting visual imagery will remember more visual detail.

Hypothesis 5: Reports of different types of reading experience are correlated with corresponding differences in general cognitive abilities. For example, people reporting more visual imagery will perform better on a mental folding task.

We will focus on experience modality, specifically whether participants report visual imagery, inner speech, or visual experience of the words on the page, as well as on self-reports of mind wandering.


2. Experiment 1

In this experiment, we presented a poem to participants, and we asked participants to focus on different aspects of their experience while reading the poem. We anticipated that focusing on a particular aspect of experience while reading might either enhance the rate at which that experience actually occurred while reading the passage in question or – alternatively but not incompatibly – enhance participants’ ability to detect the existence of such aspects if the experiences are subtle or difficult to detect. When participants finished reading, we asked them to report on their experience of the poem, and we tested memory of specific textual details.


2.1. Methods

2.1.1. Participants

We recruited 243 participants (148 female, mean age = 38.6, SD = 13.1) from the United States through Amazon Mechanical Turk (MTurk) for a small fee. Although MTurk is a relatively new tool for recruiting participants, data from MTurk appear to be as reliable as data obtained from more traditional methods (Buhrmester, Kwang, & Gosling, 2011; Casler, Bickel, & Hackett, 2013; Hauser & Schwarz, 2016). Participants were randomly assigned to one of four groups: inner speech (N = 57), visual imagery (N = 59), words on the page (N = 51), or avoidance of mind wandering (N = 76). All participants read the same passage and answered the same set of questions.


2.1.2. Text

Participants read a modified version of the poem “The Egg and the Machine” by Robert Frost (1927/1969), about 250 words long. The poem was separated into five stanzas. Two words were repeated twice in a row (“had had” and “was was”) for use in a word recognition task.


2.1.3. Introspective Reports

Before reading the passage, participants provided an initial set of reports about their experiences of inner speech, visual imagery, visual experience of words on the page, and mind wandering. Each prompt included examples to illustrate the relevant phenomenology. For example: “How often do you experience an inner voice when you read? Examples: you hear a voice reading in your head, you hear the characters speaking in your head”. Participants responded using seven-point Likert scales labeled “Never” (1), “Half of the Time” (4), and “Always” (7). Because there is an intuitive sense in which we always experience the words on the page when we read, we inverted the phrasing of this question, asking participants about the absence of any perceptual experience (“How often do you NOT experience the words on the page when you read? Examples: you’re so absorbed in a story that it almost seems like you’re there, your mind is filled with the ideas in the story and not the actual black letters against the white background”). We reverse-coded reports of words on the page to mirror the other reports in the experiment. After completing the passage, participants provided a final set of introspective reports on their experiences during the experiment. For example: “While reading the poem, how often did you experience inner speech? Examples: you hear a voice reading in your head, you hear the characters speaking in your mind.” Thus, readers were asked for two different reports, one (before reading) about their reading experiences in general, and one (after reading) about their experience while reading the particular passage in question. (These stimulus materials are available in the Supplementary Online Materials, as are the raw data and all other materials used in these three experiments.)
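The reverse coding of the words-on-the-page reports can be illustrated with a minimal sketch. It assumes the usual convention for a 1-7 scale, in which a response x is recoded as 8 - x; the function name and the example responses are hypothetical and are not taken from the study materials.

```python
def reverse_code(response: int, scale_min: int = 1, scale_max: int = 7) -> int:
    """Reverse-code a Likert response so that high values become low and vice versa."""
    if not scale_min <= response <= scale_max:
        raise ValueError(f"response {response} is outside {scale_min}-{scale_max}")
    return scale_min + scale_max - response


# Hypothetical answers to the inverted question
# ("How often do you NOT experience the words on the page?").
not_words_reports = [1, 4, 6, 7]

# After recoding, 7 means "always experiences the words on the page".
words_reports = [reverse_code(r) for r in not_words_reports]
print(words_reports)  # [7, 4, 2, 1]
```

Under this convention the scale midpoint (“Half of the Time”) is unchanged, so recoded reports remain directly comparable with the other three report types.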


2.1.4. Focus Instructions

After providing their initial introspective reports and before reading the poem, participants in the inner speech condition were told that “Many people say they hear inner speech when they read, such as a voice reading in their head or the characters speaking in their mind. There is a short poem on the next page. While reading the poem, focus on your experience of inner speech.” Participants in the other conditions were given similar instructions: to focus on their visual imagery, on the experience of “the actual words on the computer screen”, or on preventing their minds from wandering.


2.1.5. Memory of Textual Details

After reading the poem, participants answered a series of questions presented in random order. To test the hypothesis that participants reporting greater amounts of inner speech would find rhyme more salient and memorable, we asked participants to identify 10 rhyming pairs from the poem out of a list of 20. To test the hypothesis that reporting greater visual experience of the words on the page would influence memory of visual details of the textual presentation, we asked participants to identify the poem’s font from a list of four dissimilar fonts, to identify the two words that were repeated twice (e.g., “had had”) out of a list of nine, to remember whether 10 two-word phrases appeared at the beginning or end of a line, and to remember whether 10 lines appeared at the beginning or end of a stanza. Because the behavioral measures involved short-term memory, participants also performed a memory task for 3-digit numbers. Finally, a reading comprehension question tested for basic understanding of the text. (Participants also performed a Stroop task. However, reaction time data for the Stroop task were not properly recorded and so those data are excluded from analysis.)


2.1.6. Procedure

Participants took part in the experiment on their own computers using a web browser and the LimeSurvey platform. We collected demographic information and then solicited the initial set of introspective reports. We then randomly sorted participants into the four conditions. Reading the poem took about 1-2 minutes. After reading the poem, participants provided the final set of introspective reports, then responded to the behavioral measures. The experiment took about 15 minutes to complete.


2.2. Results

After excluding responses from 15 participants who answered the comprehension question incorrectly, we included 228 participants in the analysis. Initial introspective reports were intended primarily as training and correlated moderately well with the final introspective reports (r = .33 to .55). Figure 1 shows the distribution of the final set of introspective reports. Reports of experience differed substantially between participants, with participants spread widely across most of the 1-7 range for all three modalities. Participants also reported variable experience, with only a minority of participants reporting “always” or “never” experiencing visual imagery, inner speech, or the words on the page. Visual imagery was most commonly reported (M = 4.8, SD = 1.6), followed by words on the page (M = 4.6, SD = 1.6), inner speech (M = 4.3, SD = 2.0), and mind wandering (M = 2.9, SD = 1.6). Except for reports of visual imagery and words on the page, all differences between means were pairwise significant (paired t ≥ 2.2, p < .03). Contrary to our expectations, the focus instructions did not have any statistically detectable influence on the final introspective reports (ANOVAs [3, 224], F ≤ 1.4, p ≥ .26).  


Figure 1

Histogram of retrospective introspective reports of reading experience provided after reading a poem.


Participants who reported more inner speech experience did not perform detectably better on the questions targeting memory of phonological aspects of the text such as rhyming pairs (r = .11, p = .11), nor did participants reporting more experience of the words on the page perform detectably better on the questions targeting memory of visual aspects of the text such as font type (r = -.01, p = .94). G*Power analysis found that this study had sufficient sensitivity to detect a medium-sized effect (r = .18) at the .05 alpha level with a power of .80.
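The sensitivity figure can be approximated by hand. The sketch below is not the G*Power routine itself; it assumes the standard Fisher z approximation for a two-tailed test of a Pearson correlation, and the function name is hypothetical. With the 228 participants retained for analysis, it gives a minimum detectable correlation close to the reported r = .18.

```python
from math import sqrt, tanh
from statistics import NormalDist


def min_detectable_r(n: int, alpha: float = 0.05, power: float = 0.80) -> float:
    """Approximate smallest |r| detectable in a two-tailed test (Fisher z approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    # The Fisher-transformed correlation has standard error 1 / sqrt(n - 3).
    return tanh((z_alpha + z_power) / sqrt(n - 3))


print(round(min_detectable_r(228), 2))  # 0.18
```

Because this is an approximation, values computed this way may differ slightly in the third decimal from those produced by exact methods.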

Reports of mind wandering correlated negatively with reports of visual imagery (r = -.33, p < .001), reports of the words on the page (r = -.22, p = .001), and with our general test of memory performance (r = -.15, p = .03). We interpret a high degree of reported mind wandering as indicating low participant engagement.


2.3. Discussion

Confirming Hypotheses 1 and 2, participants reported very different experiences while reading, and most reported variable experience in the course of reading. For example, few participants reported “always” experiencing inner speech while reading, despite the widespread view among philosophers and psychologists that reading typically involves consciously experienced inner speech. The failure of the focus instructions to have any detectable effect on the reported modality of experiences might have been due to (1) participants failing to comply with the focus instruction over the 1-2 minutes of reading, (2) a weak relationship between focus on one modality and experience in that same modality, (3) failure of the retrospective reports to capture participants’ actual experience, or (4) limited statistical power. The failure to detect a relationship between reported modality of experience and memory of corresponding features of the text might be due to (1) failures of retrospective report, (2) limited statistical power, or (3) performance on the memory questions depending on cognitive factors unrelated to the factors influencing the subjective reports.


3. Experiment 2

For Experiment 2, we increased the number of participants to improve statistical power, interrupted the reading experience with a beep to solicit a more immediate report of a single sampled moment of experience, and changed the measures of participants’ memory of textual detail. We also wanted to explore the effects of passage type on modal experience (Hypothesis 3). Participants were shown either a poem with vivid rhythm and rhyme, a dialogue from a play written for the theater, or a prose passage full of dramatic dialogue or rich visual detail. Participants who read a poem were subsequently asked a question based on rhyme. We hypothesized that participants with abundant inner speech would be more likely to answer that question correctly. Participants given the dialogue or prose passages were asked either about the colors of objects in the story or about details of the visual presentation of the text on the screen. We hypothesized that participants reporting abundant visual imagery would more accurately remember color and that participants reporting frequently experiencing the words on the page would more accurately remember details of the visual presentation of the text.


3.1. Methods

3.1.1. Participants

We recruited 1,457 participants (800 female, mean age = 25.7, SD = 10.9), 864 from the Psychology Subject Pool at University of California, Riverside, and 593 who received a small payment through Amazon MTurk. All MTurk participants were from the United States and reported English as their first language. Participants were randomly sorted into one of three conditions: inner speech, visual imagery, or words on the page.


3.1.2. Texts

The texts were four tightly rhymed poems, one stage dialogue, one prose passage with dramatic dialogue, and two visually vivid descriptive passages, each with minor variations. Each poem contained one novel word, used twice (e.g., the name “Tenaisse”), the pronunciation of which was disambiguated by the rhyme scheme (e.g., “And their gallant moves precise / Sailing safely into port / Chased by beautiful Tenaisse?”). The dialogue and descriptive passages differed in the color terms used to describe central objects (e.g., “tan” vs “golden” leaves) and in the font and page format in which they were presented. There were two versions of each poem and four versions of each of the remaining passages, for a total of 24 unique passages. Participants in the inner speech condition were always assigned a poem. Participants in the other two conditions were assigned one of the other passages. All passages were approximately 500 words long.


3.1.3. Introspective Reports

In addition to soliciting general introspective reports before and after reading, as in Experiment 1, this study solicited concrete reports of experience at a specific moment in time. Participants heard a one-second 500 Hz beep through their computer speakers at a random time 30-90 seconds after the page loaded. After the beep, a new page loaded automatically and asked participants to report their experience “in the final split second before the beep”. Participants responded “Yes”, “No”, or “Maybe / Don’t Know” for inner speech, visual imagery, words on the page, and mind wandering. Thus, we collected three types of introspective reports: general reports about reading experience (collected before reading the assigned passage), a concrete “beeped” report about a particular moment of experience (collected immediately upon interrupting reading with a beep), and a general retrospective report about their experience while reading the passage (collected after having completed the entirety of the passage). We also asked participants to describe their experience in a free-response text box. We used these responses to exclude concrete reports from participants who reported that they had already finished reading at the time of the beep. We also intend to analyze these free responses as an exploratory basis for future research, but we will not further report on them in this article.
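The timing of the beep can be sketched as a random draw from the stated 30-90 second window. This is only an illustration: the text does not say which distribution was used, so a uniform distribution is assumed here, and the function name and seed are hypothetical.

```python
import random


def sample_beep_delay(rng: random.Random, earliest: float = 30.0, latest: float = 90.0) -> float:
    """Draw the delay in seconds between page load and the beep (uniform, by assumption)."""
    return rng.uniform(earliest, latest)


rng = random.Random(2024)  # fixed seed only so that the sketch is reproducible
delays = [sample_beep_delay(rng) for _ in range(5)]
print([round(d, 1) for d in delays])
```

Drawing the delay independently for each participant means that no one can anticipate the moment of interruption, which is the point of sampling a single moment at random.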


3.1.4. Memory of Textual Details

After reading the passage, participants in all three conditions answered two reading comprehension questions, and either a phonological question, questions about visual details of the story, or questions about visual details of the text on the screen.

Phonological question. Participants in the inner speech condition were asked to identify a word that rhymed with the target novel word. Pronunciation of the novel word had been disambiguated by the rhyme scheme, and the word was used twice. For example, “Tenaisse” had been disambiguated by “precise”, and the memory question asked whether, in the context of the poem, the word “Tenaisse” rhymed with “vice” [correct], “ace”, “lacy”, or “spicy”.

Questions about visual details of the story. Participants in the visual imagery condition were asked about the color of an object prominent in the visual scene but whose color was not relevant to the action. The object was always described using a moderate frequency color term, while the memory question employed a basic color term in the same semantic space (for example, the text described “golden” leaves, and participants were asked whether the leaves were “yellow” [correct], “green”, “brown”, or “red”).

Questions about visual presentation of the words on the screen. Participants in the words-on-the-page condition were asked to identify the font in which the passage was presented and the quadrant in which a particular phrase had appeared in a two-column, multi-page text.


3.1.5. Procedure

Participants took part in the experiment on their own computers using a web browser and the LimeSurvey platform. After a sound check, participants provided general reports on their experience while reading and then read one of the 24 passages, determined at random. While reading the passage, participants heard a beep at a random time in the first 30-90 seconds. This prompted a new page to load, instructing participants to provide concrete reports on their experiences. Participants then returned to the passage and finished reading. Afterwards, participants answered two comprehension questions and 1-3 memory questions depending on condition. The experiment concluded with a final set of general introspective reports. The entire study took about 15 minutes.


3.2. Results

We excluded 224 participants who answered the comprehension questions incorrectly, spent less than ten seconds reading the passage after giving the concrete report, or who indicated in their response box that they had already completed the passage at the time of the beep, leaving 1,233 participants for analysis. Exclusion rates were somewhat higher in the visually vivid descriptive passage condition (98/444 [22%]) than in the dialogue (56/507 [11%]) and poetry (70/506 [14%]) conditions. Figure 2 shows the distribution of the final set of general reports. The data are similar to Experiment 1, with visual imagery reported most often (M = 5.0, SD = 1.5), followed by inner speech (M = 4.6, SD = 1.9), words on the page (M = 4.3, SD = 1.6), then mind wandering (M = 3.5, SD = 1.7), with all differences between means pairwise significant (t ≥ 4.9, p < .001). Figure 3 shows the distribution of concrete reports. Compatible with the general report means of somewhat over 4 (“Half of the Time”), participants reported each of the types of modal experiences in somewhat more than half of the sampled moments, again with visual imagery the most frequent (visual imagery 70%, inner speech 59%, and words on the page 56%; χ2 [4] = 76.9, p < .001). Participants reported a unimodal experience in 23% of concrete reports (e.g. inner speech with no visual imagery or experience of the words on the page), while 66% reported a multi-modal experience, with 11% reporting no modal experience whatsoever. The concrete reports of mind wandering in 29% of probes are broadly consistent with Schooler et al. (2004), who found participants “caught” mind wandering while reading in 23% of probes.
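The exclusion figures can be checked with a few lines of arithmetic. The counts below are those reported in the text; the variable names and dictionary keys are hypothetical labels chosen for illustration.

```python
# Counts reported in the text: (excluded, recruited) per passage type.
exclusions = {
    "descriptive": (98, 444),
    "dialogue": (56, 507),
    "poetry": (70, 506),
}

total_excluded = sum(excluded for excluded, _ in exclusions.values())
total_recruited = sum(recruited for _, recruited in exclusions.values())

for passage_type, (excluded, recruited) in exclusions.items():
    print(f"{passage_type}: {excluded / recruited:.0%} excluded")

# Recruited, excluded, and remaining participants.
print(total_recruited, total_excluded, total_recruited - total_excluded)  # 1457 224 1233
```

The per-passage counts sum to the 1,457 recruited and 224 excluded participants, leaving the 1,233 analyzed.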


Figure 2

Histogram of retrospective introspective reports of reading experience across all conditions and passage types in Experiment 2.



Figure 3

Histogram of the concrete “beeped” introspective reports of reading experience across all conditions and passage types in Experiment 2.


Participants reported more inner speech in the dramatic dialogue passages than in the passages emphasizing visual description, and conversely more visual imagery in the descriptive passages than in the dramatic dialogue passages, though the effect sizes were small to moderate. The poetry passages had both phonological and imagistic elements and tended to be intermediate. See Table 1 for details.


Table 1

Difference in reported visual imagery and inner speech by passage type across all conditions.

Measure                    Passage type    Test Statistic
Mean visual imagery (a)                    F(2, 1230) = 4.3, p = .01, η2 = .007
% visual imagery (b)                       χ2(4) = 14.4, p = .006, Cramér’s V = .076
Mean inner speech (a)                      F(2, 1230) = 11.2, p < .001, η2 = .018
% inner speech (b)                         χ2(4) = 19.1, p = .001, Cramér’s V = .088

(a) The mean general report of the experience of visual imagery or inner speech. (b) Percentage of affirmative concrete “beeped” reports.
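The effect sizes in Table 1 can be recovered from the reported test statistics. The sketch below uses the standard formulas for eta squared in a one-way ANOVA and for Cramér’s V. It assumes that the 4 degrees of freedom of the chi-square tests correspond to a 3 x 3 table (three passage types by three response options) and that all 1,233 retained participants contributed a concrete report; both are assumptions, and the function names are hypothetical.

```python
from math import sqrt


def eta_squared(f_stat: float, df_effect: int, df_error: int) -> float:
    """Eta squared recovered from a one-way ANOVA F statistic."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)


def cramers_v(chi2: float, n: int, n_rows: int, n_cols: int) -> float:
    """Cramér's V for an n_rows x n_cols contingency table with n observations."""
    return sqrt(chi2 / (n * (min(n_rows, n_cols) - 1)))


N = 1233  # participants retained for analysis (assumed to be the chi-square N)

print(round(eta_squared(4.3, 2, 1230), 3))   # 0.007
print(round(eta_squared(11.2, 2, 1230), 3))  # 0.018
print(round(cramers_v(14.4, N, 3, 3), 3))    # 0.076 (3 x 3 table assumed)
print(round(cramers_v(19.1, N, 3, 3), 3))    # 0.088 (3 x 3 table assumed)
```

Under these assumptions the computed values match the tabled effect sizes, all of which fall in the small range.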


Differences in reported experience did not significantly correlate with memory for corresponding passage details, although there were two marginally significant trends with small effect sizes in the predicted direction. See Table 2. G*Power analyses show that this study had sufficient sensitivity to detect weak correlations (r = .13-.14, depending on condition) at the .05 alpha level with a power of .80.


Table 2

Two-tailed Pearson correlations between introspective reports and memory of textual detail across all conditions.

Behavioral Measures               Introspective Reports    Correlation (p)
Visual Story Details              Visual Imagery           .09 (.07)     .01 (.91)
Phonological Details              Inner Speech             .04 (.44)     .08 (.08)
Details of Visual Presentation    Words on the Page        .00 (.94)     -.02 (.67)



General reports of mind wandering correlated negatively with reports of modal experience (visual imagery: r = -.29, p < .001; inner speech: r = -.11, p < .001; words on the page: r = -.17, p < .001) and also correlated negatively with our measures of memory performance (visual story detail: r = -.14, p = .005; phonological questions: r = -.12, p = .01; details of visual presentation: r = -.10, p = .04). This is consistent with the results from Experiment 1.


3.3. Discussion

Confirming both Hypotheses 1 and 2, there was substantial variability in reported modal experiences, both between and within subjects. Extending the results of Experiment 1, and in accord with Hypothesis 3, we confirmed that different types of text elicited different types of introspective report. Participants reported the most visual imagery when they read texts with vivid descriptions of visual detail, and they reported the most inner speech when they read dramatic dialogues. Despite sufficient statistical power to detect even fairly small correlations, we found no relationship between reported modality of experience and performance on seemingly-related memory tasks, contrary to Hypothesis 4. However, there were in some cases statistically marginal trends in the predicted directions.


4. Experiment 3

Individual differences in types of reading experience might correlate with differences in general cognitive abilities (Hypothesis 5). Experiment 3 was aimed at testing this hypothesis, using a “mental folding” task to test visual imagery ability and a phonological interference task to test reliance on phonological information in reading. If introspective reports of inner speech reflect reliance on phonological processing while reading, then participants who report high levels of conscious inner speech might perform worse on a word memory test designed to be difficult due to the phonological similarity of the target words.

We also wanted to see if a more powerful study could reveal a relationship between reported modality of experience and memory for corresponding details. Since the visual story detail task came closest to statistical significance in Experiment 2, in Experiment 3 we improved power by presenting visual story detail questions to all participants. We also changed the final set of introspective reports so that they matched the introspective reports used at the beginning of the study. That is, we asked participants to assess what is generally true of their reading experience both before and after the study, so that we could examine the reliability of participants’ answers to this type of question.


4.1. Methods

4.1.1. Participants

A total of 595 participants (291 female, mean age = 34.2, SD = 11.2) were recruited through Amazon MTurk for a small fee. All participants were from the United States and reported English as their first language. Participants were randomly assigned one of three passages to read.


4.1.2. Texts

The passages used in this study were slightly modified versions of the three prose passages used in Experiment 2. Each was presented on a single page of text.


4.1.3. Introspective Reports

This experiment recorded general and concrete reports using the same method as in Experiment 2, with two exceptions. First, because Experiments 1 and 2 did not find even a marginally significant correlation between reports of experiencing the words on the page and memory of the visual details of the text's presentation, we did not ask for this type of report. Second, at the end of the previous experiments we asked participants to report retrospectively on their experience “while reading the passage”; in this study, however, we asked participants exactly the same introspective questions at the end as at the beginning (e.g., about how often they experience visual imagery when they read). We prefaced these final introspective reports by saying “In light of this experiment, you may have changed your opinions about your experiences while reading. It’s fine if they changed or stayed the same, just do your best to answer truthfully”.


4.1.4. Behavioral Measures

Participants answered two general comprehension questions in addition to two questions about visual story details, as in Experiment 2. The folding task consisted of 10 questions, taken from Ekstrom et al. (1976), that tested the ability to mentally fold a piece of paper. For the phonological interference task, participants were shown a series of five words for one second each, drawn at random from a list of ten words, and were then asked to recognize the list in order from the entire set of ten. After a training set that was not used in the analysis (old, deep, foul, late, safe, great, strong, thin, long, broad), participants performed the task six times, alternating between a phonologically similar set (mad, man, map, mat, max, can, cad, cap, cat, cab) and a control set of phonologically different words (pen, rig, day, bar, cow, sup, pit, hot, few, bun). The design and word sets come from the first experiment in Baddeley (1966).
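As an illustration only, the trial structure of the phonological interference task can be sketched in Python. The word sets are those listed above. Everything else is an assumption made for the sketch rather than a description of the software actually used: the function names, the choice to start with the similar set, and the scoring rule (number of words placed in the correct serial position).

```python
import random

# Word sets as listed in the text (attributed there to Baddeley, 1966).
SIMILAR = ["mad", "man", "map", "mat", "max", "can", "cad", "cap", "cat", "cab"]
CONTROL = ["pen", "rig", "day", "bar", "cow", "sup", "pit", "hot", "few", "bun"]


def make_trials(n_trials=6, list_length=5, rng=None):
    """Build alternating similar/control trials, each a random ordered draw."""
    rng = rng or random.Random()
    trials = []
    for i in range(n_trials):
        # Assumption: the similar set comes first; the text only says the sets alternate.
        if i % 2 == 0:
            condition, pool = "similar", SIMILAR
        else:
            condition, pool = "control", CONTROL
        # sample() draws without replacement and returns the words in random order.
        trials.append((condition, rng.sample(pool, list_length)))
    return trials


def score_trial(presented, response):
    """Hypothetical scoring rule: words recalled in the correct serial position."""
    return sum(p == r for p, r in zip(presented, response))


def interference_effect(scores):
    """Mean control score minus mean similar score, from (condition, score) pairs."""
    by_condition = {"similar": [], "control": []}
    for condition, score in scores:
        by_condition[condition].append(score)
    mean_control = sum(by_condition["control"]) / len(by_condition["control"])
    mean_similar = sum(by_condition["similar"]) / len(by_condition["similar"])
    return mean_control - mean_similar
```

Under this sketch, a larger positive `interference_effect` would indicate greater susceptibility to phonological similarity, which is the quantity one would then correlate with reported inner speech.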


4.1.5. Procedure

The procedure was the same as in Experiment 2 through the end of the reading passage. After completing the passage, participants answered two comprehension questions and two questions about visual story details, presented in random order, followed by a final set of general introspective reports. Finally, participants performed the mental folding task and the phonological interference task, again presented in random order. The experiment took about 20 minutes to complete.


4.2. Results

Using the same criteria as in the previous study, we excluded responses from 48 participants, leaving 547 participants in the analysis. When probed with a beep, 82% of participants reported visual imagery, 57% reported inner speech, and 15% reported mind wandering (χ2 [4] = 562, p < .001). In the final general reports, visual imagery (M = 5.4, SD = 1.4) was reported more frequently than inner speech (M = 4.9, SD = 1.7) and mind wandering (M = 3.1, SD = 1.4), with all differences pairwise significant (paired t ≥ 6.0, p < .001). These results correspond to the results in Experiments 1 and 2 but are not strictly comparable because the final reports in Experiment 3 concern reading experience in general while the final reports in Experiments 1 and 2 concern experience while reading the presented passage. Of the concrete reports, 42% described a unimodal experience (e.g., inner speech with no visual imagery), 49% described a multimodal experience, and 9% described no modal experience whatsoever.
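The unimodal, multimodal, and no-modality breakdown can be illustrated with a short Python sketch. It assumes that the two modalities counted in Experiment 3 are visual imagery and inner speech, and that a "don't know" answer is treated like "no"; both are assumptions of the sketch, and the function names are hypothetical.

```python
from collections import Counter


def classify_report(visual_imagery, inner_speech):
    """Label one concrete report by how many modalities were answered "yes"."""
    # Assumption: a "don't know" answer is treated the same as "no".
    n_yes = (visual_imagery == "yes") + (inner_speech == "yes")
    return ("none", "unimodal", "multimodal")[n_yes]


def modality_breakdown(reports):
    """Return the percentage of (visual_imagery, inner_speech) reports per category."""
    counts = Counter(classify_report(vi, sp) for vi, sp in reports)
    return {label: 100 * counts[label] / len(reports)
            for label in ("unimodal", "multimodal", "none")}
```

For example, `modality_breakdown([("yes", "yes"), ("yes", "no")])` splits two hypothetical reports evenly between the multimodal and unimodal categories.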

Table 3 shows the results of two-tailed Pearson correlations between the three introspective reports recorded in this experiment. Because we solicited the same general reports at the beginning and end of the experiment, the correlations between initial and final general reports serve as a measure of test-retest reliability. We believe that these correlations (r = .70 to .87) fall within an acceptable range. However, because the interval between reports was less than 15 minutes, a longer test-retest delay might show substantially lower correlations. We did not find that participants who reported concrete experience in one modality tended to shift their final general reports to better match their concrete reports. We measured this by comparing their concrete reports in each modality (coding them as -1 for no, 0 for don’t know, and +1 for yes) with the difference between their final general report (on the 1-7 scale) and their initial general report (on the same scale, so that for example a shift from 1 “Never” to 4 “Half of the Time” would be +3). If the concrete experience report influences the final general report, then the concrete reports should correlate positively with the difference between the final general reports and the initial general reports. However, we did not find a statistically significant correlation for either modality (inner speech r = .03, p = .43; visual imagery r = -.06, p = .15).
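The shift analysis described in this paragraph can be sketched in Python as follows. The coding scheme (-1, 0, +1) and the final-minus-initial difference come from the text; the function names are hypothetical, and the Pearson correlation is computed by hand so that the sketch has no dependencies. It is not the analysis script used in the study.

```python
from math import sqrt

# Coding of concrete (beeped) reports, as described in the text.
CONCRETE_CODES = {"no": -1, "don't know": 0, "yes": 1}


def pearson_r(xs, ys):
    """Plain Pearson correlation; assumes both lists have nonzero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def shift_scores(initial, final):
    """Final minus initial general report on the 1-7 scale (1 -> 4 gives +3)."""
    return [f - i for i, f in zip(initial, final)]


def report_shift_correlation(concrete, initial, final):
    """Correlate coded concrete reports with the shift in general reports."""
    coded = [CONCRETE_CODES[c] for c in concrete]
    return pearson_r(coded, shift_scores(initial, final))
```

A clearly positive value from `report_shift_correlation` would suggest that the sampled experience pulled the final general report toward it; the values reported above (r = .03 and r = -.06) show no such pattern.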


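Table 3

Pearson correlations between introspective reports in Experiment 3.

| | Initial General: Inner Speech | Initial General: Visual Imagery | Initial General: Mind Wander | Final General: Inner Speech | Final General: Visual Imagery | Final General: Mind Wander | Concrete: Inner Speech | Concrete: Visual Imagery | Concrete: Mind Wander |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Initial General: Inner Speech | | | | | | | | | |
| Initial General: Visual Imagery | .46 (<.001) | | | | | | | | |
| Initial General: Mind Wander | -.06 (.20) | -.03 (.54) | | | | | | | |
| Final General: Inner Speech | **.87 (<.001)** | .46 (<.001) | -.05 (.30) | | | | | | |
| Final General: Visual Imagery | .43 (<.001) | **.77 (<.001)** | -.05 (.22) | .43 (<.001) | | | | | |
| Final General: Mind Wander | .01 (.83) | -.02 (.66) | **.70 (<.001)** | .00 (.94) | .02 (.61) | | | | |
| Concrete: Inner Speech | **.55 (<.001)** | .13 (.002) | -.08 (.05) | **.57 (<.001)** | .18 (<.001) | -.05 (.23) | | | |
| Concrete: Visual Imagery | .11 (.01) | **.40 (<.001)** | -.13 (.003) | .17 (<.001) | **.36 (<.001)** | -.12 (.007) | .10 (.02) | | |
| Concrete: Mind Wander | -.03 (.46) | -.04 (.38) | **.31 (<.001)** | -.09 (.03) | -.03 (.42) | **.31 (<.001)** | -.14 (.001) | -.30 (<.001) | |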

Note. Statistical significance of the correlation is in parentheses. Corresponding modal reports are in bold. N = 547.


Table 4 shows the results from a planned series of two-tailed Pearson correlations between introspective reports and behavioral measures. Because this experiment did not include multiple conditions, statistical power was higher than in the previous study despite the lower total number of participants. While Experiment 2 found a marginally significant correlation between memory of visual story detail and general reports of visual imagery, this study found a significant correlation between memory of visual detail and concrete reports of imagery. Contrary to Hypothesis 5, we did not find significant relationships in the predicted directions between introspective reports and either the folding task or the phonological interference task. In fact, we found a small but statistically significant correlation in the unexpected direction between mental folding and reported visual imagery, a result for which we have no interpretation. This study had sufficient sensitivity to detect weak correlations (r = .12) at the .05 alpha level with a power of .80.
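The sensitivity figure can be checked approximately with the Fisher z transformation. The sketch below is not the exact procedure that G*Power implements; it is a standard large-sample approximation, and the function name is hypothetical. With N = 547, a two-tailed alpha of .05, and power of .80, it yields a minimal detectable correlation close to the r = .12 reported above.

```python
from math import sqrt, tanh
from statistics import NormalDist


def minimal_detectable_r(n, alpha=0.05, power=0.80):
    """Approximate smallest |r| detectable with the given power (two-tailed test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value, about 1.96
    z_power = NormalDist().inv_cdf(power)          # about 0.84 for power = .80
    # Fisher z of r is approximately normal with standard error 1 / sqrt(n - 3).
    return tanh((z_alpha + z_power) / sqrt(n - 3))


print(round(minimal_detectable_r(547), 2))
```

Because the standard error shrinks with the square root of the sample size, this approximation also shows why analyses run on smaller subsamples can only detect somewhat larger correlations.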


Table 4

Two-tailed Pearson correlations between introspective reports and behavioral measures in Experiment 3.

| Behavioral Measures | Introspective Reports | Correlation (p), General Reports | Correlation (p), Concrete Reports |
| --- | --- | --- | --- |
| Visual Detail Questions | Visual Imagery | .07 (.11) | .10 (.03) |
| Folding Task | Visual Imagery | -.09 (.04) | .05 (.24) |
| Phonological Similarity | Inner Speech | -.03 (.53) | -.01 (.92) |


4.3. Discussion

Experiment 3 confirms the high levels of variability in introspective report between and within subjects, as also found in Experiments 1 and 2. We also found reasonable test-retest reliability for our general introspective measures, and we found evidence in support of a weak correlation between reported visual imagery experience and memory for visual story detail. However, we found no evidence of a relationship between introspective reports and tests of mental folding ability or susceptibility to phonological interference.


5. General Discussion

Across all three experiments we found consistent evidence supporting four broad conclusions:

Confirming Hypothesis 1, between subjects, people report very different types of experience while reading. Some people report always experiencing visual imagery while reading, while others report never doing so. Some people report always experiencing inner speech while reading, while others report never doing so. Some people report always experiencing the words on the page, while others report never doing so. Similarly, participants differed in their concrete reports of single individually sampled experiences.

Confirming Hypothesis 2, within subjects, people report variable types of experience while reading. The majority of participants avoided the end points of the Likert scale for each of the three types of modal experience. That is, only a minority of participants reported either “always” or “never” experiencing visual imagery, inner speech, or the words on the page while reading.

Confirming Hypothesis 3, in Experiment 2 we found evidence that text type influenced reported reading experience. Inner speech was more commonly reported in dramatic dialogues than in passages with vivid visual detail, and visual imagery was more commonly reported in passages with vivid visual detail than in dramatic dialogues.

Finally, although we did not anticipate this result, in all three experiments, visual imagery was reported more commonly than was inner speech for most of the types of texts we used. Even in dramatic dialogue passages, inner speech was only moderately more frequently reported than visual imagery.

We interpret these four results as broadly confirming findings by Hurlburt and colleagues, who report highly variable reading experience both within and between subjects and whose participants report inner speech in only a minority of samples (Brouwers et al., 2017; Caracciolo & Hurlburt, 2016). However, our methodology is more readily replicable than Hurlburt’s, and we have many more participants. The result regarding the prevalence of inner speech is especially important because many researchers assume that inner speech is normally present as part of the conscious phenomenology of reading experience (e.g., Baars, 2003; Filik & Barber, 2011; Kivy, 2008; Morin, 2009; Rayner et al., 2012; Velmans, 2009). Brouwers and colleagues, in discussing some of our preliminary data (available in Moore, 2016; Moore & Schwitzgebel, 2013), raise the possibility that our participants overreport inner speech due to failing to sufficiently “bracket the presupposition” that inner speech must be present. We do not necessarily reject this possibility.

The validity of these measures of reading experience is partly confirmed. In Experiment 3, we found evidence of test-retest reliability over an interval of 15 minutes. In Experiments 2 and 3, we also found the expected match between retrospective general reports and the percentages of reported modal experience in individually sampled (beeped) concrete moments of experience. For example, participants’ mean retrospective response to how often they experienced visual imagery in Experiment 2 was 5.0 on a scale where 4 was marked “Half of the Time” and 7 was marked “Always”. Broadly in accord with that result, participants reported visual imagery in 70% of sampled moments.
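The comparison in the example above can be made explicit with a small calculation. The sketch assumes that the 1-7 scale maps linearly onto percentage of time, with 1 ("Never") as 0%, 4 ("Half of the Time") as 50%, and 7 ("Always") as 100%. This linear mapping is an illustrative assumption and not part of the reported analysis; the function name is hypothetical.

```python
def likert_to_percent(score, low=1, high=7):
    """Map a 1-7 frequency rating onto 0-100%, assuming a linear scale."""
    return 100 * (score - low) / (high - low)


# Mean retrospective visual imagery rating in Experiment 2 (from the text).
print(round(likert_to_percent(5.0), 1))
```

On this assumption, a mean rating of 5.0 corresponds to roughly two thirds of the time, which is broadly in line with the 70% of sampled moments in which participants reported visual imagery.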

Contrary to Hypothesis 4, we found little evidence of a substantial relationship between modal introspective reports (visual imagery, inner speech, or visual experience of the words on the page) and memory of seemingly related textual details (of visual story detail, of phonological information such as the pronunciation of words disambiguated by rhyme scheme, or of details of the visual presentation of the text such as font). Despite sufficient power to detect small-to-medium effect sizes, across the three experiments we found only one statistically significant association in the predicted direction (concrete “beeped” reports of visual imagery and memory of visual story detail in Experiment 3). The relationship between reported conscious experience while reading and memory of seemingly related textual detail appears small to nonexistent.

Contrary to Hypothesis 5, in Experiment 3 we found no evidence of a positive relationship between reports of visual imagery and performance on a mental folding task, and we found no evidence of a relationship between reports of inner speech and phonological interference in a later memory task.

This lack of relationship is not entirely surprising. As reviewed in Schwitzgebel (2011), there have generally been weak and unsystematic relationships between subjective measures of visual imagery such as the VVIQ (Marks, 1973) and performance on cognitive tasks widely thought to be facilitated by visual imagery. Similarly, despite continuing popularity in some educational circles, there is little evidence of a relationship between students’ preferred “learning styles” (e.g., visual vs. auditory) and educational performance (Felder & Spurlin, 2005; Pashler et al., 2008). Our results fit this general pattern.

We see two broad interpretative possibilities. One possibility, defended at length in Schwitzgebel (2011), is that introspective reports of this sort do not accurately reflect people’s actual streams of conscious experience. People are simply bad introspectors. Another possibility, advocated for example by Paivio (1986), is that conscious experiences of this sort are being accurately reported by participants but have little relationship to basic cognitive functions such as memory, mental rotation, or phonological processing. However, even if that is the case, differences in experience might still be personally or aesthetically important. We remain neutral between these two interpretations. To retain this neutrality, we have framed our hypotheses and conclusions in terms of participants’ reports rather than in terms of participants’ actual experiences.

Reader response theory and some other approaches to literary criticism and aesthetics emphasize the importance of the reader’s experience while encountering a text (Fish, 1970; Holland, 1975; Iser, 1976/1978; Kivy, 2008; Robinson, 2005; Rosenblatt, 1978). Systematic experience sampling during aesthetic experiences such as reading, watching films, listening to music, or witnessing live performances on stage has great potential as a research methodology in empirical aesthetics, especially if the validity of the reports can be confirmed. We predict that experience sampling will soon become a common method for studying these types of phenomena. We hope that the present research lays some of the groundwork for this nascent field of study.



This work was supported by an Academic Senate grant from the University of California at Riverside.




1.      Ahsen, A. (1984). Reading of image in psychology and literary text. Journal of Mental Imagery, 8(3), 1-32.

2.      Baars, B. J. (2003). How brain reveals mind. Journal of Consciousness Studies, 10(9-10), 100-114.

3.      Baddeley, A. D. (1966). The influence of acoustic and semantic similarity on long-term memory for word sequences. Quarterly Journal of Experimental Psychology, 18(4), 302-309.

4.      Baddeley, A. D. (2000). The episodic buffer: A new component of working memory? Trends in Cognitive Sciences, 4(11), 417-423.

5.      Baddeley, A. D. (2007). Working memory, thought, and action. Oxford University Press.

6.      Baddeley, A. D. (2010). Working memory. Current Biology, 20(4), R136-R140.

7.      Berkeley, G. (1710/1965). A treatise concerning the principles of human knowledge. In C. Turbayne (Ed.), Principles, Dialogues, and Philosophical Correspondence. Macmillan.

8.      Buhrmester, M., Kwang, T., & Gosling, S. D. (2011). Amazon's Mechanical Turk: A new source of inexpensive, yet high-quality, data? Perspectives on Psychological Science, 6(1), 3-5.

9.      Burke, E. (1757/1990). A philosophical enquiry into the origin of our ideas of the sublime and beautiful. Oxford: Oxford University Press.

10.  Brouwers, V. P., Heavey, C. L., Lapping-Carr, L., Moynihan, S., Kelsey, J., & Hurlburt, R. T. (2017). Silent reading is not silent speaking of the text: An investigation of pristine inner experience while reading. Manuscript.

11.  Caracciolo, M., & Hurlburt, R. T. (2016). A passion for specificity. The Ohio State University Press.

12.  Carroll, N. (2001). Beyond aesthetics: Philosophical essays. Cambridge University Press.

13.  Casler, K., Bickel, L., & Hackett, E. (2013). Separate but equal? A comparison of participants and data gathered via Amazon’s MTurk, social media, and face-to-face behavioral testing. Computers in Human Behavior, 29(6), 2156-2160.

14.  Cohen, M. T., & Johnson, H. L. (2011). Improving the acquisition of novel vocabulary through the use of imagery interventions. Early Childhood Education Journal, 38(5), 357-366.

15.  Coltheart, M., Rastle, K., Perry, C., Langdon, R., & Ziegler, J. (2001). DRC: A dual route cascaded model of visual word recognition and reading aloud. Psychological Review, 108, 204–256.

16.  Dennett, D. C. (1991). Consciousness explained. USA: Back Bay Books.

17.  Ekstrom, R. B., French, J. W., Harman, H. H., & Dermen, D. (1976). Kit of factor-referenced cognitive tests. Educational Testing Service.

18.  Faw, B. (2009). Conflicting intuitions may be based on different abilities: Evidence from mental imaging research. Journal of Consciousness Studies, 16(4), 45-68.

19.  Felder, R. M., & Spurlin, J. (2005). Applications, reliability, and validity of the index of learning styles. International Journal of Engineering Education, 21(1), 103-112.

20.  Filik, R., & Barber, E. (2011). Inner speech during silent reading reflects the reader's regional accent. PLoS One, 6(10).

21.  Fish, S. (1970). Literature in the reader: Affective stylistics. New Literary History, 2(1), 123-162.

22.  Frost, R. L. (1927/1969). The egg and the machine. In The poetry of Robert Frost (269). Henry Holt and Company.

23.  Frost, R. (1998). Toward a strong phonological theory of visual word recognition: True issues and false trails. Psychological Bulletin, 123(1), 71.

24.  Gambrell, L. B., & Bales, R. J. (1986). Mental imagery and the comprehension-monitoring performance of fourth-and fifth-grade poor readers. Reading Research Quarterly, 454-464.

25.  Goetz, E. T., Sadoski, M., & Olivarez, Jr., A. (1991). Getting a reading on reader response: Relationships between imagery, affect, and importance ratings, recall, and imagery reports. Reading Psychology: An International Quarterly, 12(1), 13-26.

26.  Goetz, E. T., Sadoski, M., Stowe, M. L., Fetsco, T. G., & Kemp, S. G. (1993). Imagery and emotional response in reading literary text: Quantitative and qualitative analyses. Poetics, 22(1-2), 35-49.

27.  Green, M. C., & Brock, T. C. (2000). The role of transportation in the persuasiveness of public narratives. Journal of Personality and Social Psychology, 79(5), 701.

28.  Hauser, D. J., & Schwarz, N. (2016). Attentive Turkers: MTurk participants perform better on online attention checks than do subject pool participants. Behavior Research Methods, 48(1), 400-407.

29.  Holland, N. (1975). 5 readers reading. Yale University Press.

30.  Hurlburt, R. (2011). Investigating pristine inner experience: Moments of truth. Cambridge University Press.

31.  Hurlburt, R.T. & E. Schwitzgebel (2007). Describing inner experience: Proponent meets skeptic. MIT Press.

32.  Iser, W. (1976/1978). The act of reading: A theory of aesthetic response. Johns Hopkins University Press.

33.  James, W. (1890). The principles of psychology. New York: Henry Holt.

34.  Jaynes, J. (1976/2000). The origin of consciousness and the breakdown of the bicameral mind. Boston: Houghton Mifflin.

35.  Johnson, D. R., Cushman, G. K., Borden, L. A., & McCune, M. S. (2013). Potentiating empathic growth: Generating imagery while reading fiction increases empathy and prosocial behavior. Psychology of Aesthetics, Creativity, and the Arts, 7(3), 306.

36.  Kivy, P. (2008). The performance of reading: An essay in the philosophy of literature. John Wiley & Sons.

37.  Krasny, K. A., & Sadoski, M. (2008). Mental imagery and affect in English/French bilingual readers: A cross-linguistic perspective. Canadian Modern Language Review, 64(3), 399-428.

38.  Kurby, C. A., Magliano, J. P., & Rapp, D. N. (2009). Those voices in your head: Activation of auditory images during reading. Cognition, 112, 457–461.

39.  Kurby, C. A., & Zacks, J. M. (2013). The activation of modality-specific representations during discourse processing. Brain and Language, 126(3), 338-349.

40.  Lamarque, P., & Olsen, S. H. (2004). The philosophy of literature: Pleasure restored. The Blackwell guide to aesthetics, 195-214.

41.  Lauro, L. J. R., Reis, J., Cohen, L. G., Cecchetto, C., & Papagno, C. (2010). A case for the involvement of phonological loop in sentence comprehension. Neuropsychologia, 48(14), 4003-4011.

42.  Leinenger, M. (2014). Phonological coding during reading. Psychological Bulletin, 140(6), 1534.

43.  Long, S. A., Winograd, P. N., & Bridge, C. A. (1989). The effects of reader and text characteristics on imagery reported during and after reading. Reading Research Quarterly, 24, 353-372.

44.  Marks, D. F. (1973). Visual imagery differences in the recall of pictures. British Journal of Psychology, 64, 17-24.

45.  Miall, D. S., & Kuiken, D. (2002). A feeling for fiction: Becoming what we behold. Poetics, 30(4), 221-241.

46.  Moore, A.T., (2016). The experience of reading. Unpublished doctoral dissertation. Department of Philosophy, University of California at Riverside.

47.  Moore, A.T., & Schwitzgebel, E. (2013). The experience of reading: Imagery, inner speech, and seeing the words on the page. Blog post at The Splintered Mind (Aug 28, 2013). URL:

48.  Morin, A. (2009). Inner speech and consciousness. In W. P. Banks (Ed.), Encyclopedia of consciousness, vol. 1 (389-402). Oxford: Elsevier.

49.  Nannicelli, T. (2013). A philosophy of the screenplay. Routledge.

50.  Nijhof, A. D., & Willems, R. M. (2015). Simulating fiction: Individual differences in literature comprehension revealed with fMRI. PLoS One, 10(2): e0116492.

51.  Oppenheim, G. M., & Dell, G. S. (2008). Inner speech slips exhibit lexical bias, but not the phonemic similarity effect. Cognition, 106(1), 528-537.

52.  Paap, K. R., & Noel, R. W. (1991). Dual-route models of print to sound: Still a good horse race. Psychological Research, 53(1), 13-24.

53.  Pashler, H., McDaniel, M., Rohrer, D., & Bjork, R. (2008). Learning styles concepts and evidence. Psychological Science in the Public Interest, 9(3), 105-119.

54.  Paivio, A. (1986). Mental representations: A dual coding approach. Oxford, England: Oxford University Press.

55.  Perrone-Bertolotti, M., Rapin, L., Lachaux, J. P., Baciu, M., & Loevenbruck, H. (2014). What is that little voice inside my head? Inner speech phenomenology, its role in cognitive performance, and its relation to self-monitoring. Behavioural Brain Research, 261, 220-239.

56.  Pham, A. V., & Hasson, R. M. (2014). Verbal and visuospatial working memory as predictors of children's reading ability. Archives of Clinical Neuropsychology, 29(5), 467-477.

57.  Phelan, J. (2007). Experiencing fiction: Judgments, progressions, and the rhetorical theory of narrative. The Ohio State University Press.

58.  Rayner, K., Pollatsek, A., Ashby, J., & Clifton Jr, C. (2012). Psychology of reading. Psychology Press.

59.  Reed, H. B. (1916). The existence and function of inner speech in thought processes. Journal of Experimental Psychology, 1, 365-392.

60.  Robinson, J. (2005). Deeper than reason: Emotion and its role in literature, music, and art. Oxford University Press on Demand.

61.  Rosenblatt, L. M. (1978). The reader, the text, the poem: The transactional theory of the literary work. Southern Illinois University Press.

62.  Sadoski, M. (2005). A dual coding view of vocabulary learning. Reading & Writing Quarterly, 21(3), 221-238.

63.  Sadoski, M., Goetz, E. T., Olivarez Jr, A., Lee, S., & Roberts, N. M. (1990). Imagination in story reading: The role of imagery, verbal recall, story analysis, and processing levels. Journal of Reading Behavior, 22(1), 55-70.

64.  Sadoski, M., & Quast, Z. (1990). Reader response and long-term recall for journalistic text: The roles of imagery, affect, and importance. Reading Research Quarterly, 256-272.

65.  Schooler, J. W., Reichle, E. D., & Halpern, D. V. (2004). Zoning out while reading: Evidence for dissociations between experience and metaconsciousness. In D. T. Levin (Ed.), Thinking and seeing: Visual metacognition in adults and children (203-226). Massachusetts Institute of Technology.

66.  Schwitzgebel, E. (2011). Perplexities of consciousness. MIT Press.

67.  Seidenberg, M. S., & McClelland, J. L. (1989). A distributed, developmental model of word recognition and naming. Psychological Review, 96(4), 523.

68.  Siewert, C. (1998). The significance of consciousness. Princeton: Princeton University Press.

69.  Van Orden, G. C., Johnston, J. C., & Hale, B. L. (1988). Word identification in reading proceeds from spelling to sound to meaning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14(3), 371.

70.  Velmans, M. (2009). Understanding consciousness, 2nd edition. Routledge.

71.  Wittgenstein, L. (1946-48/1975). Zettel. Translated by G. E. M. Anscombe. Edited by G. E. M. Anscombe and G. H. von Wright. Berkeley and Los Angeles: University of California Press.

72.  Woodworth, R. S. (1906). Imageless thought. The Journal of Philosophy, Psychology and Scientific Methods, 3(26), 701-708.

73.  Zwaan, R. A. (2016). Situation models, mental simulations, and abstract concepts in discourse comprehension. Psychonomic Bulletin & Review, 23(4), 1028-1034.

74.  Zwaan, R. A. & Radvansky, G. A. (1998). Situation models in language comprehension and memory. Psychological Bulletin, 123(2), 162.


[1] Corresponding author: Alan Tonnies Moore, Philosophy Department, Humanities 388, 1600 Holloway Avenue, San Francisco, CA 94132. Email: