Roll ‘em!: The Effects of Picture Motion on Emotional Responses


Benjamin H. Detenber, Robert F. Simons & Gary G. Bennett, Jr.

University of Delaware




An experiment investigated the effects of picture motion on individuals’ emotional reactions to images. Subjective measures (self-reports) and physiological data (skin conductance and heart rate) were obtained to provide convergent data on affective responses. Results indicate that picture motion significantly increased arousal, particularly when the image was already arousing. This finding was supported by both the skin conductance and the self-report data. Picture motion also tended to prompt more heart-rate deceleration, most likely reflecting a greater allocation of attention to the more arousing images. In this study, the influence of picture motion on affective valence was evident only in the self-report measures – positive images were experienced as more positive and negative images as more negative when the image contained motion. Implications of the results and suggestions for future research are discussed.


The distinction between form and content has always been central to media studies. While the bulk of media-effects research indicates that particular message content can and does influence people’s cognitive and emotional responses, evidence exists that indicates non-content, or structural, attributes of film and television messages also affect psychological responses. For example, recent studies have shown that screen size, viewing distance, and whether the messages are seen in black and white or in color, all can influence message processing and evaluations of the messages as well as the viewing experience (Detenber & Reeves, 1996; Lombard, 1995; Sherman & Dominick, 1988). The role that these presentation attributes play in modifying individuals’ responses to media messages may be small compared to content-driven effects, but they are nonetheless important to a complete understanding of how media effects occur.

As media technologies advance, new opportunities to experience pictorial presentations in a variety of contexts and forms arise. Home theaters are introducing larger and larger screens into our homes (Reeves, Detenber, & Steuer, 1993). High definition television (HDTV) will change both the resolution and aspect ratio of our current television system. In addition to the traditional cinema, movie screens now appear in specialized venues like amusement park rides and IMAX theaters. The world wide web makes a vast number of images, both still and moving, available to anybody with a computer and web browsing software. These technologies offer not only new ways to present images and experience them, but the possibility for fundamentally altering the effects they engender.

One attribute of media presentations that seems to have a significant impact on individual responses is whether images are moving or still. Motion constitutes a fundamental attribute of the physical world, and our brains have adapted to this fact with specialized nerve cells that detect and process motion (Goldstein, 1989; Movshon, 1990). Research on neonates indicates that motion perception is an innate ability and essential to understanding the physical world (Ball & Tronick, 1971; Barten, Birns, & Ronch, 1971). Motion also figures prominently in the world of media, for it is a defining characteristic of film, video, and new communication technologies such as multimedia. Film theorists contend that motion is highly expressive and can evoke strong emotional responses in viewers (Arnheim, 1958/1983; Giannetti, 1976). Therefore, it is not surprising that instruction in film and video production emphasizes the importance of incorporating appropriate motion into messages. In the world of new media, motion is considered highly desirable, and substantial computational resources are allocated to make images move.

Despite the widespread assumption that motion is important to media messages, very few studies have investigated the psychological effects of motion in media presentations. In general, these studies show that motion does affect cognitive processing and viewer responses in a variety of ways. For example, Kipper (1986) demonstrated that memory for the physical properties of a filmed scene (the layout) improves when the scene is shot by a moving rather than fixed camera. Another study has shown that motion on screen is associated with higher levels of cortical arousal in viewers, as measured by the alpha frequency of the electroencephalogram (EEG; Reeves, Thorson, Rothschild, McDonald, Hirsch, & Goldstein, 1985). Reeves et al. (1985) propose that the increase in cortical arousal, or attention, is an automatic response people have to certain types of motion in general, and to similar types of motion in television presentations, too. In terms of emotional responses, a recent study suggests that image motion influences self-reports of emotional arousal, but not hedonic valence (Detenber & Reeves, 1996). Surprisingly, the still versions of pictures elicited judgments of greater arousal than did the moving versions of the same pictures. The authors offered a tentative explanation for their findings based on cognitive elaboration, but cautioned that the unexpected nature of the results required that the issue receive further investigation (Detenber & Reeves, 1996).

Emotion and Psychophysiology

The study of emotion has received increased attention in recent years from communication scholars (Dillard & Wilson, 1993). To some degree, this trend reflects a recognition of the significant role that emotion, or affect, plays in everyday experience and social life. As for mass communication effects, emotions play a critical role in persuasion as well as entertainment. They are also associated with various dysfunctions (e.g., the cultivation of fearfulness in heavy television viewers). The importance of affective responses to media in a variety of research domains, together with evidence suggesting that picture motion can influence emotional responses, prompted the use of emotion as the dependent variable in this experiment.

The study described here adopts a dimensional view of emotion (Lang, 1995; Osgood, Suci, & Tannenbaum, 1957; Russell & Mehrabian, 1977). In general, dimensional approaches posit two or three dimensions that underlie all emotions. The two most commonly cited dimensions are arousal and hedonic valence, and the third, less frequently used dimension is dominance. Typically, emotion researchers characterize the valence dimension as a continuous range of affective response extending from pleasant or positive valence at one pole, to unpleasant or negative valence at the other. The dimension of autonomic arousal is characterized by a continuous response ranging from energized, excited, and alert to calm, drowsy, or peaceful. These two dimensions, valence and arousal, account for most of the independent variance in emotional responses (Greenwald, Cook, & Lang, 1989), and are the only ones used in this study.

Emotion can be manifested subjectively, physiologically, and behaviorally, and can, therefore, be measured in a number of ways. In the experiment presented here, we focused on subjective and physiological manifestations. Self-reported emotion experience was assessed using the Self-Assessment Manikin (SAM; Lang, 1980). This measure, a form of semantic differential scale, requires subjects to introspect and consciously report on their feelings of arousal and hedonic valence. To assess emotion’s physiological sequelae, we measured both skin conductance and heart rate. These measures were selected because each has shown a specific relationship to one of the two primary emotion dimensions – skin conductance to arousal and heart rate to hedonic valence. In the present context, heart rate acceleration occurs in response to pleasant stimuli, while unpleasant stimuli are accompanied by heart-rate slowing (Fitzgibbons & Simons, 1992; Greenwald et al., 1989; Simons & Fitzgibbons, 1994). Likewise, as people experience arousal (e.g., from a threatening or erotic stimulus) their sweat glands become active and skin conductance responses (SCR) become larger and more frequent (Hopkins & Fletcher, 1994). Each of these measures yields information about a different aspect of emotional experience, but in general, affective judgments and physiological measures are positively correlated (Greenwald et al., 1989; Lang, Greenwald, Bradley, & Hamm, 1993). Though the self-report and physiological measures are expected to converge, the physiological measures provide an added benefit – they are less sensitive to any demand characteristics that might be operative during the experiment and, therefore, are less susceptible to impression management.

The present experiment was designed to explore the relationship between image motion and people’s emotional responses to pictures. Specifically, we sought to determine whether or not image motion had a positive effect on emotional arousal. It was expected that motion would imbue the images with a dynamism that would lead to both a subjective excitement and a concomitant increase in the viewer’s autonomic arousal. The positive effect of motion on cortical arousal (Reeves et al., 1985) supports the plausibility of this expectation. We also wanted to determine if the impact of image motion was limited to the arousal properties of the stimulus as described by Detenber and Reeves (1996) or whether the impact of motion extended to stimulus valence as well. Lastly, we were interested in whether the effects of motion were evident at all levels of an emotion dimension, or whether this specific image property (i.e., motion) interacted with image content. That is, might there be a synergistic effect between form and content such that motion has a differential impact along either the valence or arousal dimension? To provide a rigorous and sensitive test of these potential effects on both self-report and physiological measures, a within-subjects design was used in the present experiment instead of the between-subjects design used in the Detenber & Reeves (1996) study (Reeves & Geiger, 1994).



This experiment used two within-subject designs simultaneously: A 2 (motion) X 3 (positive, neutral, and negative valence), and a 2 (motion) X 3 (low, medium, and high arousal). The two categorical emotion variables were created by classifying the pictures according to self-reported ratings on each dimension (see below). The basic design required that each participant view 27 pictures twice (both moving and still versions) while on-line physiological data were collected. After each of the 54 presentations, subjects rated their emotional response to the image they had just seen.


Nineteen undergraduate students at the University of Delaware participated in this study. They received partial credit toward the research participation component of their introductory psychology course, or participated as part of a summer research institute experience. Data from one subject were eliminated from the heart-rate and skin conductance analyses due to technical problems during data collection. The final sample of 18 consisted of 8 males and 10 females with a mean age of 20.6 years.


The stimuli consisted of 27 images taken from a wide variety of films and television programs and were a subset of those used previously by Detenber (1995) and Detenber & Reeves (1996). Each image was a single shot (i.e., it contained no edits) that depicted some moving object, but had little or no camera movement. Content categories appearing in the International Affective Picture System (IAPS; Lang, Ohman, & Vaitl, 1988) guided the selection of the images (see Appendix A for a descriptive list of the images). The 27 pictures used in the present study were known to elicit a wide range of ratings on both valence and arousal -- the two emotion dimensions of particular interest in the present study.

Each stimulus was presented for 6 s and was either a moving or still version of the same image. The still version of each image was a single frame that was particularly representative of the full-motion clip. All images, along with an additional image instructing subjects to perform the ratings task, were stored on a video laser disc that was connected to a Macintosh computer. Stimuli were presented to subjects in one of four orders embedded in a Hypercard program used by the Macintosh to control the sequence and the timing of stimulus presentation. Whether subjects saw the moving or still version of the pictures first was counterbalanced across orders.

Response measurement

Self-report. Subjects provided ratings of valence and arousal for each of the 54 image presentations by completing a paper and pencil version of Lang’s Self-Assessment Manikin (SAM; Lang, 1980). Using the nine-point SAM scales, valence is rated by marking on or between five graphics depicting the manikin (a gender-neutral human figure) with facial expressions ranging from a broad smile to a severe frown. Arousal is rated similarly using five graphics depicting the manikin at different levels of visceral agitation. Subjects were also asked to provide a rating of ‘interest’ for each image by making a mark along a continuous line anchored with the words ‘boring’ and ‘interesting’.

Physiological recording. Heart-rate was obtained by attaching a Grass Photoplethysmograph Model PPS to the subject’s right ear lobe. The photocell output was fed into a Grass Model 7P1 Low Level DC Preamplifier and Model 7D Driver Amplifier (Bandpass = 1.6 - 3.0 Hz) and then into a Grass Model 7P4 Cardiotachometer where the interpulse intervals were converted into heart (pulse) rate in beats per minute (BPM).

Skin conductance responses were recorded using a Coulbourn Model S21-22 constant voltage (0.5 V) skin conductance coupler. Prior to recording, the palm of the nonpreferred hand was cleansed with distilled water. Beckman Standard (0.5 cm²) Ag/AgCl electrodes were then placed on the thenar and hypothenar eminence of the palm with Johnson and Johnson KY Jelly used as electrolyte.


Subjects were provided with a brief description of the stimuli, the ratings task, and the recording techniques and then signed an informed consent form. The skin conductance electrodes were affixed to the hand and the subject was led to an adjacent room equipped with a comfortable arm chair positioned approximately 1.4 m in front of the viewing device (SONY 20" Color Monitor). The photoplethysmograph was attached to the ear and the quality of the physiological recordings was inspected. Subjects then received the complete set of instructions and two ‘neutral’ practice trials were delivered. The experiment began if the instructions were understood, if the ratings task was completed properly during the practice trials, and if the physiological recordings were free of obvious noise and artifacts.

The experiment proper consisted of 54 trials under the control of two laboratory computers: a 486 PC that initiated each trial and collected the physiological data and the Macintosh that controlled the laser disc player. Each trial began with a signal, through a simple serial connection, from the PC to the Macintosh. The signal caused the Macintosh to deliver one of the 27 images in either still or moving form for 6 s. For 1 s following the completion of the clip, the viewing screen was dark, and then the instruction to rate the image was presented on screen for 4 s. Subjects were instructed to rate the image on the three scales (valence, arousal, and interest) quickly, and to return their eyes to the viewing screen prior to the appearance of the next image. The period for ratings varied randomly from 17 to 27 s. Physiological data collection began 2 s prior to the delivery of each image and continued for 10 s. At the half-way point in the experiment, the experimenter reentered the viewing room to provide a short break for the subject and to ensure that the subject was on the appropriate page in the ratings booklet. At the conclusion of the experiment, subjects were verbally debriefed and given a brief written explanation of the experiment along with some relevant citations. The entire experiment lasted less than one hour for each subject.

Data Reduction

The skin conductance and cardiotachometer channels were digitized by computer at 50 samples per second. The skin conductance data were displayed graphically, trial by trial, and quantified by visually identifying response onset and the largest peak that occurred with an onset latency of 0.5 - 4 s following stimulus onset. Response amplitude was defined as the difference, in µSiemens, between the identified peak and onset points.
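The amplitude rule described above can be illustrated with a short sketch. In the study, onset and peak points were identified visually; here they are simply passed in as sample indices chosen by a scorer. The function name, the argument names, and the choice to score out-of-window responses as zero are all assumptions made for illustration, not the authors' procedure.

```python
SAMPLE_RATE_HZ = 50  # digitization rate reported for both channels


def scr_amplitude(trace, stim_index, onset_index, peak_index):
    """Return peak-minus-onset amplitude in the units of `trace`.

    `onset_index` and `peak_index` stand in for the points a human
    scorer would mark on the plotted trial (hypothetical inputs).
    """
    latency_s = (onset_index - stim_index) / SAMPLE_RATE_HZ
    # Only responses beginning 0.5 to 4 s after stimulus onset are scored.
    if not 0.5 <= latency_s <= 4.0:
        return 0.0  # assumption: out-of-window responses count as zero
    return trace[peak_index] - trace[onset_index]


# Hypothetical trial: 2 s of pre-stimulus recording (100 samples), a flat
# level of 5.0, then a rise to 5.5 that starts 1 s after stimulus onset.
trace = [5.0] * 151 + [5.5] * 349
amplitude = scr_amplitude(trace, stim_index=100, onset_index=150, peak_index=151)
```

With these made-up values the onset latency is 1 s, so the response falls inside the scoring window and the amplitude is the 0.5-unit rise.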

The cardiotachometer data were edited for artifacts by visually inspecting each trial. Beats containing movement or other artifact were generally replaced by the trial average. If, however, the baseline heart rate could not be determined or if consecutive beats containing artifact were detected during image presentation, the entire trial was deleted and omitted from the appropriate condition average. The heart-rate response to each stimulus was obtained from the edited cardiotachometer record by averaging successive blocks of 25 data points (i.e., half-second averages at the sampling rate of 50 per second), and deviating each half-second average during the 7 s post-onset epoch from the half-second average immediately preceding stimulus onset.
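A minimal sketch of this reduction step follows, assuming an already edited trial record sampled at 50 points per second with the image appearing 2 s into the record (sample 100). The function name and the default of 14 post-onset bins (7 s) are illustrative assumptions.

```python
from statistics import mean

BIN_SIZE = 25  # 25 samples at 50 per second = one half-second bin


def heart_rate_change(bpm, onset_index=100, n_post_bins=14):
    """Half-second heart-rate means expressed as change from baseline.

    Baseline is the half-second bin immediately preceding stimulus onset.
    """
    baseline = mean(bpm[onset_index - BIN_SIZE:onset_index])
    changes = []
    for b in range(n_post_bins):
        start = onset_index + b * BIN_SIZE
        changes.append(mean(bpm[start:start + BIN_SIZE]) - baseline)
    return changes


# Hypothetical trial: 70 BPM before onset, slowing to 68 BPM afterward.
bpm = [70.0] * 100 + [68.0] * 400
changes = heart_rate_change(bpm)
```

Negative values indicate deceleration relative to the pre-stimulus half second; in this made-up trial every post-onset bin is 2 BPM below baseline.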

Data Analysis

The initial phase of data analysis involved the generation of mean valence and arousal ratings for each of the images collapsed across the moving/still dimension. Valence means were then ranked from most positive to least positive and the 27-image set was then divided into 9 positive, 9 neutral and 9 negative images. Likewise, arousal means were ranked from the lowest to highest and the images were divided into 9 low-, medium- and high-arousal categories, again collapsed across the moving/still dimension.
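The categorization step can be sketched as a rank-and-split operation. The image labels and ratings below are invented placeholders, the function name is hypothetical, and the handling of tied means (left to sort order here) is an assumption, since the text does not say how ties were resolved.

```python
def split_into_thirds(mean_ratings, labels, descending=False):
    """Rank images by mean rating and assign equal-sized category labels."""
    ranked = sorted(mean_ratings, key=mean_ratings.get, reverse=descending)
    size = len(ranked) // len(labels)  # 27 images / 3 categories = 9
    return {
        image: labels[min(i // size, len(labels) - 1)]
        for i, image in enumerate(ranked)
    }


# Hypothetical mean ratings for 27 images (collapsed across moving/still).
ratings = {f"img{i:02d}": float(i) for i in range(1, 28)}

# Arousal: ranked from lowest to highest.
arousal_category = split_into_thirds(ratings, ("low", "medium", "high"))
# Valence: ranked from most positive to least positive.
valence_category = split_into_thirds(
    ratings, ("positive", "neutral", "negative"), descending=True
)
```

Because the categories are derived from the sample's own ratings, each category contains exactly nine images regardless of where the absolute rating values fall.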

The heart-rate and skin conductance data were likewise grouped into both valence and arousal categories, and all dependent measures were analyzed twice using a repeated-measures Analysis of Variance (ANOVA) with Image Category (Valence or Arousal) and Motion as the two within-subject variables. Orthogonal trends were used to represent the Category variable. In this analysis, the linear trend (1, 0, -1) is equivalent to the specific contrast of positive v. negative valence or low v. high arousal, whereas the quadratic trend (1, -2, 1) is equivalent to the contrast of the middle category with the two extremes. Since the heart-rate data consisted of heart-rate change across time, the heart-rate analysis examined orthogonal trends across the fifteen half-second data points in addition to the trends assessing image Category and Motion.
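To make the trend contrasts concrete, the sketch below applies the two sets of weights to per-subject category means and tests each contrast against zero. It assumes each single-degree-of-freedom trend is tested with its own error term, in which case F(1, n - 1) equals the square of a one-sample t computed on the subjects' contrast scores. The data and function names are hypothetical, and this is not a reconstruction of the authors' analysis software.

```python
from math import sqrt
from statistics import mean, stdev

LINEAR = (1, 0, -1)     # first category v. last category
QUADRATIC = (1, -2, 1)  # middle category v. the two extremes


def contrast_scores(cell_means, weights):
    """One weighted sum per subject across the three category means."""
    return [sum(w * m for w, m in zip(weights, row)) for row in cell_means]


def contrast_f(cell_means, weights):
    """F and its degrees of freedom for a single-df within-subject contrast."""
    scores = contrast_scores(cell_means, weights)
    n = len(scores)
    t = mean(scores) / (stdev(scores) / sqrt(n))
    return t * t, (1, n - 1)


# Hypothetical data: rows are subjects, columns are the three categories.
data = [[3, 2, 1], [4, 2, 1], [5, 3, 2], [4, 3, 1]]
f_lin, df_lin = contrast_f(data, LINEAR)
```

The two weight vectors are orthogonal because their element-wise products sum to zero (1·1 + 0·(-2) + (-1)·1 = 0), so the linear and quadratic trends partition the two degrees of freedom of the three-level Category variable without overlap.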


SAM Ratings

Valence and arousal ratings as a function of both valence and arousal categories and picture motion are presented in Figure 1. The two left-hand panels illustrate the impact of image motion on SAM ratings of arousal (top) and valence (bottom). The effect of motion on arousal ratings was highly significant with moving images experienced as more arousing than still images (F(1,17) = 33.98, p<.001). Though the effect was apparent at each level of arousal, motion had its greatest impact on the high arousal images (Motion X Arousal Fquad(1,17) = 5.43, p<.05). Though moving images were also rated more positively than still images (F(1,17) = 15.29, p<.01), motion actually accentuated the subjective valence difference between positive and negative images. That is, motion in negative images resulted in more negative perceptions and motion in positive images resulted in more positive perceptions (Motion X Valence Flin(1,17) = 22.59, p<.001).

Insert Figure 1 about here

The two right-hand panels of Figure 1 illustrate the relationship between valence and arousal ratings. Images in the high-arousal category received more negative valence ratings than the neutral and positive images, and as is often the case, images of neutral valence were rated as less arousing than the images perceived as either positive or negative (cf. Bradley, Greenwald, Petry, & Lang, 1992; Fitzgibbons & Simons, 1992). Both of these effects were significant (F(2,34) = 27.27, p<.001 and F(2,34) = 6.63, p<.01 respectively).

Skin Conductance

As expected, skin conductance response magnitude was a function of the arousal properties of the image stimuli. As the left-hand panel of Figure 2 illustrates, high-arousal images evoked particularly large skin conductance responses (Flin(1,16) = 31.49, p<.001; Fquad(1,16) = 19.22, p<.001). The figure also illustrates the impact of image motion. As was true of arousal self-report, motion induced larger SCRs overall (F(1,16) = 12.83, p<.01), but this effect was much larger when the images were in the high arousal category (Motion X Arousal Flin(1,16) = 8.78, p<.01). The right-hand panel of Figure 2 depicts SCR magnitude as a function of image valence. There was no significant interaction between the two variables, supporting the specificity of the relationship between skin conductance activity and picture motion and the arousal properties of the image.

Insert Figure 2 about here

Heart Rate

The heart-rate response to the image stimuli was primarily deceleratory. Heart rate began slowing shortly after stimulus onset and remained below baseline for the duration of the recording interval. Heart rate per half-second is presented in Figure 3 as a function of image valence (left-hand panel), image arousal (center panel) and image motion (right-hand panel). The significant linear trend across the presentation period (Flin(1,16) = 16.90, p<.001) accounted for 70% of the relevant variance.

Heart-rate change was significantly related to both emotion properties of the stimuli (i.e. valence and arousal). Larger decelerations occurred in response to negative than to positive images (Flin(1,16) = 8.52, p<.01) and both high- and low-arousal images prompted more deceleration than medium arousal images (Fquad(1,16) = 7.62, p<.05).

Insert Figure 3 about here

The impact of motion on the heart-rate response was complex. As the third panel in Figure 3 illustrates, the impact of motion developed across time (Motion X Half-second F(14,224) = 3.39, p<.01). It appears that the heart-rate responses to still and moving images become differentiated as the image unfolds. That is, heart rate continues to decelerate throughout the presentation period with image motion, but stabilizes in response to still images. The supporting statistical interactions (Motion with the linear and quadratic components of the Half-second variable) were, however, only marginally significant (Flin(1,16) = 3.30, p<.10; Fquad(1,16) = 4.09, p<.10). There were no interactions between motion and either valence or arousal, suggesting that the two variables (Motion and Emotion) exert independent effects on heart rate.


Overall, the findings of this study provide compelling evidence that picture motion influences emotional responses to images. The results also indicate that motion, a characteristic of the presentation, or formal variable, has a fairly specific effect on the topography of the emotional response to picture content – it increases the viewer’s arousal while having a more restricted effect on the positive/negative aspect of the emotion. This differential impact of picture motion on the arousal and valence dimensions of emotion, as well as on the different sets of measures, is discussed in detail below.


One clear inference to be drawn from the data of this study is that, as expected, moving pictures elicit greater arousal (both subjective and autonomic) than still pictures do, and this occurs across a range of content (i.e. whether the pictures are positive or negative). Although the results are contrary to the findings of Detenber & Reeves (1996), who found that subjective reports of arousal are greater during still images, we have considerable confidence in the present data set. First, the self-report and physiological data both yielded the same pattern of results. The convergence of arousal ratings with the skin conductance measure was striking and makes it unlikely that the present results are in any way spurious. Second, we believe that the within-subjects design is better suited to the research question than the between-subjects design used by Detenber and Reeves (1996), for it allows subjects to experience both levels of the treatment, and thereby use each as a reference or point of comparison for the other. The design may have offered greater ecological validity, as well, for moving and still images are something that people encounter juxtaposed in the real world (Greenwald, 1976). More important, perhaps, than the overall effect of motion on arousal, the data indicate that picture motion interacted with the emotional quality of the content. Specifically, picture motion dramatically increased arousal responses to the images that depicted the most arousing events (e.g., a rocket launch, a shooting, etc.), and had less of an impact on arousal responses to the medium and low arousal images. This interaction characterized both the more mindful self-report data and the automatic SCRs, indicating that this multiplicative effect occurred for different aspects of emotional experience.


As for effects on affective valence, the results suggest that picture motion did not radically alter the hedonic quality of emotional responses to images. Unlike the arousal dimension, the two measures for valence did not yield convergent results. The affective judgments, or SAM ratings, did indicate there was a significant overall increase in the valence of responses to the moving images. However, the interaction of motion with the valence quality of the images (i.e., moving versions of the negative images were rated as more negative than the still versions while positive images were rated as more positive) suggests that motion amplified the subjective evaluations of hedonic quality and did this, perhaps, through the modulation of arousal. That is, motion tended to make one’s responses to negative images (e.g., a crying face, a dead body, etc.) more negative, and responses to positive images (e.g., nature scenes, a smiling baby) more positive since these particular images tend to be those that are also the most arousing. This type of interaction did not appear in the heart-rate data, however.

The results of the heart-rate analysis were complex. As was true in numerous previous studies, heart rate was sensitive to image valence, with negatively-valenced stimuli prompting more sustained deceleration than stimuli with positive valence. This heart-rate/valence relationship was not affected by motion. That is, though the verbal report of the subjects’ hedonic experience (SAM valence) was exaggerated when the stimuli contained motion, this interactive effect of motion was not evident in the valence-sensitive heart-rate response. Instead, motion’s impact on the heart-rate response was general. Images containing motion were more deceleratory than still images regardless of either the arousal or valence aspects of image content, and this effect appeared to grow stronger during the course of image presentation. This lack of association between motion and either emotion dimension in the heart-rate data would suggest that motion’s effect on heart rate may not be emotional in nature at all. In fact, the deceleratory heart-rate pattern observed in the present experiment resembles the heart-rate changes associated with attentional shifts (Graham & Clifton, 1966; Lacey & Lacey, 1970). The differentiation of the heart-rate response to still and moving images toward the end of the viewing period most likely, then, reflects the ability of motion to capture and then sustain the viewer’s attention. Thus, while the initial period of deceleration would suggest that both still and moving images are effective in capturing the subject’s attention, motion in the image facilitates the maintenance of this attentional focus.

The Significance of Picture Motion

Based on the results of this study, several comments about the psychological significance of picture motion can be made. First, it seems that motion increases the arousal level subjects experience in response to images presented on a television screen, but does not markedly alter the hedonic quality (valence) of the emotional response. Second, motion tends to heighten arousal for two distinct facets of emotional experience, the physiological and the subjective. Third, the data suggest that picture motion may capture attention, as well as influence certain aspects of emotional responses.

The findings of the study have practical implications as well. For multimedia producers concerned with the cost of using video clips or creating animations in their products, the results suggest that the endeavor may be well worth the effort, especially if the goal is to increase excitement. Certainly, for many video and computer game designers this study confirms their apparent intuition and current practices. Producers and directors of film and video also may find in the main effect of the study (motion increases arousal) substantiating empirical evidence for their visual sensibilities. However, the interaction effect (greater increases in arousal for more arousing images) indicates that motion has a complex relationship with arousal, and how much it influences viewers’ excitement depends on what is being shown. This kind of relationship offers visual media producers myriad possibilities for achieving a desired emotional impact by combining motion, and perhaps even different types of motion, with various images. It also suggests that still images may be more appropriate in certain contexts for extremely arousing material. For example, television news stories containing arousing negative images (e.g., plane crashes, urban riots, etc.) have been shown to reduce memory for material in the newscast through retroactive inhibition (Lang, Newhagen, & Reeves, 1996). Since autonomic arousal affects cognitive processing (Lang et al., 1995), individual-level media effects (Mattes & Cantor, 1982), and media-use patterns (Zillmann, 1991), it seems likely that the use of motion in visual media will become more purposive and sophisticated as its ability to elicit arousal becomes better understood.

In addition to the observations stated above, several suggestions can be made for future research on the effects of picture motion. Since it is somewhat unclear what, if any, impact motion has on the valence dimension of emotional responses, particularly in the physiological realm, measures more sensitive to image valence than heart rate, such as facial electromyographic (EMG) activity, might be used to test for potential effects. Other measures, like the electroencephalograph (EEG) and tests of memory, would permit the exploration of how motion affects cortical arousal and other cognitive processes. The interesting relationship between motion and heart-rate deceleration needs to be explored further in an expanded version of the present study. The implication of this relationship, that motion both ‘captures’ and then maintains attention, might also be pursued by employing a dual-task procedure. Lastly, the impact of various types of motion (e.g., movement toward the viewer versus away, or different camera moves like pans and dollies) could also be investigated by using a different set of stimulus material. These modifications to the experimental paradigm, along with replication data, would contribute to a better understanding of the significance of picture motion.


Appendix A -- Description of Images with Mean Pretest Ratings


ID #   Description   Arousal Rating   Valence Rating

Handwriting pen
Throat examination
Glider soaring in blue sky
Woman in hospital bed w/ eye patch
Two ballet dancers
Soviet flag
Swimming race
Rocket launch at night
Bird #2 (Heron in Water)
Two seals on rock
Icicles (dripping)
Soldiers removing dead body
Gun pointed through veil
Cemetery w/ horse procession
Silver Corvette driving in field
Pretty woman modeling
EKG monitor
Bullets in holster
Waves breaking
Couple in bed (view of his back)
Surfer on a wave
Nude woman moving suggestively
House in the country, with snow
Woman laughing (eating, too)
Gangster shooting Bound Man
Old Woman Drinking Wine
Man in Jock Strap Rubbing Crotch
Baby playing Patty-Cake
Person snorting cocaine
Girl’s crying face