I’m siiiiinging in the rain, I’m siinging in the Rain….
I’m part of a Film studies group, as a quantitative psychologist with more interests than brain-capacity. Most of the others are Film scholars with a humanistic/sociological bent, but with an interest in what people like me can add. And I’m interested in what they can add to what I do. It is immensely fun when we get together, which, for this post, was April 25 at Copenhagen University, on a very lovely day which ended with wine and tapas outside on the 3rd-floor terrace.
Compelling art (deliberately vague) is psychologically interesting, because it wouldn’t be compelling unless it can harness something in the human psyche (to use Changizi’s terminology). And, those creating compelling art – be it movies or music or books or paintings – are able to harness, well, something, and that something is immensely interesting for a psychologist to try to understand. This is, of course, not my unique insight. Shimamura brings it up in his introduction, and brings up the fact that, although a whole lot of heuristics have been worked out for how to make a film (continuity editing, 180 degree rule, 30 degree rule, their violations for effect), the explanations that the film professionals give are not necessarily well grounded in, well, psychology. So, this is a perfect place for a psychologist to attempt some reverse engineering.
I was asked to present something on experimental method. I actually decided to do a little – well – tapas-like sampling of quantitative methods, as the experimental variant is just one of the quantitative variants.
To prepare, I plowed through Arthur Shimamura’s edited volume “Psychocinematics”, and pulled a bunch of papers, before settling on presenting the methods in two papers: Rooney & Hennessy’s field study comparing 2D and 3D movies, and Magliano, Dijkstra & Zwaan’s work on predictive inference. Neither of those is a true experiment, but we had already decided on doing a close reading of a paper by Marilyn Boltz, which is a true experiment.
The work in the first paper compared experiences of 2D and 3D movies, and used a very simple setup: ask people as they emerge from the 2D or 3D version of a movie to complete some questionnaires. The selected movie was Thor.
Participants completed 4 questionnaires. For perceived apparent reality they used a validated questionnaire, the ITC-Sense of Presence Inventory (Lessiter, Freeman, Keogh & Davidoff, 2001), with five items on a 5-point Likert scale. A typical question (that they cite) is “I had a strong sense that the characters and objects were solid”.
For attention they actually did not measure attention, but self-reported distraction: how often participants were distracted by other people, their own thoughts, etc. Which we discussed a bit. Is this really attention? Also, here they used a 6-point Likert scale.
For Emotional Arousal they used Lang’s SAM, where you indicate how aroused you are, and how positive or negative you feel. Nine points. Well used scale – even I have used it.
Finally, again on a 6 point scale, they measured satisfaction basically by asking to rate “liked very much, would watch again, and bring my friends with”. Well, not like that – broken up in separate questions so as not to have them assess multiple questions on one scale.
Now, why they used so many different scales came up for discussion. In part, of course, because two of the scales were validated scales, and you simply do not just change the scale on a validated instrument.
Why on earth their self-created scales used 6, I don’t know, and it is not discussed, but it allowed me to expound a bit on ordinal scales and all that.
The only effects they found were on perceived realism and distraction (which they call attention). They reported Cohen’s d, so I got to explain what effect sizes are. They also had a truly bizarre df for one of the measures (it has fractions), but I’m guessing it is one of those adjustments for data that do not fit the assumptions of the test. (Which one this could be I don’t know, because the standard deviations are close enough, but the ratings are close to ceiling, so I’m guessing some severe skew.)
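For readers who have not met effect sizes or fractional df before, here is a minimal sketch of both. It assumes the pooled-standard-deviation version of Cohen’s d, and it uses the Welch-Satterthwaite correction purely as one well-known example of an adjustment that produces non-integer df; I am not claiming that this is the adjustment used in the paper. The ratings are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Standardized mean difference, using the pooled standard deviation."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def welch_df(a, b):
    """Welch-Satterthwaite degrees of freedom; usually not a whole number."""
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    return (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))


# Made-up ratings for illustration only (NOT data from the paper).
ratings_3d = [4, 5, 5, 4, 5, 3, 5, 4]
ratings_2d = [3, 4, 4, 2, 5, 3, 4, 3]

print("Cohen's d:", round(cohens_d(ratings_3d, ratings_2d), 2))
print("Welch df:", round(welch_df(ratings_3d, ratings_2d), 2))
```

The point of d is that it expresses the group difference in standard-deviation units, so it can be compared across measures that use different rating scales.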
And, of course, we got to discuss whether people deliberately selecting a 2D version are really the same as those deliberately selecting the same movie in 3D. It is experiment-LIKE, but not a true experiment.
I then went on to the Magliano et al. Moonraker paper, which is correlational. The question is: Are there some distinct features that directors use in a film – mise-en-scène, montage, framing, music, dialog – that make viewers predict correctly what will happen next? One of the film scholars said, of course there are, and told us how she has used them in her teaching. This is her expertise, though. The paper looks at how everyday viewers (students) pick up on the predictions too (perhaps without knowing the source of that prediction). Very simple (but work-intensive) methodology: Participants are instructed to stop the movie when it occurs to them that they can predict what will happen next, write down what their prediction is, and note where they stopped the film. Then came the extensive coding of all the answers. Only predictions made by at least two people were included in the analysis. For all those times, the researchers looked at what was happening in the film, and whether any or all of the five predictor features were present. They weren’t always. Some predictions were likely from general genre knowledge (Bond will sleep with the good-looking woman). But, more often than not, there was something in the set-up of the movie that helped prediction, and the more of the predictors that were present, the more of the viewers made correct predictions.
We finally moved on to the Marilyn Boltz paper, which truly is an experiment, with an impressive amount of selecting and pre-testing before running the actual experiment, but, as always with any experiment, there are places one can go in for critique. I think the interesting thing is that, depending on what our background was, what we critiqued was different.
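To make “correlational” concrete, here is a small sketch of the kind of relationship described: the number of cues present at a given moment in the film against the share of viewers who predicted correctly there. The numbers are invented, and a plain Pearson correlation is my own stand-in for illustration; the paper’s actual analysis may well be different.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical values, one entry per coded film moment (NOT data from the paper).
cues_present = [0, 1, 1, 2, 3, 3, 4, 5]  # how many of the five features were present
share_predicting = [0.10, 0.15, 0.20, 0.30, 0.35, 0.50, 0.55, 0.70]

r = pearson_r(cues_present, share_predicting)
print("r =", round(r, 2))
```

Nothing is manipulated here, which is why a strong r would still not tell us that the cues cause the predictions.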
Music fills an important function in movies. (I was taught early on that when I got too scared, I could turn off the music and it wouldn’t be scary. I didn’t heed that kind of emotion regulation, though, but preferred to use the “avoidance” method.) It is often thought of as adding emotional tone to the scene, but the question here was also whether it guided the viewer’s attention, person-perception and memory. She uses schema theory as her theoretical underpinning: once some expectations are set up, your person-perception, your attention, and your subsequent memory will be affected.
To test this, she first spent time selecting cuts from commercial movies that were ambiguous in nature. These came from “Cat People”, “Vertigo” and a series called “The Hitchhiker”.
She then set out to select positive and negative music that she could pair with the movies (one positive, one negative for each clip). There were three different pieces of each affective type, whittled down from a larger set of nominations, and these were subsequently matched with a movie based on perceived “appropriateness” – more on this, because this was a place where we had a lot of discussion, and where we differed in the type of critique we were giving.
Each participant viewed all three clips, but with different music pairings: one with positive, one with negative and one with no music. It was all sorted and counterbalanced in all the appropriate ways to correct for order effects and other confounds. The participants were also divided into two testing groups: the first was asked to give an extrapolation of the ending of each clip, and an interpretation of the intentions, as well as to rate the intentions, emotions and moods of characters and scenes right after the clips were seen. The questions were specifically tailored for each scene. The other half did none of that, but returned a week later to participate in a memory test where they had to indicate whether described items (e.g. flower bouquets, tombstones) had been present or not in the film – that is, an “old/new” recognition test. (We had some discussion about why not all of them did the memory test, or why there wasn’t an immediate memory test. I think the answer here is that in the design they wanted half of them to answer interpretive questions, and the other half to be tested for memory. I see no particular reason for waiting a week with the memory test. It could likely have been conducted right after, but I see no argument against waiting a week either, other than the risk of participant attrition.)
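For those unfamiliar with counterbalancing, here is a rough sketch of one common way to do it, a Latin-square rotation of music conditions over clips. The clip labels, the function name and the rotation scheme are my own assumptions for illustration; the post does not describe the exact scheme used in the paper, and presentation order would need its own rotation on top of this.

```python
CLIPS = ["clip A", "clip B", "clip C"]  # placeholder labels
MUSIC = ["positive", "negative", "none"]


def latin_square_pairings(clips, conditions):
    """Return one clip-to-condition pairing per rotation.

    Within a pairing every condition is used once; across all pairings
    every clip meets every condition exactly once.
    """
    k = len(conditions)
    return [
        {clip: conditions[(i + shift) % k] for i, clip in enumerate(clips)}
        for shift in range(k)
    ]


schemes = latin_square_pairings(CLIPS, MUSIC)

# Assign participants to the pairings in turn, so the pairings are used equally often.
for participant in range(6):
    print(participant, schemes[participant % len(schemes)])
```

With this kind of rotation, a difference between music conditions cannot be blamed on one particular clip always carrying the same kind of music.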
The main finding is that the music induces participants to interpret the story differently (and, usually, differently from when no music is present), and that people perform differently on the memory task – remembering more positive items in positive conditions, and more negative items in negative conditions, and similarly for the false alarms.
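Since hits and false alarms may be unfamiliar, here is a minimal sketch of how an old/new recognition test is usually scored. The responses are invented, and d' is included only as the standard signal-detection summary of such data; I am not suggesting that the paper reported it.

```python
from statistics import NormalDist


def recognition_rates(responses):
    """responses: list of (was_in_film, said_old) pairs of booleans.

    Returns (hit_rate, false_alarm_rate).
    """
    old_items = [said_old for was_in_film, said_old in responses if was_in_film]
    new_items = [said_old for was_in_film, said_old in responses if not was_in_film]
    return sum(old_items) / len(old_items), sum(new_items) / len(new_items)


def d_prime(hit_rate, fa_rate):
    """Sensitivity index: z(hit rate) - z(false-alarm rate).

    Rates of exactly 0 or 1 must be adjusted first, because inv_cdf
    is only defined strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


# Invented responses (NOT data from the paper): 10 items that were in the
# clip and 10 that were not.
responses = (
    [(True, True)] * 8 + [(True, False)] * 2      # 8 hits, 2 misses
    + [(False, True)] * 2 + [(False, False)] * 8  # 2 false alarms, 8 correct rejections
)

hit_rate, fa_rate = recognition_rates(responses)
print(hit_rate, fa_rate, round(d_prime(hit_rate, fa_rate), 2))
```

A false alarm is saying “old” to something that was never in the clip, which is why a rise in mood-congruent false alarms is interesting: it suggests the music shapes what people think they saw.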
But let’s go through some of the issues – most of them mine (because, well, my blog), but also some very interesting commentary from the participants.
First, I had an issue with the music that they used as either negative or positive. Here are the selections, with the descriptive adjectives (from the appendix of the paper). All the negative excerpts came from Tangerine Dream’s Rubycon album. This is electronic music – synth music. The excerpts were described as “Eerie, Mysterious, Unsettling”, “Eerie, Unsettling, Edgy, Suspenseful” and “Anxious, Mysterious, Evil”. I decided to listen to it – how I love Spotify – and it reminded me very much of “Blade Runner”. One of the film scholars pointed out that that soundtrack was by Vangelis, but that, yes, this was a “genre” sound – something a regular movie-goer would recognize as the music accompanying scary and eerie movies. I actually didn’t find it terribly eerie or scary when I listened to it. I just noted the style.
The positive music was more varied, and I also found it a bit odd. The article states clearly that positive music tends to be in a major key and fast paced – which is exactly the criterion I would use for inducing some kind of happy/contented emotion (and I believe that research on music and psychology would suggest something similar). The first piece is called Blossom Meadow, and is characterized as a new age piano piece. It is described as “Calm, Light, Airy, Cheerful, Pleasant”. Never heard it, but I get a kind of music image from it – perhaps even that is very genre-indicative. Another positive piece is Schutzliesel, a German drinking song, with woodwinds, brass and percussion, described as Fun, Boisterous and Folksy. Hmmmm. I might use that to make people feel happy too, even though I have never heard it. The last one, though, is Barber’s Adagio for Strings. You know, Platoon (or, like I said, from that Vietnam movie where all the cute guys get shot). Described as Sad, Tender, Yearning, Wistful, Solemn. Yes. Exactly. Sad induction.
What is this? Clearly, their terminology is at variance with mine as an emotion researcher (and I was not the only one that objected. Some of them got into meta-emotions also – you know, I cried my eyes out/almost shit my pants from fear – it was SOOOOOOO Goood – but psychologists have barely gotten into that area yet, from what I know).
Yet, if you look at the results (which are labeled positive/negative), all the ratings for the clips paired with “positive” music go in the same direction. In all cases, participants think that there will be no harm, the interpretation is positive, and mostly positive adjectives are used to describe the man (kind, loving, playful, etc.) – but, god knows, this part is messy from a measurement point of view.
The semantic differential for each movie showed similar results – but the questions for each movie were custom made, and thus not quite comparable (although they range from what one would consider positive to negative). Granted, they do not actually compare them statistically, but there is a comparison going on.
I was thinking that rather than positive negative, there is an indication of something being benign or malign – no harm or harm – safe or unsafe. Or, like one of the film scholars pointed out, genre recognition. (Here I go, the psychologist, thinking about basic human concerns, and the film scholar about what western humans have picked up from watching Hollywood style movies since before childhood amnesia sets in).
We also pursued an interesting discussion on the word “appropriate”. In the research some judges had been given a set of music pieces and asked to judge which piece of music was most “appropriate” for each film clip, which is how they decided on the different pairings. I kind of vaguely have a notion what that means (let’s do a pilot to see what seems to fit best by vote), and my main objection was that I thought they should have used the same piece for all clips, in order to make a good comparison – experimental psychologist as I am.
But the objection from the non-psychologists was: what the hell is “appropriate”, without any more specification? As an example, is Alex, in “A Clockwork Orange”, singing “Singin’ in the Rain” while behaving atrociously, appropriate? Certainly effective, achieving what the director intended. And, yes, if you really want to look at it, there is no such thing as the vague appropriate. There is an active fitting of music for effect – and here I really get in over my head, considering that so many of the scholars there are specialists in film music.
Still, with the flaws, I thought the paper interesting. It opened up a lot of questions, and in some ways that is what papers like this are for – making things very explicit.
Boltz, Marilyn G. (2001). Musical soundtracks as a schematic influence on the cognitive processing of filmed events. Music Perception: An Interdisciplinary Journal, 18(4), 427-454.
Magliano, Joseph P., Dijkstra, Katinka, & Zwaan, Rolf A. (1996). Generating predictive inferences while viewing a movie. Discourse Processes, 22(3), 199-224.
Rooney, Brendan & Hennessy, Eilis (2013). Actually in the cinema: A field study comparing real 3D and 2D movie patrons’ attention, emotion and film satisfaction. Media Psychology, 16, 441-460.