A longish tl;dr conclusion of Srull & Wyer trace

Social Priming is in the news again. Well, Pashler et al published a critique of a recent money-priming paper, and Neuroskeptic wrote it up. So, I figured I would advertise for my Srull & Wyer trace from last spring, where I follow citations forward from Srull & Wyer (1979) – the Ur-Donald paper (and, possibly along with Higgins, Rholes & Jones 1977, one of the papers that started the social priming area). I’m still working on a manuscript for this. Not so easy when you use non-traditional (for psychology) methods. (Plus teach too much.)


I got a tweet message from Michael Inzlicht asking for the TL;dr (or, take-home message as he said). It isn’t that easy. No tweet-size take-home really. But, I thought I should try to summarize a little bit about what I have learned.

First, I think priming happens – when the priming is strong, relentless and conscious. Srull & Wyer had participants unscramble an awful lot of hostile sentences, and the more they unscrambled, the more they judged Donald to be hostile. The effect was similar for kind, but somewhat weaker. There is nothing subtle about this. What people did not guess was that the rating of Donald had anything to do with the sentences.

I’m much more doubtful about the subtle primes. The few instances, the oblique influence chains, the outside awareness primes. The differences in means there are smaller. In fact, sometimes they seem more like published null results than anything else. But, with only 11 studies in total, and just a handful doing subtle primes, there isn’t much I can say.

But, putting it differently – I think Srull & Wyer would replicate. I’m guessing those that show films or pictures might (but am less certain), and I have big doubts that the subtle/outside-consciousness ones would do so.

A few other interesting things:

The early papers inspired by Srull & Wyer don’t really work on extensions; rather, they are in essence saying “oooooh, look at this cool paper. Wonder if we can adapt it for our own purposes”.

The number of participants in each cell is very small in general, so most of the experiments are underpowered.

Standard deviations are only reported in one of the 11 extension papers. It is also the only paper that reports effect sizes. It is also never cited… But, there are also a few papers that publish the F-table, which actually is nice.

When I try the R-index and p-curving, it appears that there is over-reporting of significant results, but there may be some evidential value in the lot (but it is so heterogeneous, it is hard to tell).

It is also surprising how quickly the subtle primes dominate.

Some of the papers make me sad. So much careful work, such low power.

Others irritate me with their handwavey confidence, noise-inducing methods, and certainty in pronouncing results that, for all I can see, come from analyzing noise.

So, shorter: I believe what happened right before will influence how you see the next thing, even if you don’t think they go together (a kind of path dependence), especially if the next thing is kind of ambiguous. Especially if the former thing is kind of strong and conscious. But, I severely doubt the more subtle kinds as a general effect.


A past crush passes.

When I was eleven, my parents were involved with a school theatre project for my province: Dala-teatern. This meant that we were visited by the actors and the director working on setting this up, having discussions and whatnot.

I was particularly taken by the young director. He was skinny, brown-eyed with long, swoopy dark hair, and looked all dashing, smoking his pipe.

He occupied my fantasies, and I kind of knew that I was still not quite old enough to understand all the parts of this, but if he could just wait until I was a bit older.

He was 22, and it was obvious to everybody that I had a crush. I’d fetch his ashtray, I’d sit in the room listening to him and my parents talking. My mom even commented that he obviously had a little slave in me.

Nothing really came of the project. I guess a few attempts, and then the big ideas petered out and the actors and the dashing director stopped visiting.

I knew before I moved to the US that he had kept working in theatre and published some book, because when you know someone, you just notice the name.

Even the first time I returned to Sweden, my sister told me he was directing and had continued writing books, and it was interesting that my old crush was kind of a public person.

Actually, at some later visit, I got some of his books – mostly murder mysteries. Ripping yarns, but very gruesome murders. Some were set up North, others way down South.

As time passed, I realized that his books were now translated into many many languages – an American friend of mine recommended one. Once back, I also realized that the murder mysteries set down South had also been turned into several TV series on Swedish television.

Then the BBC did one with Kenneth Branagh starring as Kurt Wallander.

And, today, a push-notice on my phone told me he had died. From cancer. 67 years old. RIP Henning Mankell.

(I thought of bringing this little story up in my class on basking in others’ glory, but I was way too embarrassed. It was my 11-year-old crush. I had a few others, and none of them became internationally famous. I’d be sad finding out they died, too, but I’m unlikely to find out via push-notices.)

On Edit:

I found a picture. From a few years later. Click the Dalateatern – second from the left.


R-index and p-curve of the Srull & Wyer trace.

I have continued playing with the data on the Srull & Wyer (1979) trace. First, the R-index.

Median FHT
          Success Rate   Obs. Power   Inflation Rate   R-Index   r(N,d)
All       0,808          0,607        0,201            0,405     -0,510
Sub-set   0,883          0,710        0,173            0,536     0,074
P-App     0,78           0,66         0,12             0,54

For the first two I used Ulrich Schimmack’s Excel sheet*. The first one uses all the studies (including S & W 1979) that pursue priming. (I leave out Lord, Foti & De Vader, because priming didn’t work, and results were left out.)

The second is on the subset of studies that I think are directly extending Srull & Wyer.

The last one is the R-index result I got from the subset I used when creating the P-curve on the nice Shiny Apps p-checker.
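
For anyone who wants to check the arithmetic in the table: as far as I understand it, the inflation rate is the success rate minus the median observed power, and the R-index is the median observed power minus that inflation. Here is a minimal Python sketch under that assumption. The function name `r_index` is made up for illustration, and the decimal commas from the table are written as points.

```python
def r_index(success_rate, median_observed_power):
    """Return (inflation, R-index) from a success rate and median observed power."""
    # Inflation: how far the share of significant results exceeds
    # what the median observed power would lead us to expect.
    inflation = success_rate - median_observed_power
    # R-index: observed power penalized by that inflation.
    return inflation, median_observed_power - inflation


# (success rate, median observed power) for the three rows of the table.
rows = {
    "All": (0.808, 0.607),
    "Sub-set": (0.883, 0.710),
    "P-App": (0.78, 0.66),
}

for label, (success, power) in rows.items():
    inflation, r = r_index(success, power)
    print(f"{label:8s} inflation = {inflation:.3f}  R-index = {r:.3f}")
```

The recomputed values agree with the table to within about 0.001, which is what you would expect from rounding in the inputs.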

Which brings me to the P-curve.


Which I got off the P-curve app (before it decided to not work today). I tried to scrape off the results that test whether priming works (comparing different degrees of priming, or priming with control) – which are not entirely easy to find. Hopefully I haven’t messed up too much.

*(He kindly provided me with some answers to things I didn’t quite understand).


Srull & Wyer trace – some global stuff.

Back in February, I wrote that I was curious about tracing Srull & Wyer (1979) forward. So, actually, I decided to do it. You may have noticed, as I have summarized some of the interesting papers in this trace. (They come right before this post, so I don’t bother to link).

The idea is in part from David Hull (science as a process, you need to look at how the idea develops), and in part because of the Social Science Citation Index. I figure we have all used it to see if something interesting came of that paper that we could incorporate in our own review. Or possibly, if something interesting came of one’s own paper.

I must say, the site has a lot of nice tools for summarizing cites.

So, here are some taken directly from Web of Science (or is it Thomson Reuters now?).

Citation graph

Impressive amounts of cites for this paper, so I think I’m right in considering it a classic.

And, look who the top citing authors are!

Did you startle at number two?


But, back to what I decided to do.

Obviously, I’m not going to painstakingly collect and categorize 800 papers all by myself. (Or, maybe not so obvious, come to think about it). But, I wanted to be quite formal about what I set out to do.

I wanted to see how this idea has evolved over time, and tracing citations may be a way to do this. I began to think about it as a Golgi Stain, but in the virtual felt of science rather than cortex.

I decided to pull all the papers in the first 5 years. (Turns out that was 53.) I could find all but one of them (a Swiss paper, barely cited).

Through the loveliness of PDF, I screenshot the Abstract, and found and screenshot every time that Srull & Wyer 1979 was cited in the paper.

In two it was only cited in the reference list…

Median number of times cited is 1.

Then I did some rough classifications based on the Abstract. I noted whether it reported an experiment, if it was a review or some other type of theoretical paper (there were a few proposed models and measures). I also noted down if I thought it actually extended Srull & Wyer, whether I thought the topic was related but not directly extending, and whether I thought it was more oblique.

Here’s a timeline.


There were 11 papers that looked like they pursued an extension, out of the 53 I pulled, so I began looking at them more closely. I blogged my initial scrutiny of just about all of them (except Srull & Wyer 1980, for some reason).

Extending papers 1

Extending papers 2

(I think you can click on them to make them readable).

These are some simple classifications of type of priming, priming measures and construct primed.

Priming task and measure Primed constructs

It gives a somewhat, well, motley impression. The extensions go in all sorts of directions, and are not quite extensions, but more inspired by the work. Patterns are emerging, but perhaps it would be worth going 50 more papers (or 5 more years) into the future for a more coherent trace.

I’m now working on pulling together something more coherent with these papers. (Turns out one of them found no priming effect, and was oblique enough that I won’t include it).

I’m working on learning the R-index, which I think is reasonable even across such a motley bunch of papers (I do have a very rigid rule for what is pulled in), and possibly trying the p-curve and v-index.

N-s are on the small side. There are lots of places where error can creep in. Some papers are more trust-inspiring than others.

I’ll be updating, as I proceed. I’m planning to make a paper out of it, because it is interesting, but the method for selecting papers is a little unusual, at least for something formal.


Wyer, Bodenhausen and Gorman. Final paper in the trace of Srull & Wyer 1979

The final paper in this early sampling that directly looks at social priming is one on rape judgment by Wyer, Bodenhausen and Gorman: “Cognitive Mediators of Reactions to Rape”.

This is another complex design with not many participants. As I mentioned on twitter, lots of sparsely populated cells.

They do mention that they consider the work exploratory rather than confirmatory. I kinda like that.

The idea here, like much of this literature is that when you are faced with making a judgment about a situation or a person or an item that is somewhat uncertain (in this case, descriptions of rape-cases), you won’t take the time searching through all of your memory to find some matching prototype, but most likely stop with some information that you have already in mind. Such as something that was presented in that earlier experiment that has absolutely nothing to do with this experiment….

As with the others, there is a great deal of reasoning about how different kinds of primes may push around judgments, and I won’t really go into these here, because I really don’t think a cell size of 5 can properly answer those, but instead be more focused on what they did.

They recruited an equal number of men and women – students. 35 of each to be exact.

The cover story for the priming was that they wanted to investigate reactions to pictures that are shown in public media, that some people may think are morally objectionable. The priming materials consisted of 10 slides of pictures, where the “to be primed” concept was placed in the 3rd, 8th and 10th positions.

They came up with 7 different priming conditions! (Six primes are listed here; the seventh was presumably the control condition.)

  1. Negative outcomes of aggression (basically, dead people)
  2. Aggressive acts that are considered socially OK (e.g. police subduing a criminal)
  3. Non-sexual intimacy between man and woman (e.g. holding hands)
  4. Female sex object (photos and cartoons)
  5. Sexual arousal (for men, or whoever digs women)
  6. Sexual arousal again – even more explicit

In the actual task, they first viewed all 10 slides without doing anything. In the second viewing, they reported their reactions to the slides using a checklist. They are not entirely clear about what the checklist consists of, but the items correspond to the following 5 factors:

  1. People are cruel and inhumane
  2. Aggression is socially sanctioned
  3. Intimate relations are desirable
  4. Women are sex objects
  5. Sexual arousal.

Come to think about it, could you really consider showing pictures of dead peeps, officers subduing perps, and women masturbating as priming? I can kind of see them thinking of it like that in 1985 (or prior, as the research must have been done before), but these are the types of stimuli that are used to induce momentary affect – such as Lang’s IAPS.

After this, they went on to the second experiment, which was more of a forensic experiment. Each participant was asked to judge 5 different descriptions of rape-cases on 10 different factors.

Each case described a rape. In each case, there was one section that described the perpetrator as either a stranger or an acquaintance. Another section stated that the woman had either resisted or not resisted (“fearing that she might provoke more serious injury to herself”).

A final fifth version omitted both types of statements. It wasn’t analyzed but was simply there so they could present the order of the cases in a Latin-square manner across the 5 individuals per cell.

The participants rated each case according to the following (using a 0-10 scale):

  a) extent to which woman provoked rape
  b) likelihood she could have avoided
  c) likelihood she responded correctly
  d) extent her life in danger
  e) how emotionally upset she was
  f) how harmful effect rape had
  g) belief defendant should be convicted
  h) likelihood he will be convicted
  i) likelihood story is true

The ratings were aggregated into 4 composites
1) Perception of Crime (d, e, f)
2) Perception of victim – true (i)
3) Perception of victim – responsibility (a, b, c*)
4) Conviction judgment (g, h)

* means reverse scored.
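
The scoring scheme above can be sketched in a few lines of Python. The paper, as summarized here, only says the ratings were "aggregated", so two things in this sketch are assumptions: each composite is taken as the mean of its items, and reverse scoring is done as 10 minus the rating. The names are made up for illustration.

```python
from statistics import mean

SCALE_MAX = 10  # ratings were made on a 0-10 scale

# Which rated items feed each composite; "c" is the reverse-scored item.
COMPOSITES = {
    "perception_of_crime": ["d", "e", "f"],
    "victim_truth": ["i"],
    "victim_responsibility": ["a", "b", "c"],
    "conviction_judgment": ["g", "h"],
}
REVERSED = {"c"}


def score_case(ratings):
    """Turn one participant's a-i ratings for one case into the 4 composites."""
    def value(item):
        raw = ratings[item]
        # Assumption: reverse scoring flips the 0-10 scale (10 - rating).
        return SCALE_MAX - raw if item in REVERSED else raw

    # Assumption: each composite is the mean of its items.
    return {name: mean(value(item) for item in items)
            for name, items in COMPOSITES.items()}


# Hypothetical ratings for a single case, purely for illustration.
example = {"a": 2, "b": 3, "c": 8, "d": 9, "e": 9, "f": 8, "g": 10, "h": 4, "i": 9}
print(score_case(example))
```

Note that with this scheme a high rating on c ("she responded correctly") pulls the responsibility composite down, which is the point of reverse scoring it.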

In this judgment task, we are up in what would now be considered appropriate levels of observations. All 70 participants rated all of the cases, which were all properly randomized/latin-squared. They do report, very briefly, on what they consider the effect of the situational variations, but the inferential statistics consist simply of “All results to be noted were significant at F(1, 56) > 4,40, p < .05.” They also note the means for only one of these results.
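
To see what such a threshold means in terms of p, a reported F can be converted by hand. This is a minimal sketch in Python, standard library only. For one numerator degree of freedom it uses the fact that F(1, df) is a squared t and integrates the t density numerically. For two numerator degrees of freedom it uses the closed form (1 + 2F/df)^(-df/2). The function name is made up for illustration, and nothing here comes from the paper itself.

```python
import math


def f_test_p_value(f_value, df1, df2):
    """Upper-tail p-value of an F statistic, for df1 = 1 or df1 = 2 only."""
    if df1 == 2:
        # Closed form for two numerator degrees of freedom.
        return (1.0 + 2.0 * f_value / df2) ** (-df2 / 2.0)
    if df1 == 1:
        # F(1, df2) is the square of a t(df2) variable, so integrate the
        # t density from 0 to sqrt(F) with Simpson's rule.
        t_obs = math.sqrt(f_value)
        log_const = (math.lgamma((df2 + 1) / 2.0) - math.lgamma(df2 / 2.0)
                     - 0.5 * math.log(df2 * math.pi))

        def density(t):
            return math.exp(log_const - (df2 + 1) / 2.0 * math.log1p(t * t / df2))

        steps = 2000  # even number of Simpson intervals
        h = t_obs / steps
        total = density(0.0) + density(t_obs)
        for i in range(1, steps):
            total += (4 if i % 2 else 2) * density(i * h)
        # Two-tailed area outside [-t_obs, t_obs].
        return 1.0 - 2.0 * (h / 3.0) * total
    raise ValueError("this sketch only handles df1 = 1 or df1 = 2")


print(round(f_test_p_value(4.40, 1, 56), 3))
```

With F = 4,40 and 56 error degrees of freedom the p-value comes out at about .04, so the stated p < .05 checks out.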

I won’t go into detail here about what they found. It could be interesting, but it is aggregated across 7 different types of primes, so that should add some systematic noisiness (and it also isn’t my main concern).

Then they go on to analyze the effects of priming on the 4 composite judgments. They divide this up in three sections: The first looks at the two types of aggressive priming – comparing to control. The second looks at priming of relationship, and the third on priming women as sex-objects.

Remember: there are 5 individuals in each cell, because they are looking at men and women separately (and not reporting any standard deviations).
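
The cell sizes follow directly from the design. Here is a small sketch of the arithmetic, assuming the 70 participants were spread evenly over the cells (the helper name is hypothetical):

```python
import math


def cell_size(n_participants, *factor_levels):
    """Participants per cell when spread evenly over fully crossed between-subjects factors."""
    return n_participants // math.prod(factor_levels)


# 70 participants, 7 priming conditions, 2 sexes.
print(cell_size(70, 7, 2))  # 5
# Collapsing across sex leaves only the 7 priming conditions.
print(cell_size(70, 7))  # 10
```

So every mean that is split by sex rests on 5 people, and every mean that is collapsed across sex rests on 10.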

Let’s start with the aggressive acts priming, and judged responsibility of the victim.

Victim’s responsibility for rape

                          Aggressive outcomes   Aggressive acts   Control
Defendant stranger
  Males                   3,7                   3,4               2,33
  Females                 4,1                   3,4               2,7
Defendant acquaintance
  Males                   5,23                  2,5               2,47
  Females                 3,77                  2,7               3,73

                          Aggressive outcomes   Aggressive acts   Control
Victim resisted
  Males                   2,93                  3,2               1,37
  Females                 3,97                  2,33              2,3
Victim did not resist
  Males                   6                     3                 3,43
  Females                 3,9                   3,77              4,13

I have circled the two ratings that stick out (5,23 and 6) – both in the aggressive outcome priming, and both by the male group (5 individuals), one when the defendant is an acquaintance and one when the victim did not resist. Means here are above the half-way point (which they are not for the others). They also report that there are significant interactions between priming type, sex of subject, and each of those two factors (remember these are two different analyses).

Both F’s are actually the same: F(2,56) = 4.09, p < .05.

I’m wondering if that was a typo, though. I’m not sure what the likelihood is that the actual F value would be exactly the same.

But, with only 5 in each cell, who knows if this is due to one particular individual in that particular cell.

Conviction of defendant

When they report their analysis of conviction of the defendant, they actually separate the responses (ought to be convicted vs will be convicted), but collapse across gender. As the two measures are within subjects, this means that the cells are now 10 individuals.

Should be convicted
                         Aggressive outcomes   Aggressive acts   Control
Victim resists           9,3                   9,9               9,3
Victim does not resist   8,6                   9,6               9,5
Victim resists           9,4                   9,9               8,8
Victim does not resist   7,5                   9,1               8,3

Will be convicted
                         Aggressive outcomes   Aggressive acts   Control
Victim resists           5,4                   4,2               4,5
Victim does not resist   6,4                   2,6               3,9
Victim resists           5,5                   3,5               4,7
Victim does not resist   3,1                   3,6               2,7

Priming doesn’t do anything to the ratings of whether the defendant should be convicted, regardless of whether it is a stranger or an acquaintance.

They report two interactions for this – one 3 way, and one 4 way.

priming x acquaintance x resistance F(2,56) = 5.81, p < .01
Priming x acquaintance x resistance x type of judgment F(2,56) = 4,37, p < .05.

(Yes, I have a hard time understanding what is going on too.)

Priming aggression seems to not have had any discernible effects on the other two types of judgments.

Reading through the discussion, they are appropriately muted about interpreting the results. They make a little flag that this is possibly consistent with a just world. Being primed with aggressive outcomes resulted in higher ratings that the defendant ought to be convicted. But, as I keep harping on, 10 participants in each cell.

They do an interpretation about the “is she partially responsible” results – where the five guys together judged females who were raped by an acquaintance and did not resist the rape as more responsible. We have to recall that this involves 3 different scenarios, although each measure involves two. When the defendant is an acquaintance, there is one scenario where the woman resisted, and one where she didn’t resist. Likewise, when the defendant is a stranger, there is one scenario where she resisted, and one where she didn’t. One of these overlaps both judgments.

I don’t know what to make of it. I don’t think anybody should, considering how few participants there are.

Intimacy priming

Here they are comparing the 10 people who were primed with the couples to the control group.

They report a whole bunch of effects. First, how priming may have altered the perception of harm to the victim

                     Priming   Control
General perception   8,2       7,2       F(1,56) = 8,21, p < .01
Acquaintance         8         6,5
Stranger             8,3       7,8       F(1,56) = 4,51, p < .01

Means are overall higher on the scale for those that were primed.

It also seems to have increased the degree to which the participants thought the victim told the truth.

priming control
truth 8,5 7,7

They separate this out in men and women, as well as the circumstances, and report a significant effect. Men seemed to move around a bit here, but they report no means, just an F statistic: F(1,56) = 5,67, p < .02. The claim is that men who judged a victim that resisted a rape by an acquaintance (that is, one scenario only) did not show an elevated belief in truth compared to control. Got that?

So, in other words, the 5 men in the priming condition judged the degree of truth of one story more like the ones in the control group, but we have no idea by how much. Is it close to the 7,7 overall? Does it even make sense to parse it down like this?

Finally, victim’s responsibility (the conviction judgments did not yield any differences).

             Priming   Control
Males        3,6       2,4
Females      2         3,2       F(1,56) = 6,88, p < .01

             Priming   Control
Resist       2,6       1,8
Not resist   3         3,8       F(1,56) = 4,20, p < .05

I think what I want to point out first, is that all of the values are on the low end. In the text, they suggest that males judged the victim more responsible than in the control, but the reverse was true for the females.

Then, in the other analysis, they pool across gender to look at the effect of resisting, and they report it weirdly (although technically I can see it being correct). The primed rated the responsibility of the resisting victim higher than for control. The reverse was the case for judging responsibility of the non-resisters. Yes, technically that is true, but non-resisters are overall judged more responsible (or less non-responsible). But, can you even say anything with so few points? (I know, I keep harping on this).

Women as sex objects.

I actually have no idea how they aggregated this. They start with the three sex primes (sex object and two different sexual arousal ones) and then the control, which they put together in a 2 x 2 design. And, from this they find out that the sexual arousal only has a couple of effects that are independent of the sexual object so, as they say, “therefore, all results to be presented are independent of the effects of priming stimuli on sexual arousal”.

Did they throw out the nudie primed? Or what did they do? I suspect this, as the df is 1, 56 in their analyses, so what I present here seems to use only those individuals that were primed with sex-objects (and the controls).

There was nothing on perception of crime.

However, there were effects on belief that the story was truthful, and on responsibility: 4-way interactions between type of prime (object, control), sex of perceiver, whether rapist was acquaintance or stranger, and whether the woman resisted or not.

So here, for each data point, you have 5 individuals making two judgments.

First, truthfulness

Truth of testimony
                 Defendant stranger            Defendant acquaintance
                 Resisted   Did not resist     Resisted   Did not resist
Males
  Sex object     7,6        6,2                7,3        6,7
  Control        8,7        7,8                8,4        5,9
Females
  Sex object     9,2        9,3                9,1        8,2
  Control        8,1        7,4                7,2        7,8

One result I could possibly believe is that men and women rate the victim differently when primed with women as a sex-object in that the women tend to believe her more and men to believe her less than compared to control. Yes, things are moving around due to the acquaintance and resistance factors, but geez….

The inferential evidence is this 4-way interaction

priming x sex of subject x acquaintance x resistance F(1,56) = 7,45, p < .01.

They also found an effect – same type of 4 way interaction – for victim responsibility.

Victim responsibility
                       Defendant stranger            Defendant acquaintance
                       Resisted   Did not resist     Resisted   Did not resist
Males
  Sex obj prime        2,1        3,4                2,9        2,7
  No sex obj prime     2,4        2,6                1,2        3,7
Females
  Sex obj prime        2,1        2                  2          3,9
  No sex obj prime     2,5        3,6                3,1        4,2

Again, here is the 4-way statistic.

priming x sex of subject x acquaintance x resistance F(1,56) = 9,31, p < .01

What does this say? Aggregate of 3 ratings, but for 2 vignettes, and still only 5 participants in each point.

Possibly the most striking is that women that were primed held the victim very low on responsibility, except when it was an acquaintance and she did not resist. But, I am really not sure what the results from 5 women can say here.

The judgment of conviction yielded nothing. They reported an “uninterpretable interaction approaching significance”. The p-value was .10. I think we’re well satisfied calling that not significant. I’d say, there is nothing.

I kind of feel exhausted after having gone through this. It is, in a way, such a complex design, with the 7 different primes, the 4 different variants of stories, and the four types of questions where some were aggregated and others not. And, with 5 people in each cell, and no standard deviations anywhere, what can you say? Other than this is interesting to follow up on to see if these effects hold. The paper has been cited 61 times, so it is up for a forward trace.

But, as I mentioned, I’m not so sure these are primes rather than emotion inductions, and they differ from the verbal primes earlier.

I’ve gone through and noted the results in such detail, as I want to try to pull something together on these, but mainly I feel depressed over so much work done with so few participants.

Wyer, Robert S., Bodenhausen, Galen V., Gorman, Theresa F. (1985). Cognitive mediators of reactions to rape. Journal of Personality and Social Psychology, 48, 324-338.


Halo attenuation, and the availability of the letter “T” – furthering the trace.

The Halo attenuation paper is the first paper with reported Standard Deviations! And Eta Squared!

Kors I taket (Cross on the ceiling) as we would say in Sweden.

The n of each cell is also 40 (although I’m wondering if the correct n should really be 20, and the 40 is for a particular marginal effect, but I will get back to that).

We’re now up in 1984, and this is a paper on the halo effect, looking at whether you can use priming as a way of attenuating it.

The halo effect, if you don’t recall, is that individuals (and even things like companies, as one of my bachelor’s students found) that have some really good traits – they are nice, they are beautiful, they are successful – are also judged as more positive in other areas as well. The positive traits shine like a halo, and brighten everything around.

This can be a problem when we want to form accurate judgments, for example, in order to do fair performance judgments.

The halo-effect, they claim, has been stubbornly difficult to short-circuit in judgment situations, and they cite work where raters have been informed about the Halo-effect in various ways prior to making judgments, to no avail.

However, the authors have been impressed by Srull & Wyer’s work on how priming particular traits later alters the judgment of an ambiguous character – making the judgments assimilate towards the prime.

Perhaps, they think, if we first prime participants with some trait, they will then be more resistant to letting the halo-effect influence judgment on the trait.

The trait they choose is physical appearance. The “trait” (or what you should call it) that casts its halo is a teacher that has a warm personality and a lenient teaching philosophy, as opposed to a cold hearted bastard with a strict teaching philosophy.

The traits to be rated are liking, physical appearance, mannerism and accent.

As an aside, I find this order of work…interesting. I would think that the true problem with halo is when something like good looks makes people think performance is better or more desirable than warranted, but what they are testing is in some ways the other way around. I’d be happy if my warm personality and lenient teaching style would also make me look pretty, and if I was a cold-hearted bitch with strict rules, why would I care what you think about my looks.

But, no matter, it is an interesting question.

Those that are primed (81 out of the 161 participants) get to rate what they think about a number of physical traits. For example: “I find moustaches 1) extremely irritating ……8) extremely appealing”.

The to-be-judged materials were two videotapes from Nisbett & Wilson (1977) depicting a teacher either advocating a lenient teaching philosophy or a rigid teaching philosophy. I’m assuming they use the same teacher in both. He is definitely male.

Taylor et al decided to double down on the halo of these two dudes, so they created short vignettes to be read prior to watching the films. The first one of a warm family man, and the other of a cold grouchy lonely man.

These were all crossed, resulting in a 2 x 2 x 2 between subjects design. That is priming vs no priming, warm vs cold vignette and lenient vs harsh philosophy.

In their analysis, they eliminated all of those that got mixed messages (that is, warm vignette with harsh philosophy, or cold vignette with lenient philosophy). These are not reported at all. Of course, it is interesting looking at the double whammys, but I would have liked to see the more ambiguous/contradictory ones also.

Finally, their measure is one where they rate the teacher, on a 7 point scale, on how much they like him. After that they rate the teacher on physical appearance, mannerism and accent using that same scale as in the prime: 1: extremely irritating ….8: extremely appealing.

They also had a sub-set of participants do a memory test of appearance, vignette and videotape, to see if the priming influenced this (they looked at this as a more objective measure than the impression formation above).


In a nutshell, priming attenuated the positive halo effect for the physical appearance and mannerism. Nothing much happened for the negative descriptions, and nothing happened for the judgment of the accent.

           Unprimed   Primed
Positive   4,3        3,4
Negative   3,2        3,3

           Unprimed   Primed
Positive   5          3,9
Negative   3,6        3,6

           Unprimed   Primed
Positive   2,9        2,4
Negative   2,4        2,7

Priming x physical appearance: F(1, 157) = 9,35, p < .05, eta squared = .05
Priming x mannerism: F(1, 157) = 10,73, p < .05, eta squared = .05
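
As a rough cross-check on the reported effect sizes, an F statistic and its degrees of freedom are enough to recover partial eta squared, F·df_effect / (F·df_effect + df_error). The sketch below assumes partial eta squared, which may not be the variant the paper reports, and the function name is made up for illustration.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# The two interactions reported above, both with df = (1, 157).
for label, f_value in [("physical appearance", 9.35), ("mannerism", 10.73)]:
    print(label, round(partial_eta_squared(f_value, 1, 157), 3))
```

Both values come out at roughly .06, in the same small range as the reported .05.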

For some of the participants they also did a memory check – which they considered a more objective measure. Mainly they were interested in seeing whether priming also influenced how well they recalled the information in the vignettes and the films.

The priming of physical appearance increased memory for physical appearance, but nothing else. The effect was small: a 7.8 vs 7.2 score (out of maximum of 10, eta squared .6). I would think this is a bona fide priming effect too – have people rate what they think about various appearance things, later, when asked to recall appearance they are better at it (perhaps because they were primed to pay more attention to appearance). They didn’t remember the vignettes and the videos better (outside the appearance).


Here they used 40 participants in each group, which I would think is much more robust, from the vantage point of 2015. (Nelson et al had some rules of thumb of what effects can be detected with what sample size. I think this is touching it.)

But, this is not the type of priming effect that is described by Srull & Wyer. The vignettes and movies are not ambiguous. Instead they are very specifically positive or negative. The role of the priming is to try to regulate the halo effect – the warm and relaxed guy looks better than the cold and rigid guy, not because they look different, but because the warmth and coldness spills over. It seems to have done so in the positive account, but no evidence in the negative account. But, my suspicion is that there wasn’t any negative halo taking place there anyway, so nothing to attenuate.

Is this really of the same nature as pushing around ambiguity? I’m not sure. I’m not sure how to think about ostensible priming effects, although I like Andrew Wilson’s suggestion that it is some kind of canalization. Appearance is brought to mind, then, when judging appearance perhaps one is more likely to pay better attention to it. It wasn’t a big effect, but it was there.

I’ll stick this short paper (published next in the order) here. I’m not sure it really belongs to the part that extends. It primes, and does so subliminally, but it is not about perceiving people, but judging frequency of words with the letter T.

I suspect that it cites Srull & Wyer (and Bargh & Pietromonaco) because the research was done as an honors thesis under Russell Fazio.

Participants were presented with 2 blocks of 20 words that had the letter T in them. They were presented at 1/500 ms tachistoscopically.

At the end, they were asked to rate, on a 9-point scale, which one of two letters appeared more often.

For example: “Do more words contain T or S”. Anchors would be “Many more contain T” and “Many more contain S”.

The target comparisons were T against the letters D, M, P, R and S. (There were other comparisons).

The primed participants judged that T was more frequent (M = 5.25 vs M = 0.43), t(13) = 2.43, p < .05.

I’ll just end with ETA OIN SHRDLU

Taylor, Karen, Bernardin, H. John, Riegelhaupt, Barry J. (1984). Halo error: An assessment of priming as a reduction technique. Perceptual and Motor Skills, 59, 447-457.

Gabrielcik, Adele & Fazio, Russell H. (1984). Priming and frequency estimation: A strict test of the availability heuristic. Personality and Social Psychology Bulletin, 10, 85-89.


Higgins Bargh & Lombardi, more on the trace.

I’m skipping a couple of papers (just for the blog series, I will get back to them) to post about this one.

Higgins, Bargh & Lombardi (1984) is definitely one that is extending the priming literature.

Three words first:

5 per cell.


Yes, granted, they actually collapse over those cells, which then ups it to 15 per cell, but still.

Let me get back to the purpose, and how they go about the experiment.

What they want to do is to distinguish between 3 models that can account for the priming effect on categorization.

They consider two types first of all: The mechanistic model, and the excitation transmission model. The first is a very computational one (from Srull & Wyer), whereas the second is more electric. They subdivide the transmission model further into two; The battery model and the synapse model.

And, I think I will leave it there, because the models are perhaps not that important for what I’m trying to pursue. I like the idea that they are setting up models, and deriving alternative predictions that can then be tested, of course. None of those “differs from null” things here. But, I’m not entirely sure how well this ends up working in the end.

Instead, I think I will focus on what they did, and what the results are. I’m a bit Nassim Taleb inspired here. Theory/schmeory, look at the damned phenomenon.

In this work, they think the crucial dividing point between the models is whether something has been frequently primed or more recently primed.

The Srull & Wyer work suggests that frequency matters. So far, nobody has really looked at recency, although by squinting enough, one could think the Fazio et al study, with the puzzle placed in the 7th position (which due to duplication becomes the 7th and the 17th position over 20 presentations), could possibly be considered a mild recency effect – but then again, I’m not sure that effect actually happened.

So, how do they go about this?

The general template is the Donald paradigm: a priming sentence-unscrambling task, followed by judgment of an ambiguously described individual.

But, they didn’t want to have just one ambiguous trait dimension. They wanted more, to see if the effect generalizes. So, they created ambiguous stories that could be either independent/aloof, or adventurous/reckless or Persistent/stubborn. This is actually not analyzed, so I’m not sure what happened. For simplicity’s sake, I’m using the Adventurous/reckless example to describe the priming manipulation.

The idea here is to see whether the more frequently primed, or the more recently primed construct will influence the subsequent judgment of the ambiguous character. And, of course, the frequently primed and the recently primed need to have opposite valence. That is, in the Adventurous/Reckless example, positive synonyms for adventurous are presented more frequently (Bold, courageous, brave) whereas a negative synonym for reckless is presented as the last prime (Foolhardy, to pick one of their synonyms). And, vice versa. It is perfectly nicely crossed.

The priming task was a sentence-unscrambling task, 4 words presented on the screen, same specification as in the original Srull & Wyer. Participants are to say their sentence out loud.

First they go through two 20 sentence practice trials (they don’t know it is practice). Then they go through the 20 sentence priming trial. The 7th, 12th and 15th sentences contain the synonyms for one of the valences, and the 20th a synonym for the opposite.

Once they are done with this, they are asked to count backwards by threes from some large number for either 15 seconds or 120 seconds. This is an interference task. The delays are selected so that they can distinguish between the models.

Finally they are presented with what they are supposed to judge, and the method here is actually – ambiguous.

They are presented with a series of ambiguous descriptions that they are supposed to label with one word (written). In the first series they get descriptions of animals, and in the third they get the description of an individual that behaves either adventurous/reckless or any of the other combinations.

The description is very unclear. They talk about series, and I can’t make out whether all participants get to label all of the ambiguous persons, or only the one particularly fitting the prime. I think the latter makes more sense (I’m not sure the priming would work across like this), which means that, again, this is a one-shot measure.

This is all the involvement of the participants (they do the probe for suspicion, and get rid of 3, which they then replace).

The labels are then rated by judges as to how synonymous they are with the primed traits on a 6 point scale. A one indicates that the word coincides with one of the negative synonyms (“same as negative alternative construct”) and a 6 that it coincides with the positive.

So, get that? An additional layer of judgment, done by others.

So what do we have here, design-wise? We have two types of priming (positive frequent/negative recent vs. negative frequent/positive recent). We have 2 types of delay. And we have 3 types of traits. 12 cells. Five in each.

All of this is thrown into an ANOVA, but only the 2 x 2 is reported.

                      Brief   Long
+frequent/-recent     3.1     3.4
-frequent/+recent     4.8     2.9

The interaction here is significant: F(1,48) = 4.84, p < .05

They then look at how often participants classify the ambiguous person using either the recent construct or the frequent construct (they throw in ambiguous also, because I guess not everyone was that clear in their labeling).

            Brief   Long
Recent      21      11
Ambiguous    1       3
Frequent     8      16

They test this with a chi-square: χ²(N = 60) = 6.79, p < .05
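The reported chi-square can be checked from the counts alone. Here is a minimal sketch, assuming an ordinary Pearson chi-square on the 3 x 2 table of counts (the function name is my own, and I don't know what software the authors used):

```python
import math

def pearson_chi_square(table):
    """Return the Pearson chi-square statistic and its degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of label type and delay.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Counts as given in the post: rows are recent, ambiguous, frequent;
# columns are brief delay, long delay.
counts = [[21, 11], [1, 3], [8, 16]]
chi2, df = pearson_chi_square(counts)

# With 2 degrees of freedom the upper-tail probability is exactly exp(-x / 2).
p_value = math.exp(-chi2 / 2)
print(f"chi2({df}) = {chi2:.2f}, p = {p_value:.3f}")  # chi2(2) = 6.79, p = 0.034
```

The statistic matches the reported 6.79. Note that the ambiguous row has expected counts of only 2 per cell, which is thin for a chi-square approximation.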

Looks like recency wins over short delays. Over long delays the counts lean the other way, towards the frequent construct, which suggests a reversal, but I’m not sure how solid that is.

This supposedly discriminates between the three models (only the synaptic model would predict this pattern they say. I won’t evaluate that claim).

What I’m much more baffled by is the very low N. Even when collapsing over the different trait types, there are only 15 individuals in each condition. There are many places for uncertainty and noise to creep in, and I’m not sure how replicable this is.

Methodologically I think it is interesting, even the reasoning is interesting. I think there are parts in here that are, well, open for intrusion so that results may not be as robust.


Leadership categories, prototypes, and a failure to prime.

Lord, Foti & De Vader’s paper is more… well, inspired, in part, by Srull & Wyer, especially the third experiment.

The first two are much more interested in understanding how we think about leadership, and more precisely, leadership categories in a Rosch manner. In the first experiment, they simply have participants list features that they think a good leader should have, and then they analyze these features to understand whether leadership categories are related through family resemblance, and which features seem to have cue validity (if this feature is present, it is probably a leader).

In the second experiment, they are interested in looking at the accessibility of features that are either leadership related, neutral, or anti-leadership related. They create a questionnaire that they call the “Akron Leadership Questionnaire” or ALQ. It consists of 25 two-word items, such as “Emphasizes goals”, “makes jokes” and “neglects details”. I assume that the first is congruent with leadership, the second neutral, and the last incongruent. Participants are asked to respond to each using one of 5 computer buttons that correspond to Likert-scale type categories (Not at all well to extremely well). The researchers are particularly interested in reaction time, as quick reaction times likely mean that this particular description is highly accessible, and would give a cue as to the underlying category structure of leaders.

I’m not going to discuss the results, because they are not of the main interest here. But, I described the task in some detail, because most of the participants in this experiment also participated in experiment 3, and these participants were considered primed with the leadership concept.

In the third experiment they are investigating what it is that makes someone perceived as a leader. They propose two mechanisms: the first is how well the described individual matches a prototypical leader, and the second is whether the prototype of leader has recently been brought to mind, in the way that Srull & Wyer brought hostility to mind via their sentence unscrambling task.

To investigate this, Lord et al created three short vignettes of a manager, John Perry. The vignettes were either prototypical, neutral or antiprototypical of leadership. (They dipped into experiment 1 to construct these).

After having read one of these vignettes participants got to rate John Perry on the following:

His “contribution to store managers’ effectiveness”

“His influence in determining the new product’s success”

“His leadership exhibited”

“His desirability as a district manager”

They also rated how often they thought John Perry would engage in each of the two-word behaviors from the ALQ

The participants that rated were most of those that had done the ALQ rating in experiment 2 (primed condition) plus an additional 34 participants who did it without the priming.

The main finding for us is that priming did not do anything to any of the dependent measures.

I thought I should, maybe, speculate why this is. As with all the earlier work, the n/cell isn’t high. There were 61 participants in the priming condition, but as this was divided into three vignettes (one shot) that leads to about 20 participants in each cell. Even fewer for the non-prime – about 11 per cell.

This, so far, seems to be the standard.

But, should we really expect a priming effect? In the earlier work, the ostensible effect (robust or not) is that the prime biases the judgment of information that is ambiguous. When we can’t make sense of things in and of themselves, what has been brought to mind earlier will influence how we judge them. Hence, we have the more hostile Donald. But, of these three vignettes, only the neutral could possibly be considered ambiguous, or at least not displaying any particular cues as to whether John Perry is a good leader or not. This should be the only place where you would see an effect of a prime biasing responses.

Also, in all the earlier priming work, the prime (or variants of primes) is designed so that it should be able to bias the subsequent responses. I’m not sure that responding to the ALQ can be considered a biasing prime. It contains both prototypical and anti-prototypical leadership behaviors. If anything, the prime could possibly have narrowed standard deviations (the concept of leadership is already activated, so the participants don’t have to create one ad hoc), and possibly sped up responses – although they don’t measure it.

It is mildly interesting that the exposure doesn’t seem to have an effect, but there isn’t much more that can be said.

Lord, Robert G., Foti, Roseanne J. & De Vader, Christy L. (1984). A test of leadership categorization theory: Internal structure, information processing, and leadership perceptions. Organizational Behavior and Human Performance, 34, 343-378.


Will funny TV prime kids with Funny?

The first paper looking at non-college students, in fact young children, turns up in 1983. Byron Reeves and Gina Garramone tested whether exposure to a TV program could prime traits that are then used to judge another character – very much in line with the work of Srull & Wyer and all the others.

The participants were kids in 2nd, 4th and 6th grade.* Two classes for each. One was the experimental class, and the other the control class.

What they wanted to prime was “Funny”. First, they put together a 10 minute video with clips from “prime time syndicated situation comedy programs”. The clips had been rated by other kids on how much they made them laugh, and how funny they thought the characters were.

Then, they created a vignette of Andy which was ambiguous as to how funny he was, although the situations described were ones where he could have been funny. For example “Later in the day, Andy’s class went on a field trip and Andy made jokes on the bus”.

The experimental classes got to see the film, and then they read about Andy. The control classes only read about Andy. Then they rated him on 25 traits, using a scale from 1-4.

Target traits, they claim, were Funny, Attractive and Strong. I’m not sure how they came up with this. I certainly buy Funny, but wonder if they did a bit of exploring to find effects for both Attractive and Strong also.

Overall, collapsed over grade, priming did not result in any difference. Neither did class. But, there were interactions for the above three traits. Mainly, it looks like there was a lot more variance between the grades in the control condition than in the experimental condition. The second graders especially tended to rate Andy as more funny, attractive and strong in the control condition than in the experimental condition. The 4th graders don’t seem to differ much, whereas the 6th graders tended to go in the other direction from the 2nd graders, but the differences are not large.

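Mean ratings by grade (the three pairs of columns presumably follow the order funny, attractive, strong):

        Funny                     Attractive                Strong
Grade   Control   Experimental    Control   Experimental    Control   Experimental
2nd     3.6       3               2.67      2               2.6       2.2
4th     3.1       3               2.18      1.9             2.25      2.12
6th     2.95      3.27            1.75      2.16            1.87      2.29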

The cell sizes ranged from 19-25 (whole classes), so mainly in range of what has been done before.

Reeves, Byron & Garramone, Gina M. (1983). Television’s influence on children’s encoding of person information. Human Communication Research, 10, 257-268.


The two next papers in my Srull & Wyer trace: Carver et al (1983) and Fazio et al (1983)

Onward to two more papers in the trail of Srull & Wyer citations.

The first one is the easier one: Carver, Ganellen, Froming and Chambers’s probe into Modeling and Category Accessibility. Modeling here is not related to SEM or Neural nets, or posing in front of a camera, but to Observational Learning, where the model is the person we emulate.

It consists of 2 experiments where the first uses the Donald Paragraph*, and the second uses the sentence priming task from Srull & Wyer.

In experiment 1, participants are first exposed to a video of a businessman and his secretary (ungendered, but one can guess). In one video, the businessman was hostile and derogatory, in the other he was neutral.

Then they judged toned-down-for-Florida Donald, on the same scales as Srull & Wyer.

Originally, they intended to also look at gender-differences, so they recruited 20 males and 19 females (appears that one guessed the connection and got dropped), but they found no gender differences. They did find a model difference, as predicted.

                        Hostile   Neutral
Descriptively related   38.83     35.14     F(1,74) = 5.58, p < .03

Nothing for the evaluatively related (interesting).

Experiment 2 reads like a mash-up between Milgram and Srull & Wyer – minus the scientist demanding obedience.

The ostensible task that the participants are doing is a learning task. They will be the teacher, and their job is to administer electric shocks to the learner when he makes a mistake (the learner is a confederate of undisclosed gender, but I don’t think “he” is a stretch). The shocking apparatus has 10 settings, and the participant gets to experience and rate the intensity of the shock. The instruction to the participant is to teach the problem to the learner as effectively as possible.


Then, oops, a poor masters-student would like to get some help (this is definitely a she), could they please help?

This task is the sentence-unscrambling task from Srull & Wyer. There are 30 items, and the mix is either 80% hostile or 20% hostile.

They then proceed to the learning task. There are 34 trials, and the confederate is mistaken on 20 of these. Yes, like in Milgram, no shocks are administered. I presume that, unlike Milgram, there are no shrieks of pain.

In a debriefing, participants did not realize that the priming and the shocking had anything to do with one another, but one participant didn’t believe he had administered any shocks, so he was eliminated from the analysis.

The Dependent measure was, of course, the average shock-intensity, with the predicted effect that the 80% mix would administer stronger shocks.

It came true!

80% hostile   20% hostile   t                        Cohen’s d
3.31          2.24          t(29) = 2.24, p < .05    0.82

*In what was presumably a pre-study, they found that the University of Miami students perceived Donald as more hostile than the University of Illinois students did, so they softened and/or deleted some elements to make it more ambiguous.

The second paper, by Fazio, Powell & Herr is a lot more… complicated.

Fazio is Mr. Attitudes, and attitudes are what he has explored during his career. (I figure I should disclose again that he was one of my professors, and I took his attitudes class).

The question they are exploring here is whether being exposed to an attitude object will influence subsequent judgments of something that is unrelated.

To back-track a bit, an attitude is considered an evaluation of an attitude object. To make it more concrete – I like my iPhone, so my attitude is positive. I loathe ketchup, so my attitude is negative, in fact very strongly so. But, for lots of things that we are at least somewhat familiar with, we do have this kind of mild positive/negative evaluation. We like or dislike them.

Now, if I’m exposed to something that I have either a positive or negative attitude towards, quite incidentally, could that possibly bias me so that the judgment I make of an ambiguous person will be more like my attitude? Or, would exposing me to ketchup drive me to judge Donald as more Hostile?

That is a rather oblique chain.

Their first experiment is a conceptual replication of Srull & Wyer. They simply need to find out if priming with evaluative adjectives could bias judgment in a person perception task.

The priming task was the Color task described in the earlier Fazio study (which I believe is adapted from Higgins et al 1977). The ten pairs were presented twice (as they had been in the earlier study also).

They created four conditions:

a) Positive applicable, b) negative applicable, c) positive non-applicable, d) negative non-applicable.

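The selected prime-words are listed below.

[Table of prime words, one column per condition: positive applicable, negative applicable, positive non-applicable, negative non-applicable]
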
The ambiguous story was about Ted, a high-school student waiting for his ride who is then asked to participate in an experiment where he solves a number of problems.

Participants are then asked to indicate why they think Ted participated by rating the following on a 0-10 point scale:

  1. In order to earn the extra money
  2. To have something to do while waiting for his ride
  3. Because he liked and was interested in the experimental task.

What they were particularly interested in was whether the prime could move around the judgment of the third reason – the intrinsically motivated reason.

They summarize the three ratings in a somewhat convoluted way. They take the mean of the first two (the extrinsically motivated reasons), then subtract the rating of the third. In the resulting index, the lower the number, the more the participant attributed Ted’s participation to intrinsic motivation.
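To make the index concrete, here is a minimal sketch of the arithmetic as described. The function name and the example ratings are made up for illustration; they are not data from the paper.

```python
def motivation_index(money, pass_time, interest):
    """Mean of the two extrinsic ratings minus the intrinsic rating.

    Each rating is on the 0-10 scale, so the index runs from -10 to +10.
    Lower values mean a more intrinsic attribution.
    """
    return (money + pass_time) / 2 - interest

# Hypothetical rater who thinks Ted was mostly in it for the money.
print(motivation_index(money=8, pass_time=6, interest=3))  # 4.0

# Hypothetical rater who thinks Ted simply liked the task.
print(motivation_index(money=2, pass_time=4, interest=9))  # -6.0
```

An index near zero therefore just means that the extrinsic and intrinsic reasons were rated about equally.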

The results are weak. The mean rating does not reach significance, although the pattern looks like they had hoped. The reason, they speculate, is that the standard deviation in each cell is very high. There are 15 individuals in each, so maybe not surprising it is unstable and non-significant.

Positive applicable   Negative applicable   Positive non-applicable   Negative non-applicable
1.567                 3.200                 3.033                     3.033

Mean standard deviation: 8.57. Cohen’s d for the applicable primes: .19
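The d of .19 follows directly from those numbers. This is a minimal sketch, assuming that 1.567 and 3.200 are the positive and negative applicable means and that the mean standard deviation of 8.57 is used as the standardizer (that is how I read the figures; the function name is my own):

```python
def cohens_d(mean_a, mean_b, sd):
    """Standardized mean difference using a single common standard deviation."""
    return (mean_a - mean_b) / sd

MEAN_SD = 8.57  # the mean within-cell standard deviation given above

# Applicable primes: negative minus positive.
d_applicable = cohens_d(3.200, 1.567, MEAN_SD)

# Non-applicable primes: the two means are identical.
d_non_applicable = cohens_d(3.033, 3.033, MEAN_SD)

print(round(d_applicable, 2))      # 0.19
print(round(d_non_applicable, 2))  # 0.0
```

A difference of about 1.6 scale points against a standard deviation of 8.57 is a small effect by the usual conventions, which fits with 15 per cell not being enough to detect it.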







               Positive applicable   Negative applicable   Positive non-applicable   Negative non-applicable
Above median   4                     10                    9                         6
Below median   11                    5                     6                         9

So, instead, they do a median split, and count the number of participants above and below the median, as you can see above. They find an interaction between applicability and valence using a non-parametric analysis, and in the applicable condition, they find a significant difference between the priming conditions.

Just for laughs I stuck the applicable and the non-applicable in separate Chi-square analyses, and found the first one significant, but not the other one.
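For anyone who wants to redo that check, here is a minimal sketch using the shortcut formula for a 2 x 2 Pearson chi-square without continuity correction. It assumes the median-split counts are 4/10 above and 11/5 below for the applicable primes, and 9/6 above and 6/9 below for the non-applicable ones; the function name is my own.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom the upper-tail p-value is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Rows: above / below median. Columns: positive / negative prime.
applicable = chi_square_2x2(4, 10, 11, 5)
non_applicable = chi_square_2x2(9, 6, 6, 9)

print(f"applicable:     chi2(1) = {applicable[0]:.2f}")      # 4.82
print(f"non-applicable: chi2(1) = {non_applicable[0]:.2f}")  # 1.20
```

The applicable statistic clears the 3.84 cut-off for p < .05 and the non-applicable one does not. With cells this small the choice of test matters: applying a Yates continuity correction to the applicable table gives about 3.35, which no longer clears it.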

They admit that this is disappointingly weak, but proceed to the next task anyway.

Experiment 2.

Here the connections are even more stretched – and this is the main experiment.

They really didn’t want to prime with attitudes, but to prime with an attitude-objects; simply exposing someone to this object to see if that changes the rating (as in the ketchup-Donald suggestion).

But, our attitudes can be somewhat idiosyncratic, of varying strengths and really difficult to control in an experimental setting, so they do what they have done in lots of experiments since: they create and manipulate attitudes.

Participants are presented with 5 puzzles. These are presented in two versions. For 1/3 of the participants (that is about 37 or so) the worksheets are not filled in, and they are asked to work on solving the puzzles. The remaining 2/3rds (about 75, to fill up to 112 participants) get the same puzzles, but these are now solved, and they get to listen to a tape that explains the puzzles and how to solve them. These two conditions are the direct experience and the indirect experience conditions.

After they have been exposed to the puzzles, they get to rate each puzzle on a -5 to +5 scale, where -5 means extremely boring, and +5 extremely interesting.

For half of those in the indirect condition the experience doesn’t stop here. No, they also get to repeat their ratings of the puzzles twice. This is done under the guise that the experimenter needs some help with the data-entry and getting the ratings to a professor.

So, here we have created three types of attitude-formations:

Direct experience, 1 repetition of explicit attitude

Indirect experience, 1 repetition of explicit attitude

Indirect experience, 3 repetitions of explicit attitudes

One can think that the attitude would be stronger in the first and 3rd condition than in the middle condition.

In the next step, the priming takes place, and the individual priming is in part tailor-made as follows.

The experimenter is presented with the participant’s favorite and least favorite puzzle. The participant is then randomly assigned to either a positive or a negative condition. The priming task is then created with the selected puzzle in the 7th position of that same color priming task.

Now, the participant will be primed either by her or his favorite or least favorite puzzle.

Finally, they get to rate Ted, who has now become even more ambiguous as to what motivates him to participate in the experiment (incidentally, of course, it is a puzzle experiment). They add an open-ended question also, which they simply correlate with the other ratings.

The ratings are combined as in experiment 1, and here are the means.

            Direct experience,   Indirect experience,   Indirect experience,
            1 repetition         1 repetition           3 repetitions
Positive    -.309                 .063                  -.170
Negative     .285                -.234                   .366

Notice first of all that all of the means sit close to 0. Remember that the scale goes from 0-10, and that the intrinsically motivated score was subtracted from the mean score of the two extrinsically motivated scores. This means that the extrinsic and intrinsic ratings were all very similar. I’m not sure one can fruitfully compare to experiment 1, as so many of the numbers are embedded in one another, but the means were much higher there, and there seems to be much more variation.

The reported results are marginal: The main effect of puzzle valence: F(1,106) = 2.72, p = .11

Condition x valence interaction: F(2,106) = 2.87, p = .06.

But, let’s look at the means to see what they are after. The idea is that you may be able to have a stronger reaction if the attitude is strong. This is not the same as extreme. It is just strong. It will be reliably and quickly evoked in whatever direction. And, a way to create a strong attitude is to either interact with the attitude object, or to repeat one’s attitude several times. That is the case in the two conditions on the flanks. Those primed with positive attitude objects end up having a negative score, meaning that they attributed Ted’s behavior more to intrinsic than extrinsic motivation, whereas when primed with negative objects, the reaction is reversed. It is as if the evaluation of the object kinda spills over into the evaluation of puzzle-solving Ted (he does it because he likes it). The stats for the direct experience are t(106) = 2.01, p < .05, and for the repetition t(106) = 1.84, p < .07.

In the weak attitude condition, the pattern goes the opposite way, although I’m not sure why that would be the case: t(106) = 1.01.

Then again, none of this is very strong evidence. I keep wondering (now) if they are chasing noise.

In both these papers, there are never more than 20 people in each cell, measuring effects that really should be thought of as weak, and which are showing up as weak also. I’m not sure how to put this all together.

Carver, Charles S., Ganellen, Ronald J., Froming, William J., & Chambers, William (1983). Modeling: An analysis in terms of category accessibility. Journal of Experimental Social Psychology, 19, 403-421.

Fazio, Russell H., Powell, Martha C., Herr, Paul M. (1983). Toward a process model of the attitude-behavior relation: Accessing one’s attitude upon mere observation of the attitude object. Journal of Personality and Social Psychology, 44, 723-735.
