A sad story about the students who plagiarized.

A long time ago, in a little country tucked under the polar circle, there were two students, let’s call them Pat and Sam, who were doing a research project for a class. The level doesn’t matter; let’s just say they were far enough along that one ought to know better.

The project stretched over half a semester, culminating in a research report. Four days prior to the deadline, I, the course leader but not the advisor, got e-mails from both Pat and Sam as well as their Advisor. Advisor said the project was not in good enough shape to turn in and recommended waiting. (In this little Nordic country, there are always second, and third, and fourth, and fifth chances to turn things in. The only time you are restricted is if we stop giving the course.) Pat and Sam did not want to wait. I, their intrepid course leader, spoke to Advisor, who apprised me of the situation. But, as we both knew and agreed, students can turn things in if they want to. We can only recommend against it. Which I told the students.

Come morning after the deadline (Pat and Sam have dutifully turned in their paper), Advisor e-mails me and tells me that their paper looks an awful lot like the introduction of Advisor’s own paper, which was given to Pat and Sam as inspiration. Said paper is cross-disciplinary and under review (not yet accepted, and not published).

At a rather cursory glance, I first compare a highlighted chunk, several sentences long, which looks like a slightly re-written passage from Advisor’s paper. A little different in wording, but basically the same content. Also, nothing like “as mentioned by Advisor in the not yet published paper” to indicate where they had gotten their ideas. (The paper is listed in the references, though.)

As I scroll down, I start being able to predict what will come next in Pat & Sam’s paper from what is in Advisor’s paper. Pat and Sam also refer to research that belongs to the other cross-discipline, which really has very little to do with their actual project.

On top of this, there is an e-mail exchange between Advisor and Pat & Sam, prior to turning the paper in, where Advisor tells them that their intro now looks awfully close to Advisor’s intro, and to make sure that they do not plagiarize. Sure, Pat and Sam answer. We’ll fix that. We just want to know we are on the right track.

This little Nordic country has very stringent rules about suspected plagiarism. As soon as there is a suspicion, it should immediately be turned over to the disciplinary committee, for them to look it over, determine whether there really is plagiarism, and, if there is, determine what the consequences are. The consequences can be relatively mild (from a US perspective) – a couple of months where you are not allowed to be at the university, which means you may miss exams and compulsory course components.

We, as teachers, or even the chair, are not allowed to take any action. In fact, we shouldn’t even speak much about it once the suspicion is there, but just turn it over. That is not very psychologically astute in some ways. We did talk, but really because both Advisor and I needed to get used to the idea (we had never encountered this before), and to feel that we were backed up and that we knew how to behave.

I found out, for example, that I’m not allowed to fail students on the grounds of plagiarism. That is from the top down, in the laws surrounding university education. That is because it is seen as so serious that it is immediately taken out of our hands.

I had some thoughts that it would be nice to have the option to just fail and scold in some instances – letting the students know how much trouble they are in, but that this time they will only get the fail, and a re-do.

But, I can see how arbitrary that can become. In some ways it is nice that it is taken out of our hands. That the law says we are obligated to report this – we have no choice.

In this case, it is panic plagiarism, because why on earth would you plagiarize your advisor, especially after having been told that it looks too close? But, really, Pat and Sam are too far along to claim that they did not understand that what they were doing – borrowing a structure and just slightly re-writing – is not permitted.

It feels sad. And I put it here, possibly, as a kind of warning. I always mention that we have policies on academic misconduct, usually casting them as a protection for those who do the right thing (which they are), rather than as a “we’re on to you, you cheating bastards”, because I recall feeling slightly offended by that. One of my friends says that too. I also link to the site with the clear policy on what constitutes misconduct, and to another tutorial site on how to avoid plagiarizing. But having the didactic story, the personal anecdote of the very sad tale of Pat and Sam who thought, in a panic, that they could borrow the structure of a paper to get a paper in, may be more effective – storytelling animals that we are.


I wrote the above a long time ago, when it had just happened, but didn’t want to post it close to the event. I’m a public person, and students can read my blog. I don’t want them to walk around wondering which of their classmates it might be, so I delayed until it could be none of them.

The fallout was that the disciplinary group deemed it plagiarism (phew). They got a warning, and the excuse for this was that they had studied abroad (in decidedly western countries), and thought that delaying turning things in would mean getting kicked out of the program. (Hmmmm – the repeated chances are not a secret – it is Swedish law – but I am telling the powers that be that evidently we have to repeat this over and over.) And, they didn’t realize that they needed close contact with their advisor. Hmmmmm. I’ve advised a lot of people, and they have been wonderfully good at scheduling themselves. In fact, the classmates that I advised had no problem getting in contact with me. It is just kind of typical of the disciplinary group that the students get a lot of benefit of the doubt. But, I thought just having it recognized as plagiarism was a win. It was close to the outcome I wanted anyway – it is plagiarism; now, go rewrite.

But, one thing that gave me pause was that, evidently Pat and Sam thought what they had done was OK – that it wasn’t plagiarism.


On Ekman And Friesen, and methodological critiques.

Jim Coan put out a long post on Negative Psychology the other night.

Others can comment and debate the rest of it, but one thing caught my eye in his defense of the perhaps wild, not methodologically sound work that we admire: Ekman and Friesen’s early work.

My eyebrows kind of arched here, as this is something I have followed for a long time.

The Ekman/Friesen work and its legacy are part of a very, very long-standing conflict in psychology, which you can find echoing in publications to this day. It is every bit as acrimonious as the current social psychology skirmish, and rumour has it that Ekman and Russell cannot be in the same room. The conflict breaks down – roughly – between the categorical/universalist and the dimensional views, with Russell the proponent of the dimensional and Ekman the categorical. Neither view was new when they were debating this in the ’90s. Darwin’s book on expression is very categorical (and the methods he used were rather similar to those used by Ekman). Ekman came out of Silvan Tomkins’s work on rehabilitating the importance of emotion. The dimensional account was present with Schlosberg, and Osgood and Suci, in the ’40s and ’50s. (Emotion Review had an interesting account by Phoebe Ellsworth on the beginnings of Ekman’s pancultural research, and which faces were used.)

As I was reading these accounts for my graduate research, it was clear that neither the dimensional nor the categorical view won. Both are very useful conceptualizations, depending on what you look at. I always felt that Russell didn’t quite have the evidence on his side (possibly because it was not yet clear how to find good evidence – or because I started out with a more categorical conceptualization), but I always thought he did a good job poking at the areas of weakness in the categorical/universalist account. Also, Alan Fridlund aimed very pointed critiques at Ekman in accounts that I’m very sympathetic towards, but which held the echoes of an unspoken conflict in their writing.

This is very much the point Hull makes when he says:

Scientists rarely refute their own pet hypotheses, especially after they have appeared in print, but that is all right. Their fellow scientists will be happy to expose these hypotheses to severe testing.

These days, it is not so much Ekman and Russell, but Feldman Barrett (for the dimensional side) against – I don’t know. It seems that outside the emotion researchers the categorical view has taken hold as canonical, but inside there is a much more nuanced view. The point is that the articles about this are still dripping with the conflict, derogating straw-persons of the opposite view, to the point that I actually have a very hard time reading them because I get too upset. My firm belief is still that both views are important, and perhaps there is a need for a different kind of conceptualization – more dynamical – which I think in part is out there.

In his intro, David Hull speaks of the science myths that are used as evidence for some kind of scientific practice, and how, being myths (that is, not true), they cannot be used as actual evidence for how science is done.

The Ekman and Friesen work is undergoing, to this day, the same kind of negative psychology that Jim objects to. It just has not been done on social media. It was done in articles and books, and the conflict is evident.


To Sivilise or not to Sivilise

We had a brief exchange on Twitter (James Coyne, Tom Hartley, Helen South, and me, with some comments from Keith Laws and Mark Bolstrige) on rudeness and civility.

The occasion is, of course, the current tribal clash between those who think science needs fixing and those who think – well, as that is not my particular ingroup, I will (in the name of civility, and demonstrating my working knowledge of social psychology) refrain from labeling them.

A lot of the discussion/debate has not been about the research, but about etiquette and behavior (some questions about the research have been brought up also). There have been charges of bullying, of defamation, of snark, of unseemly behavior. I’m not sure if I should bring the popcorn, bring the swords, or flee in horror.

I have very conflicted thoughts about calls for civility or allowing the rude. I believe in freedom of speech. I think it is extremely important for democracies to function. And, I think you need to tolerate a bit of rudeness and heat and snark. Snarking is fun; being at the receiving end makes you realize it is a boojum, but, fair is fair.

But, there is also a downside, in that people really vary in how much of this they can tolerate.

I’m going to try to illustrate. I’ve been hanging around a forum for over a decade that was started by a woman who is extremely polarizing – and she knows it. She is very smart, very outspoken, and not afraid of taking the fight. People either love her or hate her. Those who love her overlook the more prickly parts (me); those who hate her work on pushing her out, either by banning her or by moving elsewhere (I have seen both). The pushing is extremely nasty, and upsets me as a form of bullying, but she is also capable of giving as good as she gets. She also doesn’t hold a grudge.

The forum has a contingent that likes to argue politics. Participants there openly say they enjoy the nasty, ad hominem fighting, and as the sides keep coming back for more, I believe them. I don’t understand that mind-set, because I don’t enjoy it (and I make good use of the scrolling function).

One of the “rules” on this forum is that you don’t get to be the tone police. You either deal or leave. And one way to deal, if you are like me and don’t like the pummeling matches, is to just not join some of the discussions, and to stay rather measured in others. So, in many ways, the voice of the meek may be much more filtered than that of those who bluster.

This is not entirely wrong, though, as cyberspace sometimes gives the illusion of privacy, when it actually means shouting to anybody with a device and access to a modem (as I figure John Bargh realized a couple of years ago). There may be things you don’t want to leave as a legacy for people to rub your face in later. (I figure as this new online record stuff keeps going, we will learn to deal with that shit too. It is not like having no privacy is a new state of being for humans.)

The blustering is, I believe, a means of defeating the other. I wrote about this last year in “welcome to the monkey house”. Because I’m meek, that kind of behavior would silence me. Not the thinking in my head, but I would most certainly not voice my beliefs and opinions in the face of that. Now, others are willing to step in and bluster, so I don’t really have to. But, it then becomes a forum for the pugilistic, and the rest is silent.

The willingness to bluster and attack is also not particularly correlated with being… right. It is a means of trying to assert power, and it can be used against those doing bad science, those doing good science, those critiquing one’s hypotheses, people being uppity, researchers pursuing explanations for something that we don’t like. It is used by anti-vaxxers and pro-vaxxers, by creationists and evolutionists, by republicans and democrats, by bros and feminists, and on and on and on.

It is much more tribal than anything.

But the shushing and the tone-police can also, insidiously, be used to silence. To keep control. To keep status quo.

I really don’t know a good way out (by which I mean, some solution where I feel reasonably comfortable without having to grow armor, which it seems I’m incapable of anyway). My suspicion is that this simply can’t be done, and that the arguing is needed.

Right now, I’m simultaneously reading two books (I know, multitasking, but I’m just too scatter-brained to focus) – one is Christopher Boehm’s “Hierarchy in the Forest” and the other is a re-reading of Hull’s “Science as a Process”. Both are very evolutionarily based. What Boehm is investigating (and then continued to investigate in his later book Moral Origins) is the curious human penchant for egalitarianism. When you look at the anthropological record, bands and tribes are egalitarian (at least among men). There is no alpha, no boss, no top bully. This is very odd, as our nearest primate relatives are highly hierarchical, with a male bullying himself to the top. This suggests that the evolved legacy for humans would also be a tendency to hierarchy. But, for humans, across a great many ethnographies, there seems to be an effort to suppress this kind of striving. What Boehm proposes is that, among humans, those with less drive to dominate have tended to band together and suppress that tendency in any individual. This is done through rather quiet but coordinated means: snark, pointed disobedience, gossiping, and cutting down anybody who tries to be an upstart. This is part of the moral code. Those who don’t abide may be shunned or ostracized, and in extreme cases executed.

It is not entirely comfortable reading this. It doesn’t seem to be an easy, comfortable egalitarianism where I’m OK and You’re OK, but a case of constant monitoring and vigilantism against anybody trying to lord it over anybody else – but done very quietly. Both comforting and uncomfortable to me, because I know what it is like to be a target of the “don’t think you are anything special” growing up. In the egalitarianism (nobody is the boss of anybody) there is also a strong push for conformity. But it is also interesting in that it means there is a need to negotiate and discuss, and the possibility of more freedom (at least from domination) than in a hierarchical species – or a hierarchical human society.

Hull, of course, advances an evolutionary view of science in his book. I was reading his introductory chapter as the skirmishes on Twitter and Facebook and blogs were going on, and he states, very much, that individual scientists and individual research groups need not be unbiased and behave like some stereotype of good, objective people. Many of the researchers he investigated were very ill-behaved: bullying, blustering, derogating. Somehow, in these clashes, a better understanding of the world was chiseled out. Nobody has to be right. One just has to be willing to stick it out there for others to take a potshot at, and then to defend it. As long as that dynamic is allowed, the work will go forward.

So, that is another reason I’m so conflicted about this. I would like things to be nicer, because I’m so intensely oversensitised that I go into hiding faster than a sensitized Aplysia siphon; I find it difficult to engage, and I doubt I’m alone.

Being loud, snarky, and obnoxious – or maintaining a “hush, think of the children” demeanor – is orthogonal to, well, truth, or whatever it is we try to find in science. They are means of maintaining the tribe, right or wrong. You punish the deviant and the intruder. They can be used to stand up to corrupt power, to usurp and topple precarious beliefs, to drown out unpleasant and inconvenient – well – things. Good or bad kind of depends on which side you are on.

The meek – I don’t know. Perhaps only the bible calls us blessed…

As this was happening, I reviewed an absolutely excellent Master’s thesis investigating some of the factors underlying football hooliganism – comparing football supporters’ love for their team to social science students’ love for their study topic. The author was reviewing the kinds of group processes underlying supporter behavior that social psychology has investigated, usually not with football fans, of course, but still. It just seemed really apt. Incidentally, social science students feel nowhere near as strongly about their topic as football fans do. I guess at that level one hasn’t yet distilled the true fanatics.

I may simply have to select a tribe, as reluctant as I usually feel about this. Reading Hull, it seems that within the research demes there is the kind of camaraderie and tolerance that allows the more meek to get a voice.

A little note on the title spelling, for those who have yet to read the wonder that is Huckleberry Finn: that is how Mark Twain spells it.


Are Swedish universities heading for another “PISA” collapse? In response to a Swedish editorial: consider the teaching.

The other day, there was a somewhat alarmist editorial in one of the Swedish national papers (written by two professors, one here in Lund), considering the state of Swedish higher education. There has been some alarmist commentary about its status in the papers (in part prompted by some of our universities no longer cracking the top-100 lists – never mind the measurement problems with those lists), and a recent task force has visited Stanford and Berkeley, compared our universities to them, and found us Wanting.

The article in DN brings up three points where they think Swedish universities are wanting (compared to Stanford and Berkeley). First, the universities are not led by top researchers. Second, Swedish universities have a tendency to “grow their own”; that is, the lecturers and professors are people who have spent their entire higher-education career at the same university, from undergrad to pension. The third point is related to that, and has to do with a promotion reform that was just carried out, which has possibly lowered the status of the professor title (many professors are, supposedly, invisible in the international research community). This makes, possibly, for dull, derivative research.

What they didn’t bring up in this editorial, but which was highlighted in the report comparing us to Stanford and Berkeley, is the claim that one of the hallmarks of these world-class universities (both in the California Bay Area, on either side of Silicon Valley) is that they value teaching much more highly than the universities in Sweden do. This made me go seriously Hmmmmm. Of course, I never went to either Berkeley or Stanford, but I did go to the little bear in the UC system – UCLA. It was a while ago (early ’90s). I don’t have anything to complain about when it comes to the education, really. I had some really stellar teachers, especially in the smaller courses. I also had some “meh” teachers. All the big-topic classes – intro psychology, and introductions to topics like developmental, social, cognitive, neuro – were done in huge lecture halls (about 300 students), accompanied by smaller sections of about 20 students led by a teaching assistant. All the exams in these classes were multiple-choice. Other types of exams were given in other areas.

What I’m reading now from colleagues in the US (via Twitter) is how teaching is more and more devalued. Classes are taught by adjuncts who don’t have job security. Universities are looking into creating MOOCs so they can save even more on teaching staff. Teaching is not what brings you fame and fortune – I’d suggest you follow Rebecca Schuman for this discussion (this article on the Soul of the Research University is also very interesting, as both Stanford and Berkeley are research universities – and the UC system figures in this discussion). Teaching is no more valued there than here.

But, let’s see if there is something that I think the universities I went to did differently from what Lund does – and keep in mind this is a very amateurish case study from one single observer at a particular department in Lund (a department I like very much).

At the three research universities where I obtained my education, undergraduates taking the introductory psychology course were required to participate in experiments – the participant pool. Anything from three to four experiments. Other courses also used research participation as a means of getting extra credit. In part, this benefits the researchers, of course, but it also has a great deal of educational value, which is why we have it. It is a win-win situation. Of course, you cannot force people to participate in experiments, so there is always an alternative for those who absolutely do not want to.

Why is it educational? Because experience is educational. When we teach psychology to our students, what we teach them is based on findings from thousands upon thousands of experiments and studies, upon even more participants, painstakingly trying to understand how the human psyche works.

I did my intro psych at a community college, and therefore was not required to do any participation. The first time I did participate, it was an eye-opener. The actual experience of participating – the task I had to do (and I have since done many) – seemed rather removed from the kinds of conclusions I had been reading about. There were also a number of varied experiences that I had that are not generally discussed in papers or books, but may potentially be important to consider. In one case I had to respond to a dot that was flashed on a computer screen. It was terrifically boring. But, what they were trying to find out was whether they could measure how fast a signal is sent across the corpus callosum. You need a lot of data points for that.

In another experiment, I had to classify faces and words. This was working well for about two of the blocks, and then I started losing concentration and making a lot of errors. When you run a test, you have to consider that after a while people get bored and fatigued. If your experiment or your set of questionnaires is too long, the responses towards the end become meaningless, because they will be indicative of something very different from what you are asking about. I have answered questionnaires where the questions seemed not to make sense (though I could understand why someone wanted them answered).

I know, first hand, from circling Likert scales, that they really are ordinal, and that the intervals between the numbers really are meaningless, because it is clear to me that I can’t tell the difference between a 7 and an 8 on a 9-point scale. I have experienced the kind of jolt you get when you get an “error” message for having misclassified something, and wondered how that may influence my next response.
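
Since this is a point I keep making in class, here is a toy illustration of what “ordinal” buys you – my own sketch, with entirely made-up numbers, nothing from any real questionnaire. Relabel the top of the scale in any order-preserving way (legitimate, if respondents cannot really tell a 7 from an 8) and rank-based statistics do not budge, while interval-based ones do.

```python
# A toy sketch with made-up numbers: ordinal statistics survive an
# order-preserving relabeling of the scale points; interval ones do not.
import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(42)
ratings = rng.integers(1, 10, size=100)           # 9-point Likert responses
outcome = ratings + rng.normal(0, 2, size=100)    # some correlated measure

# Squash the top of the scale: 1..6 stay put, 7/8/9 become 7.0/7.1/7.2.
# If raters can't distinguish 7 from 8, this labeling is just as valid.
squash = {7: 7.0, 8: 7.1, 9: 7.2}
relabeled = np.array([squash.get(int(r), float(r)) for r in ratings])

print(pearsonr(ratings, outcome)[0], pearsonr(relabeled, outcome)[0])    # differ
print(spearmanr(ratings, outcome)[0], spearmanr(relabeled, outcome)[0])  # identical
```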

I have filled in a State-Trait Anxiety scale at a time when I felt happy and content, and I very strongly felt I had no interest in letting anybody know that I have had past instances of anxiety and depression that are possibly more indicative of my slightly pessimistic base state – and I did this after getting my PhD, knowing full well how researchers interpret the data, and realizing how many additional factors play a role in any one individual’s current state and can influence how you interpret the data.

I don’t get this as strongly from creating my own questionnaires and experiments, because then I’m focused on the research question at hand. Reminding myself of what it is like to be on the other side is invaluable for my ability to design experiments, and for my ability to interpret research based on how it was conducted.

Now, simply doing a lot of experiments is of course not enough, and it was never meant to be. You pair it with the theoretical teaching, and with lab work, in order to make sure your students have insight into all parts of the research cycle, and can be properly skeptical, but also properly accepting.

I see it as akin to how we require psychologists to go through therapy themselves before becoming therapists. And there are plenty of anecdotal stories from doctors who became patients and had their eyes opened, suggesting that this would be a good exercise for everybody.

When I got to Lund, this did not exist. It does now, in a limited way, in that we have the option to let students do experiments as compensation for missed obligatory seminars. Limited, of course, because there is always a reason why you want something in your course to require obligatory participation – usually because it is something you cannot acquire from simply reading books and articles and writing a paper: laboratory exercises, presentations, discussions, or other empirical exercises. Let’s replace something empirical with something else empirical, should you have the misfortune to become ill.

There were a lot of discussions about this before we could implement it, and only a handful of courses take advantage of the possibility, but I was astounded that asking students to participate in experiments was even questioned. Presumably you want universities to do both research and teaching because you think there is a synergistic relationship between them, not an absolute separation.

Still, you have to keep in mind that participating must be just an option, and that there are other ways of doing completions, or of fulfilling your requirements (in the version where it is part of the course) – and there are.

But, there is a clear pedagogical reason why you want psychology students participating in psychological research, even just as participants – and canned experiments are simply not the same thing.

Another thing offered by the universities I attended was being a lab assistant for course credit. In fact, at one university, this was the requirement that earned you a Bachelor of Science rather than a Bachelor of Arts.

This doesn’t exist here, although we are working on it, considering that our international Master’s students, as well as other international exchange students, are asking for it. As it is, some researchers with funding can pay you to be a research assistant. In other cases, students can work as research assistants in exchange for a certificate stating that they have this experience. It is actually very valuable, especially for those who want to continue to a PhD. But working in a lab is generally valuable experience, even if you don’t plan on becoming a researcher. Currently, there isn’t a really clear path for students on how to get experience as researchers. This short-changes them, and this is an area where at least some American research universities are better at integrating teaching and research.

Now, I don’t know if those were the areas that the report considered, but I think it is something that Swedish universities have to consider. Don’t keep research and teaching apart. Work on the synergy. That is a win-win situation.


On Movies and Psychology: report from a meeting.

I’m siiiiinging in the rain, I’m siinging in the Rain….

I’m part of a film studies group, as a quantitative psychologist with more interests than brain capacity. Most of the others are film scholars with a humanistic/sociological bent, but with an interest in what people like me can add. And I’m interested in what they can add to what I do. It is immensely fun when we get together, which, for this post, was April 25 at Copenhagen University, on a very lovely day that ended with wine and tapas outside on the third-floor terrace.

Compelling art (deliberately vague) is psychologically interesting, because it wouldn’t be compelling unless it harnessed something in the human psyche (to use Changizi’s terminology). And those creating compelling art – be it movies or music or books or paintings – are able to harness, well, something, and that something is immensely interesting for a psychologist to try to understand. This is, of course, not my unique insight. Shimamura brings it up in his introduction, along with the fact that, although a whole lot of heuristics have been worked out for how to make a film (continuity editing, the 180-degree rule, the 30-degree rule, their violations for effect), the explanations that film professionals give are not necessarily well grounded in, well, psychology. So, this is a perfect place for a psychologist to attempt some reverse engineering.

I was asked to present something on experimental method. I actually decided to do a little – well – tapas-like sampling of quantitative methods, as the experimental variant is just one of the quantitative variants.

To prepare, I plowed through Arthur Shimamura’s edited volume “Psychocinematics” and pulled a bunch of papers, before settling on presenting the methods in two papers: Rooney & Hennessy’s field study comparing 2D and 3D movies, and Magliano, Dijkstra & Zwaan’s work on predictive inference. Neither of those is a true experiment, but we had already decided on doing a close reading of a paper by Marilyn Boltz, which is.

The first paper compared experiences of 2D and 3D movies with a very simple set-up: ask people, as they emerge from the 2D or 3D version of a movie, to complete some questionnaires. The selected movie was Thor.

Participants completed four questionnaires. For perceived apparent reality they used a validated instrument, the ITC Sense of Presence Inventory: a five-item, 5-point Likert scale (Lessiter, Freeman, Keogh & Davidoff, 2001). A typical item (that they cite) is “I had a strong sense that the characters and objects were solid”.

For attention they actually did not measure attention, but self-reported distraction: how often were you distracted by other people, your own thoughts, and so on. Which we discussed a bit – is this really attention? Also, here they used a 6-point Likert scale.

For emotional arousal they used Lang’s SAM, where you indicate how aroused you are, and how positive or negative you feel. Nine points. A well-used scale – even I have used it.

Finally, again on a 6-point scale, they measured satisfaction, basically by asking participants to rate “liked it very much, would watch it again, and would bring my friends”. Well, not quite like that – it was broken up into separate questions, so as not to have them assess multiple things on one scale.

Now, why they used so many different scales came up for discussion. In part, of course, because two of the scales were validated instruments, and you simply do not change the response scale on a validated instrument.


Why on earth their self-created scales used 6 points, I don’t know, and it is not discussed, but it allowed me to expound a bit on ordinal scales and all that.

The only effects they found were on perceived realism and distraction (which they call attention). They reported Cohen’s d, so I got to explain what effect sizes are. They also had a truly bizarre df for one of the measures (it has a fractional part), but I’m guessing it is one of those adjustments for data that do not fit the assumptions of the test. (Which one it could be I don’t know, because the standard deviations are close enough, but the ratings are close to ceiling, so I’m guessing some severe skew.)
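
For anybody who wants the two concepts in one place, here is a minimal sketch (made-up ratings, not the paper’s data) of a Cohen’s d computation, and of how a Welch-Satterthwaite correction for unequal variances produces exactly that kind of fractional df:

```python
# Made-up ratings, not the paper's data: Cohen's d, and the
# Welch-Satterthwaite df, which is generally not a whole number.
import numpy as np

def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * np.var(a, ddof=1) + (nb - 1) * np.var(b, ddof=1)) / (na + nb - 2)
    return (np.mean(a) - np.mean(b)) / np.sqrt(pooled_var)

def welch_df(a, b):
    """Welch-Satterthwaite degrees of freedom for unequal variances."""
    va, vb = np.var(a, ddof=1) / len(a), np.var(b, ddof=1) / len(b)
    return (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))

rng = np.random.default_rng(1)
ratings_2d = rng.normal(3.9, 0.9, size=70)  # hypothetical 2D audience
ratings_3d = rng.normal(4.3, 0.6, size=65)  # hypothetical 3D audience

print(f"Cohen's d = {cohens_d(ratings_3d, ratings_2d):.2f}")
print(f"Welch df  = {welch_df(ratings_3d, ratings_2d):.1f}")  # fractional, < 133
```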

And, of course, we got to discuss whether people deliberately selecting a 2D version are really the same as those deliberately selecting the same movie in 3D. It is experiment-LIKE, but not a true experiment.

I then went on to the Magliano et al. Moonraker paper – which is correlational. The question is: are there distinct features that directors use in a film – mise-en-scène, montage, framing, music, dialogue – that make viewers predict correctly what will happen next? One of the film scholars said, of course there are, and told how she has used them in her teaching. That is her expertise, though. The paper looks at how everyday viewers (students) pick up on the predictions too (perhaps without knowing the source of the prediction). Very simple (but work-intensive) methodology: participants are instructed to stop the movie when it occurs to them that they can predict what will happen next, write down what their prediction is, and note where they stopped the film. Then came the extensive coding of all the answers. Only predictions made by at least two people were included in the analysis. For all those times, the researchers looked at what was happening in the film, and whether any or all of the five predictor features were present. They weren’t always. Some predictions likely came from general genre knowledge (Bond will sleep with the good-looking woman). But, more often than not, there was something in the set-up of the movie that helped prediction, and the more of the predictors that were present, the more viewers made correct predictions.
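
To make that last aggregation concrete, here is a minimal sketch with invented coding data – the five feature names come from the paper, but every number below is hypothetical. Pool the shared stop-points, count the features coded present at each, and check whether correct predictions rise with that count.

```python
# Invented coding data; only the five feature names come from the paper.
from collections import defaultdict

# Each record: a stop-point shared by at least two viewers, the features
# coded present there, how many viewers predicted, and how many of those
# predictions were judged correct.
stop_points = [
    {"present": {"music", "framing"},                        "n": 5, "correct": 3},
    {"present": {"music", "montage", "dialogue"},            "n": 8, "correct": 7},
    {"present": set(),                                       "n": 4, "correct": 1},  # genre knowledge only
    {"present": {"mise_en_scene"},                           "n": 3, "correct": 1},
    {"present": {"music", "framing", "montage", "dialogue"}, "n": 6, "correct": 6},
]

by_count = defaultdict(lambda: [0, 0])  # n features present -> [correct, total]
for sp in stop_points:
    k = len(sp["present"])
    by_count[k][0] += sp["correct"]
    by_count[k][1] += sp["n"]

for k in sorted(by_count):
    correct, total = by_count[k]
    print(f"{k} feature(s) present: {correct}/{total} = {correct/total:.0%} correct")
```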

We finally moved on to the Marilyn Boltz paper, which truly is an experiment, with an impressive amount of selecting and pre-testing before the actual experiment was run. But, as always with any experiment, there are places one can critique. I think the interesting thing is that what we critiqued differed depending on our backgrounds.

Music fills an important function in movies. (I was taught early on that when I got too scared, I could turn off the music and it wouldn’t be scary. I didn’t heed that kind of emotion regulation, though, but preferred the “avoidance” method.) It is often thought of as adding emotional tone to a scene, but the question here was also whether it guides the viewer’s attention, person-perception, and memory. She uses schema theory as her theoretical underpinning: set up some expectations, and your person-perception, your attention, and your subsequent memory will be affected.

To test this, she first spent time selecting cuts from commercial movies that were ambiguous in nature. These came from Cat People, Vertigo, and a series called The Hitchhiker.

She then set out to select positive and negative music that she could pair with the movies (one positive, one negative for each clip). There were three different pieces of each affective type, whittled down from a larger set of nominations, and these were subsequently matched with a movie based on perceived “appropriateness” – more on this later, because this was a place where we had a lot of discussion, and where we differed in the type of critique we were giving.

Each participant viewed all three clips, but with different music pairings: one with positive music, one with negative, and one with no music. It was all sorted and counterbalanced in all the appropriate ways to correct for order effects and other confounds. The participants were also divided into two testing groups. The first was asked to give an extrapolation of the ending of each clip and an interpretation of the intentions, as well as to rate the intentions and emotions and moods of characters and scenes right after the clips were seen. The questions were specifically tailored to each scene. The other half did none of that, but returned a week later to participate in a memory test where they had to indicate whether described items (e.g. flower bouquets, tombstones) had been present or not in the film – that is, an “old/new” recognition test. (We had some discussion about why not all of them did the memory test, or why there wasn’t an immediate memory test. I think the answer is that in the design they wanted half of them to answer interpretive questions, and the other half to be tested for memory. I see no particular reason for waiting a week with the memory test – it could likely have been conducted right after – but I see no argument against waiting a week either, other than the risk of participant attrition.)
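
For the non-psychologists I sketched what that counterbalancing amounts to. This is my own reconstruction, not Boltz’s actual materials: a Latin-square rotation in which every participant sees all three clips, each under a different music condition, and every clip appears equally often in every condition across participants. (A full design would rotate presentation order as well.)

```python
# My reconstruction, not Boltz's actual materials: rotate the condition
# order across participants so that every clip appears equally often in
# every music condition.
CLIPS = ["clip_A", "clip_B", "clip_C"]
CONDITIONS = ["positive", "negative", "none"]

def latin_square_assignment(participant_index):
    """Assign each clip a music condition by rotating the condition list."""
    shift = participant_index % len(CONDITIONS)
    rotated = CONDITIONS[shift:] + CONDITIONS[:shift]
    return dict(zip(CLIPS, rotated))

for p in range(6):
    print(f"participant {p}: {latin_square_assignment(p)}")
```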

The findings, mainly, are that the music induces participants to interpret the story differently (and usually differently from when no music is present), and that people perform differently on the memory task – recognizing more positive items in positive conditions and more negative items in negative conditions, and similarly for the false alarms.

But, let’s go through some of the issues – most of them mine (because, well, my blog), but also some commentary from the other participants that I thought was very interesting.

First, I had an issue with the music they used as either negative or positive. Here are the selections, with the descriptive adjectives (from the appendix of the paper). All the negative excerpts came from Tangerine Dream’s Rubycon album. This is electronic music – synth music. The excerpts were described as “Eerie, Mysterious, Unsettling”, “Eerie, Unsettling, Edgy, Suspenseful”, and “Anxious, Mysterious, Evil”. I decided to listen to it – how I love Spotify – and it reminded me very much of Blade Runner. One of the film scholars pointed out that that soundtrack was by Vangelis, but that, yes, this was a “genre” sound – something a regular movie-goer would recognize as the music accompanying scary and eerie movies. I actually didn’t find it terribly eerie or scary when I listened to it. I just noted the style.

The positive music was more varied, and I also found it a bit odd. The article states clearly that positive music tends to be in a major key and fast-paced – which is exactly the criterion I would use for inducing some kind of happy/contented emotion (and I believe research on music and psychology would suggest something similar). The first piece is called Blossom Meadow, characterized as a new-age piano piece. It is described as “Calm, Light, Airy, Cheerful, Pleasant”. Never heard it, but I get a kind of musical image from it – perhaps even that very genre-indicative. Another positive piece is Schutzliesel, a German drinking song with woodwinds, brass, and percussion, described as “Fun, Boisterous, Folksy”. Hmmmm. I might use that to make people feel happy too, even though I have never heard it. The last one, though, is Barber’s Adagio for Strings. You know, Platoon (or, like I said, from that Vietnam movie where all the cute guys get shot). Described as “Sad, Tender, Yearning, Wistful, Solemn”. Yes. Exactly. Sad induction.

What is this? Clearly, their terminology is at variance with mine as an emotion researcher (and I was not the only one who objected; some of them got into meta-emotions also – you know, I cried my eyes out/almost shit my pants from fear – it was SOOOOOOO Goood – but psychologists have barely gotten into that area yet, from what I know).

Yet, if you look at the results (which are labeled positive/negative), all the ratings for the clips paired with “positive” music go in the same direction. In all cases, participants think that there will be no harm, the interpretation is positive, and mostly positive adjectives are used to describe the man (kind, loving, playful, etc.) – but, god knows, this part is messy from a measurement point of view.

The semantic differentials for each movie showed similar results – but the questions for each movie were custom-made, and thus not quite comparable (although they range from what one would consider positive to negative). Granted, they do not actually compare them statistically, but there is a comparison going on.

I was thinking that rather than positive/negative, there is an indication of something being benign or malign – no harm or harm, safe or unsafe. Or, as one of the film scholars pointed out, genre recognition. (Here I go, the psychologist, thinking about basic human concerns, and the film scholar about what western humans have picked up from watching Hollywood-style movies since before childhood amnesia sets in.)

We also pursued an interesting discussion of the word “appropriate”. In the research, some judges had been given a set of music pieces and asked which piece was most “appropriate” for each film clip, which is how the different pairings were decided. I have a kind of vague notion of what that means (let’s do a pilot to see what seems to fit best by vote), and my main objection was that I thought they should have used the same pieces for all clips, in order to make a good comparison – experimental psychologist that I am.

But the objection from the non-psychologists was: what the hell is “appropriate”, without any more specification? As an example, is Alex in A Clockwork Orange singing “Singin’ in the Rain” while behaving atrociously appropriate? Certainly effective, achieving what the director intended. And, yes, if you really want to look at it, there is no such thing as the vaguely appropriate. There is an active fitting of music for effect – and here I really get in over my head, considering that so many of the scholars there are specialists in film music.

Still, flaws and all, I thought the paper interesting. It opened up a lot of questions, and in some ways that is what papers like this are for – making things very explicit.

Boltz, Marilyn G. (2001). Musical soundtracks as a schematic influence on the cognitive processing of filmed events. Music Perception: An Interdisciplinary Journal, 18(4), 427-454.

Magliano, Joseph P., Dijkstra, Katinka, & Zwaan, Rolf A. (1996). Generating predictive inferences while viewing a movie. Discourse Processes, 22(3), 199-224.

Rooney, Brendan, & Hennessy, Eilis (2013). Actually in the cinema: A field study comparing real 3D and 2D movie patrons’ attention, emotion, and film satisfaction. Media Psychology, 16, 441-460.


What a difference a century makes – On March 8. For those who get gloomy.

My grandmothers were born in 1898 and 1902, when women were not yet allowed to vote in either Sweden or Norway. When my grandfather approached one of the head political honchos in the small village where he was the postmaster, for a loan (or a signature for a loan) so he could pay for the higher education of his eldest daughter, my mother, this head honcho thought it was just a waste of money, spending it on educating daughters who could, you know, become hairdressers or something. Happily, my grandfather didn’t agree, and my mother got her teaching degree.

One day, in the late ’50s or early ’60s, the headmaster of the school where my mother and father were teaching (and a good friend of theirs) saw my mother wearing pants to work, and asked if she was having an “idrottsdag” – a day we have in Swedish schools where all students do some sport, or at the very minimum spend some time outdoors. Women just did not wear pants to work.

A couple of years ago, when I was running a seminar where we were discussing a paper investigating how dress style influences judgments of competence, warmth, and other measures, a few of the female students reacted strongly to the fact that business attire for American women in the late ’90s meant skirt and pumps. As one of them said, some of us just feel awkward in a skirt!

The times they are a-changin’.

Jill Ker Conway, a contemporary of my parents, grew up in the Australian bush and had a stellar academic career in the US and Canada. In her books The Road from Coorain and True North, especially True North, she chronicles the blatant and hostile sexism she encountered during her academic career. I have encountered none of that.

Too bad that that cigarette company took that slogan, because we have come a long way.

Doesn’t mean it is all roses. Doesn’t mean there aren’t things to do. Doesn’t mean we are now safe from sliding back to where women are routinely seen as possessions, chattel, or highly prized pets. But we are now in a place, in the west at least, where we can clearly say that if no women are present, it is because the dudes in charge just didn’t look carefully enough, not because there are no competent women.

And, yes, sometimes you do have to remind them that, along with the dudes, there were also some sharp women present*, and some men have yet to get the memo that being a lech is just so passé.

But, whenever I’m tempted to be depressed, I look back just a tiny amount of time, at the circumstances of the women who came before me – some of whom are still around (like my mom, and Jill).

Not bad. Not bad in slightly more than a century.

* Scroll down to Bobbie Spellman’s comment, and David lists all of them. Impressive bunch. Or just read the whole post with the comments. Definitely worth your time.


Reflections on teaching and science.

I’m reflecting on teaching. Which in itself can be good, as long as it isn’t so much reflection that you end up in a mirror maze with copies of your thoughts to infinity and beyond.

But, in this case, the reflection was prompted by a post from John Horgan, where he speculated that perhaps the current reproducibility crisis could be traced to practices in teaching labs – supposedly, students fudge the data they obtain so that it comes out “correct” in their reports, in order to get good grades.

Fudging data points is not something that would get you better grades in psychology. There really is no point in doing so for grade purposes (at least where I have been), because we know damned well how hard it is to get significant results, so grading is never based on whether your experiment worked out or not. I actually had one of my profs telling me I could get my PhD based on a non-significant dissertation, as long as the reasoning and method were sound. (With my 12 x 30000 data points, that was not a problem!) I usually tell my own students something similar.

I have engaged in some sinful things, like fishing for p-values (usually with the embarrassed caveat that this is not really how you should do it) so that my undergraduate students can practice writing up a results section with something we can pretend is meaningful. And, yes, those beautiful euphemisms for p > .05 (which surely God loves just as much) that Matthew Hankins so nicely collates for us – I’ve used them.
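
In case anybody wonders why fishing counts as a sin: here is a quick toy simulation of my own (no real study involved). Give yourself 20 unrelated outcome measures under a true null, test them all, and “something significant” turns up far more often than the nominal five percent.

```python
# A toy simulation: 20 unrelated outcomes, no true effect anywhere,
# yet "something significant" appears in well over half the studies.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(7)
n_sims, n_outcomes, n_per_group = 2000, 20, 30

found_something = 0
for _ in range(n_sims):
    a = rng.normal(size=(n_outcomes, n_per_group))  # group 1, all null
    b = rng.normal(size=(n_outcomes, n_per_group))  # group 2, all null
    pvals = ttest_ind(a, b, axis=1).pvalue          # one t-test per outcome
    found_something += (pvals < 0.05).any()

print(f"At least one p < .05 in {found_something / n_sims:.0%} of studies")  # ~64%
```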

I have wondered about the role of teaching in the problems abounding, contemplating, as I am right now, its low status. When I was coming up, though stats was usually taught by people who had done it for a long time, methods were relegated to the grad students and recent docs who needed something to do. In fact, at my grad institution, the methods and APA-style course was handed to all the graduate students in year three, as part of their education. Actually, education-wise, that was nice for us. The rest of methods was very much learned on the job as a lab assistant. Informally, it was suggested this was done because faculty did not want to be stuck with methods, so why not make it a learning experience for the new grad students? (True? Not sure.) It doesn’t say much about the value placed on teaching, especially teaching the basic mechanics of doing research.

The other day, Simply Statistics stated (in this post) that the squabbling against p is misguided. The problem isn’t really the p-value, to be fixed by switching to effect sizes or other statistical procedures. The problem is a lack of analytical skill, and a lack of teaching analysis. Statistics is, after all, just a tool – something that should help us keep track of our data and sort out our results. It cannot provide a magical line that demarcates results from non-results. But this is how it has been used.

What he suggests is that, beyond statistics and probability, there needs to be teaching of how to analyze – the part of the theory and ideas that is not just in the statistical package. This I agree with. I’d like to add that it would be good if this were maintained throughout the schooling. Statistics and analysis of data is a skill that must be trained and repeated, not something to suffer through, get a good-enough grade in, and then forget. To do so, of course, you need to plan the teaching in a more overarching way. Did students learn statistics and methods last semester? Great – now that we have an area course, make sure that students keep using their fledgling skills in reading papers, planning research, and analyzing results in any possible project. Do that in the next course also. This is not something you acquire overnight and once and for all.

You start, as pedagogic texts will tell you, by learning some rule about “how it is”, because, really, that is probably the best way to start. A rule – we look for rules (if I believe Tomasello and Bloom, and I do), and tend to treat them as laws. Until, in the next stage, you confuse the students with exceptions. Slowly, you can build the skill necessary to understand and analyze what acquired data may actually mean, which is more than pushing buttons in SPSS. This learning does not end. I’m still at it, and my PhD is over a decade old.

Teaching Takes Time. And Skill.

In the chapter on culture and cognition in Viren Swami’s Evolutionary Psychology, which I used for my course in evolutionary psychology, they spend a lot of time trying to understand the evolution of the learning mechanisms that have allowed us to accumulate and transfer our technical knowledge across time, distributed across people. There’s imitation, there is emulation, and there is direct teaching. Teaching has a cost to the teacher: the teacher must take the time to create learning situations that fit the stage of the learner. Having gone through the process is not enough. We forget how we learned, and what it was like not to know – the Curse of Knowledge. This teaching exists in other species also. I was charmed by the sequence in which meerkats teach their young to hunt scorpions: from dead, to disabled, to fully functional prey.

Also, you cannot transfer your current level of knowledge directly to the students, even at college level (until Gibson’s simstims become a reality, which is never). It has to be adjusted to allow learning to occur. This adjustment is in itself a learning process. So loopy.

The time you spend teaching, and preparing to teach those new to the field, is time you cannot spend working on other projects, keeping abreast of the literature, or writing grants. There is a possibility for synergy. The act of teaching helps with consolidating and deepening knowledge. It is well known, of course, that in order to really learn something, the best way is to teach it. (One of the reasons I keep designing new courses.)

But, it doesn’t bring glory, it doesn’t bring grants, it doesn’t bring status, as it is set up right now.

But, perhaps John Horgan is right that this is what has undermined the quality of the science that is done – the new scientists do not get sufficient training to be able to do really good work (which is such a waste, considering the passion and cleverness of this group), and the glory comes with publishing papers. (I just read through this paper by Charles Lambdin, which brings this up and more. Either Fred Hasselman or Hans IJzerman linked it on Twitter. Paywall, alas.)

Another lesson from evolutionary psychology (and also Richerson & Boyd’s work on cultural transfer) is that knowledge can become watered down and disappear if it is not carried forward. In their example, from Tasmania, it was because the population shrank, and key knowledge-bearers died without having transferred their knowledge sufficiently. The population in science is not under threat of shrinking, but lowering the status of those involved in the knowledge transfer (adjuncts, anyone?) may have a similar effect. (Here I’m wildly speculating – I’d like to think up a way of empirically testing that conjecture.)

I teach too much, at least from the perspective of current incentives, and research and publish way too little, and I sometimes feel my effort is invisible. In fact, I have no clear idea whether I’m any good at it. I like thinking about what students of psychology need to know and learn in order to be astute consumers, and possibly producers, of knowledge, and then sticking that into my courses. There is an element of self-interest here, because I’d like to feel valued, and teaching constantly gets dissed.

But, if not enough time is set aside to properly train the new generations – and this can be done in an apprentice setting – what will we produce?


Some continuous musings on emotion.

I’m trying to think about emotions. Or to think about how to think about them. What is a good, fruitful conceptualization?

Perhaps, first, in line with Joseph LeDoux, one should really jettison the term “emotion”, as it is more of a folk-psychological term, which muddies how you think about it. (Emotion Review has had issues recently discussing this.) Of course, the folk psychology of emotion is itself interesting – lots of cultural variation. Swedish doesn’t have a word for emotions; we use feelings, which covers roughly the same thing. There’s also variation across history.

The first Emotion Review of 2014 has a section looking at James’s legacy. If you have read any intro book on psychology in the past 15 or so years (when they even had anything about emotion), you will have heard about the bear, and running, and that being the emotion, whereas Cannon said no, it is all in the brain – all this is covered in the issue, in its proper historical view. It always irritated me that these cartoon versions of their theories were what we taught our undergrads in what would, perhaps, be their only encounter with emotion research, considering that the state of research is very different, and has been for a very long time. It is not wrong to look at the history, of course, but it was not presented as historical. In our book it was presented as “theories of emotion”. I would have liked them, instead, to look at categorical vs. dimensional accounts, and to expand on the appraisal theories.

But an account by Phoebe Ellsworth in the Emotion Review issue stuck with me for a couple of reasons. (The title is “Basic Emotions and the Rocks of New Hampshire”.)

She worked with Ekman and Friesen on their basic emotion program. She had come into this, viscerally convinced that there were similarities in emotional experience and expression across cultures from some films she had seen.

At the time, she said, textbooks used to display pictures of faces distorted in a grimace, asking if one could tell what the person was feeling. Then they would reveal the entire picture in context, showing that it was some winning moment, and go ha-HA, you thought you could read emotions from faces, but neener neener, you are soooo wrong. (OK, I’m exaggerating.) But this was mainly a result of the Landis work, which no IRB would approve these days.

So nothing is new under the sun, because I have seen things like that published in the 2000s – showing how the body informs, and that the face is not necessarily the main informant, as a refutation of the basic-emotion-in-the-face idea. Sure: context, body, face all matter in how we interpret what someone is feeling, and that may not be entirely signaled by the face (I’m thinking of studies by de Gelder and Todorov, and similar). And, yes, that is important to keep in mind. But I just found out it is not new… Why is research memory so damned short?

She also gave a bit of insider info on how they prepared for the cross-cultural work on expression recognition, and how they selected the expressions. The six is, in some ways, an artifact of the time and the resources they had: they had enough viable pictures of these six expressions, but not enough of other theoretical expressions. (Many of them were derived from Silvan Tomkins’s ideas.)

And, yes, even I know that what is supposed to count as a basic emotion varies, and varies across researchers. You can find that in tables in emotion textbooks; I read it in papers during the late nineties; you can find it on the web. There is always some overlap, and always some odd ones.

Also, historically, what is an emotion and what is not seems to have shifted. Is love an emotion? Shame? Awe? Fear? Why? Why not?

I’m not happy with the categorical accounts, or the dimensional ones, or the various appraisal theories. But at least they are theories. I think they are frequently compatible; none of them seems to win.

Really, what I want is a more dynamical-systems account of emotions. But I may muse on that one later.



Through the looking glass into an oddly analyzed clinical paper

My curiosity led me down a dark alley of oddly reported and interpreted statistics. The paper has fancy things in it, like effect sizes, and even confidence intervals, and “Wilcoxon sign tests”, and claims of large effects. Perhaps I’m not sophisticated enough to understand its meaning, but to me it seems more like a fun-house out of the twilight zone, or the research mirror-world of that old post-modern BS academic writing, with statistical concepts as the lily pads rather than obscure hipster-words.*

What lured me in was a second posting of a data table from Keith Laws, where he lamented that he still could not make sense of it. So I re-tweeted. Why not; a click is easy. Daniel Lakens responded, and I was now witness to a conversation suggesting that this was some crappy analysis. Yes, I had looked at the data table once before, but nothing of value had turned up in my head, so I had nothing to say without looking at the paper. And as schizophrenia and CBT and clinical trials are way off even my meandering paths, I had simply refrained. But after the back and forth had gone on for a bit (and after being copied in on a slur accusation, and, as usual, giggling at some remark from Dr Neil Martin), I just had to go look. Keith had kindly linked a Dropbox copy for anybody’s perusal.

For you, I give the title and a link to the abstract – it is paywalled. (I just don’t want to use the Dropbox copy for a blog.) “High-Yield Cognitive Behavioral Techniques for Psychosis Delivered by Case Managers to Their Clients with Persistent Psychotic Symptoms: An Exploratory Trial.”

It is a strange world. I wasn’t sure whether I should giggle, or wonder if I had misunderstood something about the statistics they were doing, or get deeply depressed that this got past the peer reviewers, considering that clinical psychology is the one area where we have the most realistic opportunity both to do good and to do great harm.

From my understanding, having looked at it now in my rabbit-grazing way, the question the group was interested in is whether it is feasible to train Case Managers to deliver a particular kind of Cognitive Behavioral Therapy to their psychotic patients.**

They assessed 69 patients for eligibility for the treatment, and ended up with a total of 38. They also trained 13 case managers to deliver the treatment. Training took five days, and all case managers received weekly supervision during treatment. I can’t find how long the treatment lasted when I look through the paper this time, but I assume it took place over several weeks. Patients were assessed on a number of functions at baseline and at follow-up. The scales or protocols they used for assessment are unknown to me, so I have no means of knowing whether they are good, but what they assess seems to be things it would be reasonable to assess. Not describing them is standard, of course – I wouldn’t bother to explain how the IAT works if I were submitting something to a social cognition journal.

So far so good. Nothing is remarkable or out of the ordinary from this non-expert’s point of view. Doing clinical research seems like a massive job, and I appreciate that there is a great deal of difficulty in doing it well.

But now comes the analysis. First there are four figures of histograms with error bars, showing before-and-after scores on the different measures. I can’t find any explicit inferential statistics for these results in the text, although they claim to have done both t-tests and Wilcoxon sign tests (I looked the latter up on Wikipedia; both seem reasonable for a before-and-after assessment). But just looking at the graphs, it is clear that inferentials aren’t really needed, because it doesn’t look like anything happened. The means tend to be somewhat lower in the “after”, but the standard errors are rather large and overlapping (I even double-checked my Cumming New Statistics book to make sure I understood this). It really looks like the intervention had no effect whatsoever, at least when you aggregate across all 38 participants, which I assume they did. They claim that no data were missing.
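For reference, the two tests they name take only a few lines in any stats package. A minimal sketch with made-up before/after scores (the paper’s raw data are not available, so the numbers here are purely illustrative):

```python
# Minimal sketch of the two tests the paper names, run on made-up
# before/after scores for n = 38 (the paper's raw data are unavailable).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
before = rng.normal(50, 10, 38)            # hypothetical baseline scores
after = before - rng.normal(1, 10, 38)     # hypothetical follow-up scores

# Paired t-test: is the mean within-person change different from zero?
t, p_t = stats.ttest_rel(before, after)

# Wilcoxon signed-rank test: the non-parametric analogue.
w, p_w = stats.wilcoxon(before, after)

print(f"paired t: t = {t:.2f}, p = {p_t:.3f}")
print(f"Wilcoxon signed-rank: W = {w:.1f}, p = {p_w:.3f}")
```

Reporting the resulting statistics, dfs, and p-values is exactly the sort of thing a five-line analysis section could have done.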

They also do an “effect size analysis” using “Cohen’s d-methodology”, referring to Cohen’s entire 1988 book. Well, fair enough, but I wanted to know whether they meant something different by this than what we mean when we run t-tests and calculate effect sizes. I gather this is what they are listing in that table Keith tweeted that he could not make heads or tails of, that Daniel thinks is just horrible, and that I think resembles a sinister hall of mirrors, or possibly a runway made of bamboo in the south-eastern war theatre in the late 40s.
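If they simply mean the usual standardized mean difference, the calculation is not mysterious. A sketch for a before/after design, assuming the change-score variant (sometimes called d_z) – the paper does not say which variant it used:

```python
# A sketch of Cohen's d for a before/after design, assuming the
# change-score variant (d_z); the paper doesn't say which variant it used.
import numpy as np

def cohens_d_paired(before: np.ndarray, after: np.ndarray) -> float:
    diff = before - after
    return diff.mean() / diff.std(ddof=1)  # mean change / SD of the changes
```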

Now, effect sizes are nice, of course. In this case they run from middling to large, and also include a few negative ones (suggesting that things got worse). But one must remember that with only 38 participants, effect-size estimates are noisy and tend to be inflated, as the handy chart on Dan Simons’s blog shows (simulations of effect-size estimates where the true effect size is zero – you can do that when you simulate).
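That kind of simulation takes a few lines to reproduce. A sketch of how I understand it (true effect fixed at zero, n = 38, using the same paired d as above):

```python
# Sketch of a null simulation: the TRUE effect is zero, n = 38, and we
# watch how far the estimated d wanders on sampling noise alone.
import numpy as np

rng = np.random.default_rng(42)
n, n_sims = 38, 10_000
d_hat = np.empty(n_sims)
for i in range(n_sims):
    diff = rng.normal(0.0, 1.0, n)         # change scores with true d = 0
    d_hat[i] = diff.mean() / diff.std(ddof=1)

lo, hi = np.quantile(d_hat, [0.025, 0.975])
print(f"95% of the estimates land in [{lo:.2f}, {hi:.2f}], true d = 0")
```

With n = 38, the estimate wanders roughly a third of a standard deviation in either direction on noise alone.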

The table also shows confidence intervals; I take it they are for the effect sizes. I looked up how you calculate confidence intervals for effect sizes to try to make sense of this, and you can do it, of course. It is a bit trickier than calculating confidence intervals for estimated means – it involves non-central, non-symmetric t-distributions – but it can be done, and evidently there are nice R algorithms for it.
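For the curious, here is a sketch of the standard pivot method for a paired/one-sample d (I have no idea which method or software the paper actually used; the function and its name are mine):

```python
# Sketch of the noncentral-t ("pivot") CI for a paired/one-sample d.
# Method and names are mine; not necessarily what the paper did.
import numpy as np
from scipy import stats, optimize

def d_confidence_interval(d: float, n: int, level: float = 0.95):
    t_obs = d * np.sqrt(n)     # observed t statistic for this d
    df = n - 1
    alpha = 1 - level
    # Find the noncentrality parameters that put t_obs at each tail,
    # then rescale from noncentrality units back to d units.
    lo = optimize.brentq(
        lambda ncp: stats.nct.cdf(t_obs, df, ncp) - (1 - alpha / 2), -50, 50)
    hi = optimize.brentq(
        lambda ncp: stats.nct.cdf(t_obs, df, ncp) - alpha / 2, -50, 50)
    return lo / np.sqrt(n), hi / np.sqrt(n)

print(d_confidence_interval(d=0.5, n=38))   # roughly (0.16, 0.84)
```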

The confidence intervals are large, and all go from a number below zero to one above it. That is, for every single effect size, “no effect whatsoever” is still within the interval. There are likely a couple of typos there as well – two confidence intervals are identical, and one starts at the same number as the estimated effect size. (It is not the only place where the copy editors and proofreaders slipped: Figure 5 lacks labels on its axes.)

None of this is backed up in the analysis section, which is all of five lines long, naming a number of tests they claim they performed. Of course, looking at the graphs and the tables, it really looks like there isn’t much to write about anyway, because I doubt anything would have reached that magical p < .05 level. But it would have been nice to actually see the values, the dfs, and all that in numbers, because my feeling right now is that I’m not sure the researchers know what they are doing, at least not statistically.

I wouldn’t have let this kind of reporting pass from my undergraduates (who have a good excuse for feeling wobbly about stats). It should be a fairly straightforward analysis for a before-and-after design.

Sure, perhaps it isn’t appropriate or feasible to write down all those numbers in every case. Right now I have a paper out where I don’t report the inferentials and only show means with standard errors, but that journal focuses on film and mostly uses qualitative analyses; I wanted to illustrate that we can induce emotions with films, and showing the data was supplementary to everything else I wrote about.

I’m not sure this paper can get away with that excuse, especially as it opens its discussion by claiming that the results showed large effect sizes (never mind those confidence intervals) and that the intervention showed good, significant results – never mind that the table suggests that nothing happened, or at least that if something happened, it is so overwhelmed by that pesky crud factor that the signal doesn’t make it out of the noise.

They don’t look at the training of the case managers, which I thought was part of the question. There are a lot of claims, but they don’t seem anchored in the data they show, and none of that should be particularly difficult to show.

And, yes, sure, they are aware that the sample is small, and there is nothing resembling a control group, but they are confident they have shown some kind of feasibility of training case managers to deliver this type of therapy. It seems akin to reading palms.

Now, why, oh, why, did I dive into this sinister mirror world, when I don’t do clinical? I should have stayed in the fun-house of small n counter-intuitive findings in social psychology. We can snark and replicate one another, and nobody’s mental health is in danger.

Still, I wonder: did I miss something? Is there some analysis method I don’t understand (yes, there are plenty of those, of course) that is pertinent to this one?

Am I in the twilight zone?

* A quote from Katha Pollitt that I read in a book or article on the Sokal affair late last millennium keeps sticking with me as a perfect illustration of mindless attempts at influence by using particular keywords: a frog jumping from lily pad to lily pad. I finally found where it is from with a little bit of google-fu. The article is called “Pomolotov Cocktail”. The frog quote is at the very bottom.

** Never mind that CBT for schizophrenia seems to do very little, based on this recent meta-analysis. There is a place where you can play with the data from the meta-analysis, but for the moment I have lost that link in my Twitter flow.

On edit: I had hoped Keith would see that one, and provide me with the link, and he kindly did. Here. Go play with meta-analysis data.


Confidence is in the Action, not the details. (On our recent paper).

Our paper is out! Go check. It is on eyewitness memory (how did I get here?).

Here it is. Elsevier, but open access.

Farhan is actually my one and only completed doctoral student (I was co-advisor). The work is kind of a second dip into a rather rich data set, together with a follow-up study.

The main work (published here, no idea how open) asked what happens to your memory of a crime you witnessed as you keep talking about it over and over again – which is, of course, what happens if you are a witness. This is the study Farhan and Carl-Martin (and to some extent I) came up with:

A bunch of participants come in to watch a movie of a kidnapping. It is not a scintillating, well-cut, engaging movie. No. It is filmed from a single point of view, with a tad of zoom and panorama, trying to mimic what things would look like if you were actually there, looking at the surroundings. The same movie has been used several times previously (the work is part of a larger research program run by Carl-Martin Allwood, the main advisor).

They are then randomized into four conditions. In one, they come back 5 times over several weeks. Each time, they meet a new person, and they discuss what happened in the movie. The new person (a confederate) has the job of asking questions about the event so that they can understand everything that happened. In another, they also come back 5 times, but this group just retells the content of the movie to Farhan, who doesn’t discuss it with them at all. In a third, thrown in for ecological validity, they are given a schedule for when to talk about the movie with a friend or family member – a different one each time. And then there is the silent control: do not talk about the kidnap movie! (The one and only rule.)

Now all but the control group have talked about the movie, each in a slightly different way. Time for the final two events. First, they come back and write down everything they remember from the movie. A day or two later, they come back and confidence-rate each remembrance. This means that between the recall session and the next one, their responses have been segmented into single statements, with a confidence scale put beneath each. I think Farhan had some help here – the responses are all in Swedish, and he is from Pakistan. (Yes, he speaks Swedish by now, but he didn’t then.) A massive job.

We had also thrown in what we called a “focused questions” task, where we asked more pointed yes-no questions about the film and asked them to judge their confidence in those answers as well.

You can read all about it in the paper, if you are interested.

We weren’t just interested in memory, but also in what is called meta-memory and calibration. Let me go through this for a bit, without doing the math. Meta-memory has to do with how well your confidence in your memory aligns with how correct your memory actually is.

For example, I run a lot of seminars that are obligatory, and students have to sign a roster as evidence that they have been there. Occasionally a student gets back to me asking why they had not gotten credit (your name is not on the list). But I was there, don’t you remember? I don’t. I may very well recognize the student, but the memory trace of who participated in which seminar disappears rather rapidly. I don’t remember whether or not they were present, and my confidence is low. But some students stick out – usually because I know them from before. So I do remember that both Rob and Drew participated in my two seminars on theory of science, even now, several years later, and my confidence that my memory is correct is very high. These are examples of being well calibrated: when the memory trace is iffy, my confidence is low, and when the memory trace is strong, my confidence is high.

Then we have the case of Em, who I was also completely sure had taken that theory of science course with me some years back, until she informed me that she had never been a clinical student. I actually have an episodic memory of her (or, most likely, of someone who looks similar to her), and it is false. Here my meta-memory was poorly calibrated, because I was quite confident about something that simply could not have occurred. (I think that is a more reasonable explanation than the alternative, that Em is somehow mistaken about her own educational path.)

Why is this interesting? Because we tend to use our own and others’ confidence as an indicator of truth or correctness or certainty – depending on what we are looking for. In court, a confident witness is believed more than one who is not confident. It is easier to trust someone who sounds like they know what they are doing. As an aside, that is a short-cut that can be taken advantage of, as so many Mamet movies show.

But confidence and accuracy do not always track that well – and not only when con artists are involved. (The Dunning-Kruger effect is one example.)

One way we investigate the calibration between confidence and performance is to ask participants to perform (recall an event, tell a joke, solve a small problem, predict the weather), and then ask them how confident they are that they got it right. Most of us will feel confident about some of the performances and less confident about others, and this will vary along some kind of ordered scale, perhaps: would bet my life on it, pretty sure, maybe, dunno, totally guessed. We usually ask them to do this in percentages or on Likert scales. Now, when you take all the performances together and bin them – those that were gotten right nearly all of the time, those right 90% of the time, 80% of the time, and so on – this should be reflected in the confidence. When performance is at guessing level, confidence should be at guessing level. When performance is near perfect, confidence should also be high. When it is in between, confidence should be in between. If you map it on a plane with accuracy as one axis and confidence as the other, perfect calibration shows up as a perfect diagonal.
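The binning logic is short in code. A minimal sketch with made-up data (binned here on the stated confidence, which is how it is usually computed; the diagonal comparison is the same either way):

```python
# Minimal calibration sketch on made-up data: bin by stated confidence,
# then compare each bin's observed accuracy to the stated level.
import numpy as np

rng = np.random.default_rng(7)
n = 500
confidence = rng.choice([0.5, 0.6, 0.7, 0.8, 0.9, 1.0], n)
# Hypothetical answers that are roughly, but not perfectly, calibrated:
correct = rng.random(n) < (0.4 + 0.5 * confidence)

for c in np.unique(confidence):
    in_bin = confidence == c
    print(f"stated {c:.0%}: observed accuracy {correct[in_bin].mean():.0%} "
          f"(n = {in_bin.sum()})")
# Perfect calibration would put every bin on the diagonal:
# stated confidence == observed accuracy.
```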

If the two don’t track, and confidence is used as a proxy for reliability, it is clear why this becomes forensically interesting.

It isn’t that people are completely clueless, or that confidence always fails to track accuracy. In fact, in the first paper we published, participants were relatively well calibrated, although not perfectly so – at least for some of the free-recall questions. But on the yes-no questions they were really lousy. I think the clearest interpretation of the results was that the participants were guessing, and that their confidence ratings suggest they knew they didn’t know.

At this point, Farhan was diving into the research on central and peripheral pieces of information in recall. Loftus and Christianson have looked at this, as have many others (you can read all about it in our paper). One thing he noticed in the responses was that participants seemed fairly good at the gist in free recall, but not terribly good at detail information, such as the color of t-shirts and the like – all questions that were part of the fixed questionnaire. Perhaps this was the key to why meta-cognitive performance varied so much.

What he did, first, was to subdivide all of the statements in each recall into three kinds: what we first called forensically central but changed to action information (over the lengthy review process); what we first called forensically peripheral but changed to detail information (t-shirt color, hair color, etc.); and non-forensic information. The action and detail information is the kind of information that it is important to get from a witness, because you cannot get it any other way (what happened, how things looked), whereas the non-forensic is stuff you can go back and find again (what the bus stop looks like, how many buildings, etc.).

First, people recalled way more of the action information, and were both better at it and better calibrated on it. But, of course, this was going back and sorting through data we had already collected for other purposes.

So we ran a second, much smaller study, where participants again saw the movie, then answered a new set of focused questions and confidence-rated their responses. This time the questionnaire contained both detail questions and action questions. And again we saw the same pattern: they were better at remembering actions correctly (and more confident about them) than details.

Which is a small finding, and perhaps not entirely novel, but nevertheless neat.
