How direct is social perception? A comment on Abramova and Slors (2015)

Students constructing a staircase, each working separately on a particular part; Abramova and Slors (2015) label such activity as “distributive action coordination”. (Image source)

It has become trendy recently to talk about social perception as a direct process. Or, at least, I gather it has become trendy, because the journal Consciousness and Cognition has a special issue coming out on the subject. Most of the articles appear to be playing the game of taking a novel phrase, in this case “direct social perception”, and trying to cram that phrase into the author’s preferred psychological description scheme, be that Bayesianism, theory-of-mind cognitivism, or some other approach. Each to their own. I don’t have anything much to say about these articles. There was one paper that caught my eye, though. It is written by Ekaterina Abramova and Marc Slors (A&S), and it attempts to give an account of “simple action coordination” in explicitly Gibsonian terms—in terms of individual actors, or agents, who, through their actions, “shape the field of affordances for another agent”.

This strikes me as a move in a useful direction, and one to be encouraged. It is also one that immediately raises a whole set of conceptual issues which demand careful thought.

I do not want to critique the paper itself, as such. The proposals presented are quite preliminary in nature (the authors themselves claim only that they “have tentatively made an indirect case in favor of [direct social perception] in simple action coordination”). Instead, I want to take issue with the way the authors seem to be interpreting the concept of affordances. This will, I hope, be illustrative of some of the issues that arise when we appeal to affordances in trying to explain social phenomena; these issues are general and go well beyond this particular paper. I want to suggest that A&S’s reading of the concept of affordances is unduly narrow. As a consequence of this, they end up maintaining certain notions which I believe we’d be better off simply abandoning. Specifically, they maintain that coordinated activities involving multiple actors must be supported by a “shared intention”; and they appear to be committed to the notion that acting with others necessarily involves “understanding” others in some sense—e.g., “recognizing” that certain objects afford certain actions for others. I want to suggest that there is a way to formulate the notion of affordances which allows an account of coordinated multi-actor activity in which these issues simply do not arise.

It is, however, worth noting a couple of things about the paper up front. Firstly, the phrase “direct social perception” is being appealed to here in a promissory manner. It is being invoked as part of a rejection of the mainstream, cognitivist account of acting with others as involving the making of inferences about others’ intentions. On the mainstream account it is assumed that intentions are a matter of private mental activity and that therefore they are inherently inaccessible to anyone but the individual whose intentions they are. A&S reckon that their affordance-based description scheme allows them to avoid having to appeal to inference-making, although it is not always clear from what they write that they have completely escaped thinking about the actions they describe in terms of inferences. Thus, in their conclusion they claim that acting with another agent requires “perceiving affordances for another person [which] involves seeing that person as an agent”, and that this “does involve attributing some basic kind of intentional relation between a person and her environment”. It is hard to interpret that “attributing” as anything other than a form of inferring.

Secondly, the authors make a distinction between two types of acting with others:

    1. distributive action coordination, in which actors work on their own tasks while avoiding getting in each other’s way, as in the photograph at the top of this post
    2. contributive action coordination, in which multiple actors use one another to complete a task which none would be able to perform independently, as in the ubiquitous sofa-carrying case; A&S write: “In such cases coordinated joint action is usually required to start simultaneously”


Four men carry a sofa together.

A rare four-man sofa-lift; according to A&S this is a case of “contributive action coordination”. (Image source)

The authors think that their affordance-based scheme can straightforwardly explain what’s going on in the case of distributive action coordination (and that this can already be modelled using dynamic field theory), but that some work is required before an account of coordination of the contributive type can be attempted. This distinction is, intuitively and descriptively, a reasonable one to make. However, I think the distinction breaks down quite rapidly when we consider the situation in terms of affordances relative to an individual actor (see this previous post in which I argue against analysing the situation primarily in terms of interpersonal relations; I claim that we should instead focus on the relations between the actor and its populated environment).

Perceiving affordances for others is not like perceiving affordances for oneself

In their attempt to give an account of contributive, sofa-carrying-type action coordination, A&S suggest that it is necessary that the actor perceive their environment not only in terms of what it affords for them, but also in terms of what it affords for others. They cite the following passage from James Gibson’s chapter introducing the “theory of affordances” (Gibson, 1979, p. 141):

The child begins, no doubt, by perceiving the affordances of things for her, (. . .). But she must learn to perceive the affordances of things for other observers as well as [for] herself.

As with a number of the remarks that Gibson makes in that chapter, this one can be read in a couple of different ways:

  1. Perceiving the affordances of things for others requires one to perceive that, from where the other person is standing, that person’s environment affords action X
  2. Perceiving the affordances of things for others requires one to attend to and discriminate a correspondence, or fit, between two pieces of structure in one’s own (the perceiver’s) environment: a correspondence between the object and the other actor

The first reading appeals to what in the cognitive tradition is called other-modelling. A&S appear to be leaning towards the first reading: “in the plank-lifting case, the field of affordances of a person standing at one end of a plank is changed when she sees another person at the other end of the plank. The plank becomes ‘jointly liftable’ for her because she recognizes the availability of exactly the same affordance for the other person as well.” This seems to sit uneasily with any claim to be pursuing a direct perception account of social activity, because other-modelling adds an inferential step between what is seen (what is specified in the optic array) and what is “perceived”.

The second reading is more interesting, I think. Here there is no requirement to put oneself in the place of the other. On this reading, perceiving affordances for others is nothing like perceiving affordances for oneself. And how could it be? If affordances are animal-relative, as we are told is the case, then we can only directly perceive our own environment in terms of what it affords to us as individuals. Any notions I might have about how the world looks to you are a matter of conjecture on my part. What I perceive is an environment (my surroundings) which happens to be populated with other actors, whom I perceive in relation to all the other objects in my environment. I might perceive you and the sofa as objects that I can potentially assemble in a way that affords sofa-carrying. But this is no different in kind from my perceiving the relation between a sofa and, say, one of those flat trolleys you have to push around when you get to the warehouse bit in Ikea. The trolley and sofa can likewise be assembled into a structure for sofa-transportation (over a flat surface, at least). The point is that the other person, like the trolley, is, for the purposes of the task at hand (shifting this sofa to some place else), merely usable structure in the environment.

To be clear, the claim is that perceiving is only ever of an environment that affords actions to the animal doing the perceiving. Consider an example from soccer. A striker bearing down on the opponents’ goal must pick out a particular place within the goalmouth into which to aim the ball such that the goalkeeper will not be able to reach it. We might say that this involves the striker perceiving the affordances for the goalkeeper. And in a sense it does. But this does not mean that the striker must know what it is like to be the goalkeeper. The striker need have had no experience of being a goalkeeper and need have no appreciation of what the situation looks like from the goalkeeper’s point of view. The striker perceives the situation in terms of what kinds of action will most likely result in a goal, that is, in terms of what kinds of shot the goalkeeper is unlikely to reach. The goalkeeper is a mere obstacle, to be negotiated, evaded, got the better of.

Note that on this way of thinking about affordances it does not make sense to talk of what the environment affords for the collective, or to talk of “fields of affordances for dyads or groups”. Groups do not experience their surroundings or perceive affordances; only individual actors do.

Affordances make “shared intentions” unnecessary

An American beaver in Alaska lifts its head above the dam.

Beavers don’t waste their time worrying about whether or not the other beavers in the colony are committed to a shared intention for dam-maintenance. (Image source)

In the abstract for their paper, A&S write: “Coordination ensues, we argue, when, given a shared intention, the actions of and/or affordances for one agent shape the field of affordances for another agent” (my emphasis). They ask us to simply accept that coordination requires a pre-existing shared intention. The problem here is that, if you share an intention with someone (whatever that means), then you must already be coordinated with that person in some sense. Coordination does not “ensue”; it has already been presupposed. The question of how that coordination came about has been begged.

Given how opaque this notion of “shared intentions” is, there is good reason to want to avoid invoking it in the first place. Fortunately, the affordance concept is powerful enough to allow us to do this. The crucial thing is to note that the field of affordances surrounding an animal is subject to change as the internal structure of the animal changes.

Consider the beaver. Beaver dams are maintained by members of the beaver colony. This might be taken as a case of what A&S call “distributive action coordination”: each individual beaver takes part in building and repairing a structure that is beneficial to all. But there is no need to appeal to shared intentions here. (And since we are talking about a non-human animal here, we are naturally less inclined to do so.) The beaver’s motivation to repair the dam is explained in this delightful Attenborough clip. Beavers are apparently averse to the sound of running water, so when they hear the sound of water rushing out of a damaged section of the dam they are drawn towards that location and immediately start work on the repairs. Put another way, the sound of running water triggers an internal change in the beaver that reorients the animal’s activity.

Or consider the sofa-carrying case again. It is not the case that the actors involved must initiate their actions “simultaneously”. In fact, there is no natural start point to the action. To understand how the actors arrived at the position of being about to lift this particular sofa it is necessary to look backwards and ask: what happened before now that led to the present situation? Perhaps I am moving house and you agreed a couple of weeks ago to help move the furniture. Why not take that as the starting point of the action, then? When you agreed to help, you became oriented towards the furniture in a particular way: there was a change in your internal structure such that you went from being neutral with reference to this furniture to being action-oriented relative to it. The furniture became a set of objects to be moved. And as we now stand at either end of the sofa, the idea that we might need to establish a shared intention before we can get this bulky object off the ground simply does not enter into consideration. We both became object-moving devices some time earlier, and this is manifest in the fact that we are both here, and in the fact that we have probably already shifted a bunch of objects out of the room.

Or we could take the explanation further back and ask how it was that we learned to lift bulky objects in coordination with others in the first place.

The point is that, appropriately understood, the concept of affordances simply obviates the need to think about these situations in terms of shared intentions. It is perhaps useful here to make a distinction between “action” and “task”. Action has no start point. Actors do not simply engage in routine performances, or scripts, that have a clearly defined start and end point. Action is continuous: everything the actor does is part of an ongoing process in which the actor changes its environment and in doing so changes its own internal structure. I have argued (Baggs, 2014) that the task, by contrast, is an analytical device that allows us to artificially break up this stream of action into tractable chunks that can be studied; these chunks need to have a clearly defined start and end point and a characteristic mode of transition between the two. The outfielder problem is a good example: we can define the start (the ball being launched), the end (the catch), and the transition (the parabolic trajectory of the ball through the air), and this is very useful as a guide for what we should measure. But we should not make the mistake of thinking that the task as we have defined it is the same as the action from the perspective of the actor. In the context of action in multi-actor settings, such a narrow definition creates unnecessary problems. If we assume that the action itself starts at the point that we happen to start observing or measuring, i.e. at the point when we are already standing at either end of the sofa, then it looks like our behaviour must be structured by some hidden “shared intention” which is not itself perceivable. What is overlooked is the fact that our internal structure has been shaped by previous action. If the affordances for lifting the sofa are directly perceivable it’s because we perceive the situation we are in relative to the type of animal we have become.

“Direct social perception”: a conceptual blind alley?

A note of caution is in order about this phrase, “direct social perception”. The existing phrase, “direct perception”, at least as Gibson uses the term, is meant to denote a relationship between an actor and its environment. Here is a description from Turvey, Shaw, Reed and Mace (1981, p. 239):

Gibson argued that the proper “objects” of perceiving are the same as those of activity. Standing still, walking, and running are all relations between an animal and its supporting surface. Though not always explicitly recognized … the supporting surface is just as much an essential constituent of these activities as, for instance, legs; and useful perceiving involved in controlling posture and locomotion must be directed toward the same surface. Thus it would seem that a two-term relation involving the same surface or ground can exist in both cases: an animal *runs* on the ground and an animal *sees* the ground. This much should be common sense. There is no thing *between* the animal and the ground in the relation. This is what Gibson has always meant by direct perception and it is the same as what one would mean by direct action if one were discussing activity.

Even if one may be inclined to take issue with some of this, it is at least clear that direct perception means something: it refers to a particular way of thinking about what’s going on between the animal and its environment.

This is not the case with the phrase “direct social perception”. This is because “social” is a descriptive category, introduced by a third-party observer or an analyst. A situation is described as “social” if it involves multiple actors. But just because we can describe a situation as “social”, we are not necessarily licensed to assume that the “socialness” of the situation must play a causal role in our eventual explanation of the actors’ behaviour. (See the parallel argument here for the phrase “joint action”.) We might describe the beaver as a social animal, yet the beaver does not need to appreciate its own status as a social animal in order to fix its dam.

It may be prudent to avoid the phrase “direct social perception” altogether, on the grounds that the inherent mixing of perspectives—descriptive and explanatory—is bound to lead to confusion further along the line. (I will however point out Eric Charles’s claim that there’s a way of reading ecological psychology according to which other minds are perceivable in the sense that the actions of others are perceived in the context of larger, ongoing patterns of behaviour.)

Some devices for avoiding conceptual traps

In lieu of a summary, I conclude with some recommendations for avoiding some of the ever-present conceptual traps surrounding the analyst of multi-actor activity.

  • Shared intentions only seem necessary as a function of the language we use to describe situations. To avoid this, consider non-human animals: the beaver, or, better, the ant. Ants don’t do shared intentions. When proposing an account of multi-actor coordinated activity, ask: how would this apply to ants?
  • Describing something as “social” or “joint” is all well and good as a description, but do not make the mistake of thinking that, just because you (the analyst) can describe something as social, the socialness of the activity must necessarily play a causal explanatory role in an account of the behaviour of any given individual actor.
  • Actions do not begin at the point that you start measuring them. Tasks, on the other hand, do. Tasks are a methodological device that establishes some more or less arbitrary start and end point (Baggs, 2014). Do not confuse the action an actor is performing with the task as you have defined it.



