Painter’s great­est achieve­ment is to make the flat sur­face show con­vex bod­ies pro­trud­ing from this plane (Leonar­do da Vin­ci, 1792/2006)

We live in a space defined by three dimensions: horizontal (right or left), vertical (up or down) and depth (front or back). In this space, we see objects and can determine their location on each of these dimensions relatively accurately. Of course, this is true provided that, firstly, our visual system functions well along its entire length, from the retinas of both eyes to the dorsal pathway; secondly, we occupy a central (egocentric) position in this space; and thirdly, we are sufficiently experienced in moving in this space (Milner and Goodale, 2008).

The horizontal and vertical dimensions define a plane perpendicular to the visual axis, and depth runs along a line parallel to it. Seeing a scene in three dimensions therefore means seeing a set of planes perpendicular to the visual axis that lie at different distances from us along the third dimension. The minimum condition for seeing an object in space is therefore to see any of its surfaces in such a way that its isometric projection on a plane perpendicular to the visual axis can be described by non-zero values on both the horizontal and vertical dimensions. We do not see those objects whose isometric projections on a plane perpendicular to the visual axis assume a zero value on either of these two dimensions.

The situation with depth vision is a bit more complicated. Like everything that evolution has equipped us with, the mechanisms of seeing in depth primarily serve to ensure safety, i.e. survival. In natural conditions, the visual system pays special attention to the area up to approx. 6 m around us. The intrusion of someone or something into this space requires precise reactions, e.g. defence, and there is no room for mistakes in assessing the distance of that person or object from us.

In an even smaller area around us, that is, in the space delimited by the reach of our hands, high precision is necessary in assessing the distance of objects or their fragments from us. It was probably thanks to the development of the ability to manipulate objects precisely at a short distance from the eyes that it became possible for us to manufacture precise tools and sculptures. Perhaps the carving of the first Venus in the Hohle Fels cave, 35–40 thousand years ago, was merely a by-product of the warlike nature of our ancestors, aimed at developing modern killing technologies, but how beneficial it was for the development of human spiritual culture. Either way, for the nearest space around us there are several neurocognitive mechanisms that register the position of objects in the depth of the observed scene. The most important of them are related to binocular vision.

Awareness of the order of objects along the third dimension at shorter distances, i.e. roughly up to the limit of 6 m, is primarily the result of binocular vision, and in particular of so-called binocular disparity and vergence. The experience of seeing the spatiality of the visual scene arises from neurophysiological mechanisms that process data on the responses of photoreceptors excited by the two-dimensional images projected on the retinas of both eyes of the observer. If there are differences in photoreceptor response between the projections of these images on the spatially corresponding receptive fields of both retinas, then in the cortical sections of the visual pathway these data are integrated into one image containing information about the order in depth of the planes seen.

It is not difficult to guess that the first signals regarding binocular disparity are analysed in the V1 cortex, in which data from the right and left eye are organised within one anatomical structure. Research on this subject was conducted several years after Hubel and Wiesel’s discoveries regarding the organisation of the V1 cortex, among others by Horace B. Barlow, Colin Blakemore and Jack Pettigrew (1967). Further analyses of binocular disparity are conducted in the structures of the parietal lobes on the dorsal pathway, especially in the V5 area.

If the binocular disparity is too large, i.e. if the images projected on the retinas of both eyes of the observer are completely different, then so-called binocular rivalry occurs. In this situation, differences between the images are not interpreted as a depth indicator; only one of these images, most often the one projected onto the retina of the dominant eye, is treated as the correct one, and the other is ignored.

The second group of mechanisms associated with stereoscopic vision concerns vergence reflexes, i.e. the convergence and divergence of the eyeballs. This issue has already been discussed in the chapter on framing the visual scene. Convergent eye movement is a manifestation of greater interest in objects closer to the observer, while divergent movement indicates interest in objects slightly further away from the observer, but only up to approx. 6 m, because beyond that limit the lines of sight of the two eyes are almost parallel to each other.
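
The statement that the two lines of sight become almost parallel beyond about 6 m can be checked with a little geometry. The following is only a minimal sketch, not a model of the oculomotor system: it assumes symmetric fixation on a point straight ahead and an interpupillary distance of 6.5 cm, a typical adult value chosen here purely for illustration. The function name `vergence_angle_deg` is hypothetical.

```python
import math

# Assumed interpupillary distance in metres (illustrative typical adult value).
IPD_M = 0.065

def vergence_angle_deg(distance_m: float, ipd_m: float = IPD_M) -> float:
    """Angle between the two lines of sight when both eyes fixate a point
    straight ahead at distance_m (symmetric fixation is assumed)."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return math.degrees(2.0 * math.atan(ipd_m / (2.0 * distance_m)))

# Compare a reading distance, arm's length and beyond, and the 6 m limit.
for d in (0.3, 1.0, 6.0, 60.0):
    print(f"{d:5.1f} m -> {vergence_angle_deg(d):6.3f} deg")
```

Under these assumptions the angle falls from roughly 12 degrees at 30 cm to well under 1 degree at 6 m, so further changes in distance barely change the orientation of the eyes.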

Because in this book I focus on viewing flat images, binocular depth cues, i.e. binocular disparity and vergence reflexes, will not be discussed further. From the point of view of the issues discussed here, those mechanisms of depth vision that do not require binocularity are much more interesting. They are much more universal, because they not only allow us to assess the order of objects in depth in a three-dimensional scene, regardless of whether they are close to or far from us, but also form the basis of the illusion of the third dimension in paintings. They are the so-called non-stereoscopic or monocular depth cues.

Space seen through the eye of a Cyclops

Monocular depth cues are the basis for understanding spatial relationships in depth between objects depicted in a flat image or located in three-dimensional space, practically regardless of their distance from the observer. The accuracy with which these cues are interpreted is the result of learning how specific patterns of planes, their contours and colors, and their variability in motion, represent real relationships in three-dimensional space. This hypothesis is supported by the results of intercultural research in which the stimuli were illustrations (paintings, drawings, photos) depicting scenes that contained depth cues unfamiliar to members of other cultures (e.g. Duncan, Gourlay and Hudson, 1973; Gibson, 1988; Hudson, 1960; Jahoda, Deregowski, Ampene and Williams, 1977; Jahoda and McGurk, 1974; Liddell, 1996; 1997; Serpell and Deregowski, 1980).

Robert Serpell and Jan B. Deregowski (1980) show that introducing into a hunting scene depth cues that are obvious to people from Western European culture, such as the horizon, converging road edges and the correct proportions of the objects shown in the foreground and background (Fig. 134), by no means helps the indigenous people of Ghana, Zambia or Uganda to read its spatial layout. African participants consistently interpreted this scene as an elephant hunt.

Figure 134. Illustration from W. Hudson’s depth perception test (1960). Graphic design: P.F. based on Hudson (1960)

Depth perception in a flat painting based on monocular cues is the result of the activity of neurophysiological mechanisms that develop on similar principles to, e.g., the mechanism that enables us to separate an object from its shadow. With regard to new objects, we can easily mistake a shadow for an actual part of an object. Through numerous experiences we find out that the contour and the color of a visual object and of its shadow have slightly different properties. Perceiving, classifying and remembering them is the basis for accurate recognition of objects in the future, both in the three-dimensional world and in images.

It is similar with depth perception. We know from experience that, for example, the size of the same objects in a visual scene most often depends on their distance from the observer. And although this observation has certainly been available to humanity almost since the dawn of time, it took over 30,000 years (counting from the creation of the rock paintings in Lascaux, Altamira and Chauvet) for the principles of linear perspective to be consciously used by artists in Western culture to depict three-dimensional visual scenes in flat paintings (Alberti, 1963; Janowski, 1997). It was only at the beginning of the 15th century that Filippo Brunelleschi codified the principles of linear perspective, and through centuries of cultural transmission they entered the canon of basic monocular depth cues. Acquiring the ability to use this cue required time and training, i.e. simply learning how it reveals the spatiality of the presented visual scene in the third dimension.

Although Brunelleschi’s Renaissance solution seems to be the most obvious way of presenting three-dimensional space in a picture, it is surely not the only perspective accepted by the human mind. This is shown by the very interesting graphic and painting proposals of Dick Termes, who is fascinated not by one-point or two-point perspective but by six-point spherical perspective (Fig. 135). Both his experiments and studies of past ways of depicting space show that we do not have a good answer to the question of how perspective lines run when we view three-dimensional scenes. We cannot even agree whether these lines are straight or curved. Drawing on knowledge from optics, it is easier for us to think about straight visual axes and straight lines converging in perspective than about lines running along the surface of a sphere.

Fig­ure 135. From one-point to six-point (spher­i­cal) per­spec­tive of Dick Termes

Currently, we know many monocular depth cues that form the basis for seeing the third dimension in flat paintings. The functionality of all these cues is verified by countless daily visual and motor experiences. Looking at a flat image of three-dimensional space, we use them automatically. They create an unquestioned illusion of the three-dimensionality of the image, and it is only when the artist breaks the principle underlying one of these cues that we are forced to revise our belief in the spatiality of what we are looking at. The illusory spaces of Maurits Cornelis Escher provide excellent examples confirming this supposition (Fig. 136).

Fig­ure 136. Mau­rits Cor­nelis Esch­er, Day and Night (1938). Nation­al Gallery of Art, Wash­ing­ton, DC, USA [39.2 x 67.6 cm] 

Three groups of monocular depth cues

The first group of monocular depth cues includes those that define depth on the basis of the relationships between elements of the visual scene depicted in the image, or between those elements and its frame. These are: interposition or occlusion, i.e. the covering of one object by another; transposition (elevation), which determines the relationship between the position of objects in the image relative to the horizon and their illusory distance from the observer; and linear (as well as curvilinear) perspective, whether zero-, one- or multi-point.

The second group of cues is based on the blurring of the outlines of the objects seen, which results from the increasing density of the texture gradient or the decreasing sharpness of details. The basic rule for interpreting these cues is: the less differentiated the surface of the objects presented in the image, the greater their distance in depth from the observer.

Finally, the third group of depth cues is connected with the luminance and color of the presented objects. The most important cues in this group are chiaroscuro, aerial perspective and color separation. The distribution of light and the intensity of the shading of the surfaces of objects and of the space around them is basically interpreted in accordance with the following principle: the brighter, the closer, and the darker, the further away. The interpretation of chiaroscuro as a depth cue also depends on an assumption about the position of the light source relative to the object, without which the object does not cast a shadow.

When we look at distant objects, e.g. mountains on the horizon, we generally see them much less clearly than nearby objects, but above all we see them as if through a blue-grey haze. It is the density of this haze that defines aerial perspective. The last cue, i.e. color separation, is the most controversial one. According to Pablo Picasso, for example, colors do not take part in the coding of space at all; they are only symbolic. According to Paul Cézanne or the 17th-century academicians, on the other hand, the opposite is true: warm colors are closer to the observer and cold ones further away.

Of the depth cues listed above, occlusion is the most obvious and also the least controversial. We may have doubts whether the artist correctly reflected the perspective of the depicted space, or whether, in accordance with convention, he or she used one color or another to emphasise depth, but we have no doubt about one thing: the occluded object cannot be closer to us than the one obscuring it. The results of intercultural studies indicate that African children in the 1960s had the fewest problems with correct depth interpretation based on the occlusions presented in the pictures of the Hudson test (1960). On the one hand, we almost exclusively look at objects that are either obscured or obscure others, and we always interpret this observation as a reliable indicator of their distance from us. On the other hand, a closer look at the mechanisms responsible for understanding the relationship of obscuring reveals their complexity. For these reasons, in the next chapter I will discuss this depth cue in more detail.

The identity of objects as the foundation of seeing along the third dimension

Whatever depth cues are used, more or less correctly, to give the scene in an image three-dimensional plasticity, they all have one thing in common: they are properties of the image. In other words, when drawing or painting a picture we can use each of them to emphasise the spatiality of the scene in depth. However, how a given depth cue is interpreted is no longer a feature of the image but of the observer, or more precisely of his or her ability to perceive objects as identical regardless of perceptual conditions (Lawson, 1999).

None of the three dimensions of space is as destructive for the constancy of seeing objects as the dimension of depth. Spatiality not only affects the size of the objects seen but, above all, means that we are constantly dealing with their changing appearance. This variability is the result both of the observer taking different points of view and of occlusion, which mercilessly provides us with only fragments of complete images. The set of mental abilities that counteract the destructive effects of seeing in depth is referred to as visual object constancy; two of them are particularly relevant to images: shape constancy and size constancy. I will start discussing the issue of depth in flat images with these two.


Constancy of shape

The perception of complex shapes of objects as the same, despite the fact that they are seen from a new or unusual point of view, in unfavourable lighting conditions, or partially obscured, is called shape constancy (Palmer, 1999; Pizlo, 2008; Vogels and Orban, 1996). Shape constancy is one of the basic factors allowing us to recognise an object, a person or a relationship between them, regardless of whether their image projected on the retina at a given moment is the same as or different from the one projected earlier (Pizlo, Sawada, Li, Kropatsch et al., 2010).

The sense of shape con­stan­cy is one of the pil­lars of vision. In par­tic­u­lar it refers to the per­cep­tion of depth because the three-dimen­sion­al­i­ty of space is one of those fac­tors that sig­nif­i­cant­ly affect the change­abil­i­ty of shapes of objects, under­stood as their iso­met­ric pro­jec­tions on the sur­faces of the reti­nas. After all, we must remem­ber that although we live in a three-dimen­sion­al world in which the dimen­sion “inwards” is most often inter­pret­ed as “in front” or “in front of us”, in the act of see­ing data about it is reduced to two-dimen­sion­al reti­nal images. The expe­ri­ence of the third dimen­sion is not giv­en to us direct­ly in such a way as e.g. the expe­ri­ence of see­ing light, but it is a kind of con­clu­sion result­ing from many premis­es con­tained in flat reti­nal images.

An exam­ple illus­trat­ing the con­stan­cy of shape is the expe­ri­ence of view­ing the same object from dif­fer­ent points of view (Fig. 137 A).

Fig­ure 137 A. Illus­tra­tion of the expe­ri­ence of shape sta­bil­i­ty despite the dif­fer­ent appear­ance of the object viewed from dif­fer­ent per­spec­tives. Graph­ic design: P.F.

If we look at the contours that will most likely be extracted from the images projected on the retinas, we will see that they are fundamentally different (Fig. 137 B). This does not, however, prevent us from thinking of this object as one and the same. And this is the wonderful experience of shape constancy. It is really hard to imagine life without this ability to grasp the essence of objects despite the variability of their appearance.

Fig­ure 137 B. The con­tour ver­sion of Fig. 137 A. Graph­ic design: P.F. Pro­ce­dure: (1) Image/Mode/Lab Col­or [Lumi­nance]; (2) Image/Image Size [Res­o­lu­tion: 150]; (3) Filter/Stylize/Find Edges ; (4) Image/Image Size [Res­o­lu­tion: 300]; (5) Image/Adjustments/Brightness/Contrast [Bright­ness: 50; Con­trast 100]
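
The contour version described in the caption above was produced in an image editor. A comparable operation can be sketched in code. The following is only a rough analogue under stated assumptions: luminance is approximated with Rec. 601 weights rather than the L channel of Lab color, and edges are found with a plain central-difference gradient, which is not claimed to match the editor's Find Edges filter. The function names `luminance` and `find_edges` and the toy image are hypothetical.

```python
# Rough analogue of "luminance -> find edges" on a tiny RGB image stored as
# nested lists. This is NOT the image editor's filter; it is a plain
# central-difference gradient used only to illustrate contour extraction.

def luminance(rgb):
    """Approximate luminance with Rec. 601 weights (an assumption; the
    caption uses the luminance channel of Lab color instead)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b

def find_edges(image):
    """Return a same-sized grid of gradient magnitudes (0 on the border)."""
    lum = [[luminance(px) for px in row] for row in image]
    h, w = len(lum), len(lum[0])
    edges = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (lum[y][x + 1] - lum[y][x - 1]) / 2.0  # horizontal change
            gy = (lum[y + 1][x] - lum[y - 1][x]) / 2.0  # vertical change
            edges[y][x] = (gx * gx + gy * gy) ** 0.5
    return edges

# Toy 5x5 image: dark left part, bright right part (one vertical contour).
BLACK, WHITE = (0, 0, 0), (255, 255, 255)
image = [[BLACK, BLACK, WHITE, WHITE, WHITE] for _ in range(5)]
edges = find_edges(image)
```

In this toy image the gradient is large only in the two columns next to the dark-to-bright boundary and zero inside the uniform bright area, which is the sense in which such a filter keeps contours and discards surfaces.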

Shape constancy generally does not depend on how large the images projected by the same object onto the retina are, e.g. as a result of viewing it from different distances. Of course, growing distance between the object and the observer is one of the factors that reduce visual acuity, and beyond a certain threshold the size of the image projected on the retina starts to have a negative effect on shape constancy. In the 1970s, Hershel W. Leibowitz, Stephen B. Wilcox and Robert B. Post (1978) studied the effect of image blurring on shape and size constancy. They found that an increase in image blurring reduces shape constancy but does not affect size constancy (which will be discussed in the next chapter).

Moses W. Chan, Adam K. Stevenson, Yunfeng Li and Zygmunt Pizlo (2006) found that the shape constancy of three-dimensional objects is closely related to such features as symmetry, visibility of the planar contours defining the planes, and volume. Symmetry plays a particularly important role in maintaining shape constancy. Asymmetry makes it difficult to recognise an object viewed from different points of view. Similarly, lower shape constancy is characteristic of objects in which it is difficult to identify planes and which are devoid of volume.

The three-dimensional objects shown in Fig. 138 differ in terms of shape constancy. Although both are asymmetrical, object A has clearly visible planes that, when connected together, suggest it is a solid figure, and therefore that it has volume, while object B does not resemble a solid figure. It would be much easier for us to recognise object A from a different point of view than object B.

Fig­ure 138. Three-dimen­sion­al objects pre­sent­ed on a plane, char­ac­terised by high shape con­stan­cy (A) and low shape con­stan­cy (B)

Zygmunt Pizlo, Yunfeng Li and Robert M. Steinman (2008) claim that, for three-dimensional objects viewed from a short distance, the difference between the images on the retinas of both eyes registered by the brain (binocular disparity) is neither necessary nor sufficient for shape constancy. The basis for shape constancy is primarily the listed features of these objects as seen in a two-dimensional plane.

It is worth recalling one more concept at this point, which treats shape constancy as a principle of perceptual categorization. It is the so-called canonical perspective from which an object is seen. The creators of this concept are Stephen E. Palmer, Eleanor H. Rosch and Paul Chase (1981). They presented subjects with photographs of natural objects captured from different points of view and asked them to recognise those objects. It turned out that for each object there is a most preferred point of view, from which it is most accurately recognised. From this point of view, the appearance (shape) of the object reveals the most important details necessary to recognise it, and this is also the point from which the object is most often seen in everyday life (Blanz, Tarr and Bülthoff, 1999). In other words, the shape of an object presented from a canonical perspective is a prototype shape, with which the mind compares the currently seen shapes of objects when determining their identity.

Two brain structures lying on the ventral pathway are responsible for the perception of complex shapes of things as identical, or in other words for their constancy: the inferior temporal gyrus (IT) and the fusiform gyrus lying just above it on the inner side of the cortex, to which visual data flows mainly from the V4 field (Tanaka, 1993; 1996). It is now also known that the constancy of seeing the shapes of letters and words is associated with the activity of neurons in the fusiform gyrus, but only on the left side of the brain, in the visual word form area (VWFA) (Dehaene and Cohen, 2011). It is also known that the constancy of seeing the shapes of faces, which underlies their recognition, is related to the activity of neurons located in the fusiform gyrus on the right side of the brain or on both sides of it (Farah, 1996; Feinberg, Schindler, Ochoa, Kwan et al., 1994; Kanwisher, McDermott and Chun, 1997; McCarthy, Puce, Gore and Allison, 1997; Fig. 139).

Figure 139. Brain structures responsible for the constancy of shape of the objects seen, including letters and words, and faces. Graphic design: P.A. based on Tanaka (1996), Dehaene and Cohen (2011) and McCarthy, Puce, Gore and Allison (1997)

Neu­rons in the area of IT and fusiform gyrus react sim­i­lar­ly to the same shapes of seen objects regard­less of whether their con­tours are encod­ed by the con­trast of bright­ness, tex­ture diver­si­ty, or move­ment (Sáry, Vogels, Kovács, and Orban, 1995), regard­less of the size and posi­tion of these objects in the field of vision (Ito, Tamu­ra, Fuji­ta, and Tana­ka, 1995; Logo­thetis, Pauls and Pog­gio 1995), and irre­spec­tive of whether they are ful­ly vis­i­ble or par­tial­ly obscured (Kovács, Vogels, and Orban, 1995; Missal, Vogels, and Orban, 1997).

Alan Slater and Victoria Morison (1985) and Alan Slater, Scott P. Johnson, Elizabeth Brown, and Marion Badenoch (1996) studied the interest of young children in the presented shapes of various figures. They determined that the mechanism responsible for coding complex shapes and, consequently, for the constancy of seeing them, is innate. Even in infants only a few days old they found habituation, i.e. a loss of interest in known shapes and increased interest in figures whose shapes were unknown to them.

At the other extreme, due to the degenerative processes of the aging brain, this mechanism may become gradually dysfunctional, which is manifested by the deepening symptoms of shape agnosia, i.e. the inability to correctly recognise and reproduce the things being seen (Farah, 1990; Tippett, Blackwood and Farah, 2003). The breakdown of the shapes of things seen in the course of neurodegenerative disorders is perfectly reflected in the self-portraits of William Utermohlen (Fig. 140 A-D).

Fig­ure 140 A. William Uter­mohlen, Self Por­trait with Saw (1997). Galerie Beck­el Odille Boï­cos, Paris, France [35.5 x 35.5 cm]
Fig­ure 140 B. William Uter­mohlen, Self Por­trait with Easel (1998). Galerie Beck­el Odille Boï­cos, Paris, France [35.5 x 25 cm]
Fig­ure 140 C. William Uter­mohlen, Erased Self Por­trait (1999). Galerie Beck­el Odille Boï­cos, Paris, France [45.5 x 35.5 cm]
Fig­ure 140 D. William Uter­mohlen, Self Por­trait Draw­ing (2000). Galerie Beck­el Odille Boï­cos, Paris, France [40.5 x 33 cm]

The pictures in Fig. 140 are presented chronologically. They were painted by Utermohlen between the ages of sixty-three and sixty-six, while Alzheimer’s disease gradually led to the atrophy of brain structures in both hemispheres (Crutch, Isaacs and Rossor, 2001).

On the one hand, neuroscience provides many examples of painters who suffered from migraine, epilepsy, stroke or other brain damage, as well as neurodegenerative diseases. In 2006, an entire volume of the International Review of Neurobiology (volume 74) was devoted to this issue. On the other hand, contemporary art is full of examples of deliberate and even programmatic violation of the principle of shape constancy. The classics in this field certainly include painters associated with surrealism (Fig. 141), expressionism (Fig. 142) and cubism (Fig. 143). Despite their many differences, they have one thing in common: a radical departure from the typical shape of things as a means of artistic expression.

Figure 141. Salvador Dalí, The Temptation of St. Anthony (1946). Musées Royaux des Beaux-Arts, Brussels, Belgium [89.7 x 119.5 cm]
Fig­ure 142. Fran­cis Bacon, Three Stud­ies for Fig­ures at the Base of a Cru­ci­fix­ion (1944). Tate Mod­ern, Lon­don, Unit­ed King­dom [94 cm x 74 cm each]
Fig­ure 143. Pablo Picas­so, Les Demoi­selles d’Av­i­gnon (1907). Muse­um of Mod­ern Art, New York, USA [243.9 × 233.7 cm] 

Size constancy

Size constancy is the ability of the mind to accurately assess the size of perceived objects regardless of their distance from the observer, in other words regardless of how big an image they project on the retina (Gibson, 1979; Kaufman and Kaufman, 2000; Konkle and Oliva, 2011; Palmer, 1999). Size constancy is based on the knowledge and experience of the observer, which tell him or her that the perceived size of known objects is a function of the distance at which they are located. It is so obvious that we do not even realize how huge the impact of distance in depth is on the size of the images projected on the retina by objects located closer to and further away from the observer.

A pair of young people in the upper right corner of Fig. 144 (arrow A) is located no more than 15 metres behind the person in the foreground, but only bringing these two planes together gives an idea of the scale of the difference. The height of the people in the background does not exceed 2/3 of the height of the head of the person in the foreground (cf. Fig. 144, arrow B). Despite such a large disproportion between the sizes of the people photographed, we do not have the impression that the people in the background are particularly short.
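
The scale of this disproportion follows from simple projective geometry: in a pinhole-camera approximation, the projected height of an object is proportional to its real height divided by its distance. The sketch below only illustrates that relationship. The body height, head height, viewing distances, focal length and the function name `projected_height` are all illustrative assumptions, not measurements taken from Fig. 144.

```python
def projected_height(real_height_m: float, distance_m: float,
                     focal_length_m: float = 0.05) -> float:
    """Height of the projected image under a pinhole model.
    focal_length_m is an arbitrary illustrative value."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    return focal_length_m * real_height_m / distance_m

# Hypothetical scene: a head (0.23 m high) seen from 1.5 m in the foreground,
# and a whole person (1.70 m tall) standing 15 m further back.
head_near = projected_height(0.23, 1.5)
person_far = projected_height(1.70, 1.5 + 15.0)
ratio = person_far / head_near
print(f"far person / near head = {ratio:.2f}")
```

With these assumed values the whole body of the distant person projects to about two thirds of the height of the near head, the same order of disproportion as the one described above, even though the real heights differ by a factor of more than seven in the opposite direction.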

Fig­ure 144. Illus­tra­tion of the sta­bil­i­ty of the size of peo­ple at dif­fer­ent dis­tances from the observ­er into the visu­al scene. Graph­ic design: P.F.

The sense of size constancy across distance in depth is, however, an even more extraordinary experience. Here is a painting by Paolo Uccello, Scenes from the Life of the Holy Hermits, in which the distances between the different planes are at least several dozen metres (Fig. 145). In each of these planes we see characters, and we accept their sizes as easily as in Fig. 144. The difference between the two pictures is, however, fundamental. The figure of a kneeling hermit visible from a distance of 60–80 metres in the farthest plane is about half the height of the largest figure in the foreground, the visionary hermit sitting on the bench on the left. If this scene were photographed in nature, the kneeling hermit would become a small point on the horizon. And then we would certainly also accept this scene as correctly reflecting the size of the people and objects in it. So, what about size constancy in paintings?

Fig­ure 145.  Pao­lo Uccel­lo, Scenes from the Life of the Holy Her­mits (1460). Gal­le­ria dell’Accademia, Flo­rence, Italy [81 cm x 110 cm]

Talia Konkle and Aude Oliva (2011), in addition to the idea of size constancy, introduce the concept of canonical visual size, analogous to the concept of the canonical shape of objects perceived from a particular point of view. While the size constancy of a known object allows the observer to assess the distance between himself or herself and that object in a real situation, the canonical visual size of this object is its most preferred size in an image. These two concepts are not contradictory, although they do not have to be compatible with each other. When viewing an image, the observer can use knowledge of the actual size of the objects to assess spatial relationships in depth. At the same time, this knowledge is subject to a specific adjustment, precisely because the observer is dealing with a representation in an image and not a real situation in the three-dimensional world.

Two effects found by Konkle and Oliva (2011) reveal the specifics of this adjustment. The first effect concerns the relativization of the size of the drawn object to the space of the image, determined by its frame. It turns out that although a fish, a chair and a truck, each drawn on a separate sheet of paper of the same size, occupy a progressively larger area, the increase in drawn size is not linear (as it would be in reality) but logarithmic (Fig. 146 A).
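
The contrast between linear and logarithmic growth can be made concrete with a small numerical sketch. Everything in it is an assumption made for illustration: the real-world sizes of the three objects, the coefficients `a` and `b`, and the function name `drawn_fraction_log` are hypothetical and are not values reported by Konkle and Oliva (2011).

```python
import math

def drawn_fraction_log(real_size_m: float, a: float = 0.35,
                       b: float = 0.15) -> float:
    """Hypothetical logarithmic rule: the fraction of the sheet taken up by
    the drawing grows with the base-10 logarithm of the real-world size.
    The coefficients a and b are illustrative, not fitted to any data."""
    return a + b * math.log10(real_size_m)

# Assumed real-world sizes in metres (illustrative only).
objects = {"fish": 0.3, "chair": 1.0, "truck": 8.0}
fractions = {name: drawn_fraction_log(size) for name, size in objects.items()}

for name, size in objects.items():
    print(f"{name:5s} real {size:4.1f} m -> drawn fraction {fractions[name]:.2f}")

real_ratio = objects["truck"] / objects["fish"]       # growth in reality
drawn_ratio = fractions["truck"] / fractions["fish"]  # growth on paper
print(f"truck/fish: real x{real_ratio:.1f}, drawn x{drawn_ratio:.1f}")
```

Under such a rule the truck is still drawn larger than the chair, and the chair larger than the fish, but a difference of more than an order of magnitude in real size is compressed to less than a doubling on paper.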

Figure 146 A. The effect of logarithmic increase in size of the drawn object in relation to the fixed paper size. Graphic design: P.F. based on Konkle and Oliva (2011)
Figure 146 B. The effect of proportional increase in size of the drawn object in relation to the changing paper size. Graphic design: P.F. based on Konkle and Oliva (2011)

The second effect described by Konkle and Oliva (2011) concerns the relationship between the size of the image surface determined by its frame and the size of the drawing of the same object. The proportion of the size of the car in Fig. 146 B to the surface of the sheet of paper on which it was drawn is nearly the same for every sheet, although with each subsequent paper size the image surface grows slightly more than the drawn object. It can therefore be expected that for each object depicted in a drawing with a given surface there is a certain most preferred (accepted) size.

To understand the relationship between the size of an object depicted in an image and the size of the image surface, it is also necessary to realize that the actual picture frame, usually rectangular, is not the only reference frame for the objects depicted in it. The size of the people, both in the photo in Fig. 144 and in the painting by Uccello in Fig. 145, can be viewed not so much through the prism of seeing space in depth as through the prism of many planes, in some sense independent of each other, or in other words, images within the image. Each of these sub-images has its own space of a specific size, and the size of the objects it contains is relative to that space.

Let’s use the example of scenes from the Scenes from the Life of the Holy Hermits, which I took from three different planes of the whole painting (Fig. 147). It turns out that the proportion of the size of the characters to the size of the space of each scene is more or less the same, regardless of the plane in which they are painted. What is more, the principle of size constancy is respected within each of these scenes: more distant objects are relatively smaller than closer ones.

Figure 147. Three fragments taken from three planes of the painting by Paolo Uccello, Scenes from the Life of the Holy Hermits. Graphic design: P.F. based on Fig. 145

The possibility of manipulating the relationship between size constancy and canonical visual size is of particular interest to some outstanding contemporary painters. Chuck Close and Andy Warhol experimented with the canonical visual size of the faces portrayed in their paintings, rescaling them to dimensions very different from typical photo formats or photocopies. Not only can the image surface exceed 10 m², but the face painted on it also fills it almost entirely. The full artistic effect is also achieved because these paintings are viewed from a short distance in the museum space.

Fig­ure 148 A. Andy Warhol, Frank (1969). The Min­neapo­lis Insti­tute of Arts, Min­neapo­lis, Min­neso­ta, USA [274 x 213 cm]
Fig­ure 148 B. Andy Warhol, Mao (1973). Ham­burg­er Bahn­hof Muse­um für Gegen­wart, Berlin, Ger­many [448 x 346 cm]

An example of a radical breach of the principle of size constancy in painting is, in turn, provided by René Magritte’s paintings from the series Surreal interior, in which the proportions of easily recognisable, relatively small objects are seriously disturbed, both in relation to the space in which they are presented and in relation to the space determined by the picture frame (Fig. 149). Removing the normal relationship between objects and the spaces in which they are found, combined with the unusual titles of these works, is a source of completely new meanings.

Fig­ure 149 A. René Magritte, The Lis­ten­ing Room (1952). Menil Col­lec­tion, Hous­ton, Texas, USA [55 x 45 cm]
Fig­ure 149 B. René Magritte, The Tomb of the Wrestlers (1960). Pri­vate col­lec­tion [35 x 45 cm]

As with shape, the mechanism responsible for the perception of size constancy is most likely innate. This hypothesis is supported by the results of research on the habituation of newborn babies (Granrud, 1987; Slater, Mattock and Brown, 1990) and 4–6-month-old infants (Day and McKenzie, 1981; Granrud, 2006; McKenzie, Tootell and Day, 1980) to the size of the objects shown to them. Developmental studies in older children (5–11 years) reveal the relative stability of size constancy, with the accuracy of size judgments for objects seen at different distances increasing with age (Granrud, 2009).

Studies on the effects of image blurring on shape and size constancy led Hershel W. Leibowitz and Robert B. Post (1982) to the conclusion that, while the collection and maintenance of shape information is handled by the ventral pathway responsible for object recognition (which has since been confirmed many times in other studies), data about the size of an object are processed in the dorsal pathway, which deals with the object’s location and thus provides the basis for motor actions involving that object. Similar conclusions were reached a few years later by Hide-aki Saito et al. (1986), who found that about 15% of neurons in the medial superior temporal area (MST), located on the dorsal pathway, are sensitive to changes in stimulus size. Still, the localization of size constancy is much less certain than the neuroanatomical localization of shape constancy.

Research on patient D.F., which led Melvyn A. Goodale and A. David Milner to specify the functions of the two visual pathways, ventral and dorsal, also showed that, despite the lack of damage to the dorsal pathway, the patient had problems with size constancy when performing tasks using only one eye (Marotta, Behrmann and Goodale, 1997). It turns out that damage to the centres on the ventral pathway responsible for recognizing the complex shapes of objects, in conjunction with monocular vision, also causes difficulties in correctly assessing their size.

Allan C. Dobbins, Richard M. Jeo, József Fiser and John M. Allman (1998) also claim that structures of both the dorsal pathway (MT) and the ventral pathway, especially area V4, are responsible for the effect of size constancy. Summing up, Helen Ross and Cornelis Plug (2002) believe that there is still too little data to indicate with high probability which cortical structures are responsible for size constancy, and that we should certainly not expect them to be located exclusively in one of the two visual pathways.

Notwithstanding previous findings, the results of fMRI research conducted by Talia Konkle and Aude Oliva (2012) reveal that small objects, such as a coin, pipe, leaf or mug, activate neurons mainly in the inferior temporal cortex and lateral occipital cortex, while large objects, such as an armchair, chest of drawers or lawn mower, activate neurons mainly in the parahippocampal cortex (Fig. 150). This means that different brain structures respond to large and to small objects, just as different brain structures store and differentiate data on animate objects (faces and body parts) as opposed to inanimate ones (Kriegeskorte, Mur, Ruff, Kiani et al., 2008), on faces as opposed to other body parts (Peelen and Downing, 2005), and on visual scenes as opposed to objects isolated from them (Epstein and Kanwisher, 1998).

Fig­ure 150. Active areas of the tem­po­ral cor­tex in response to large (blue) and small (yel­low-orange) stim­uli in two pro­jec­tions of the left and right hemi­spheres of the brain: A — lat­er­al, B — low­er. Graph­ic design: P.F. based on Kon­kle and Oli­va (2012)

An interesting result of the study by Talia Konkle and Aude Oliva (2012) is also that the activity of these brain structures is independent of how large the retinal image of the objects is. In other words, regardless of whether the same object subtended a visual angle of 4° or 11° on the retina, it activated the same parts of the temporal cortex. The results suggest a significant relationship between knowledge about the size of seen objects and the activation of specific structures in the temporal and occipital cortex.

Breaking the principle of size constancy is one of the means used in visual arts to emphasise the importance of particular objects or people. In ancient Egypt, the size of the depicted characters was interpreted symbolically as a sign of importance in the social hierarchy. Here, for example, the pharaoh Akhenaten, his wife Queen Nefertiti and their daughter make an offering to the god Aten (the solar disc) (Fig. 151 A). The size of the characters depicted on the papyrus is determined not by their distance from the observer, but by their rank in the state and family.

Figure 151 A. Pharaoh Amenhotep IV (Akhenaten), his wife Queen Nefertiti and daughter, making a sacrifice to the god Aten (the disc of the sun), papyrus

Almost exactly the same pattern, in which importance is reflected by the different sizes of the characters, can be found in Western European Christian art (Fig. 151 B). Were it not for the giant Archangel Michael in the middle of the scene, one might think that the differences in size between the figure of Christ and the resurrected people result from the observer’s viewpoint (from heaven towards earth). However, the size of the people around Christ also reveals a physically unjustified disproportion.

Fig­ure 151 B. Hans Mem­ling, The Last Judg­ment (1467–1470). Nation­al Muse­um in Gdańsk [221 x 160 cm]

Incidentally, far fewer instances of breaking the principle of size constancy are found in the visual art of Asia, South America and Australia, and even in Eastern European icon art.

Among mod­ern paint­ings, we can also find exam­ples of the sym­bol­ic use of size as a guide to the impor­tance of the objects pre­sent­ed, which break the rules of size con­stan­cy. One of the mas­ters of such anec­dotes is undoubt­ed­ly René Magritte (Fig. 152).

Fig­ure 152. René Magritte, The Giant (1929). Muse­um Lud­wig, Cologne, Ger­many [54 x 73 cm] and text of the poem enti­tled La Géante (The Giant­ess) (1857) by Charles Baude­laire, trans­lat­ed by William Aggel­er (The Flow­ers of Evil. Fres­no, CA: Acad­e­my Library Guild, 1954)

Magritte’s painting The Giant is a surreal vision inspired by the poem of the same title by Charles Baudelaire (1857, translated by M. Jastrun).


Main depth indicator

Interposition, or occlusion, is the most obvious and also the most important indicator of depth, both in a three-dimensional scene and in the flat image that represents it. It means that non-transparent objects located at different depths in the viewed scene obscure one another. The obscured object is perceived as being further away from the observer than the object that obscures it (Fig. 153 A). Despite the obviousness of this experience, a closer look at the perceptual mechanism that underlies it reveals its complexity. Before the observer’s cognitive system determines which object is closer and which is further away, it must first answer a more basic question: whether what is seen is one object lying in a single plane perpendicular to the visual axis, or many objects positioned at different points along the depth dimension.

According to Gaetano Kanizsa (1979), interposition is characterised by contour intersections of the “L” type and the “T” type (intra- and inter-object). For separating objects from each other, the key element is the perception of inter-object intersections of the “T” type. The phenomenon of interposition does not occur for transparent objects (Fig. 153 B). Typical contour intersections between transparent objects are of the “X” type.

Fig­ure 153 A. Inter­po­si­tion of two non-trans­par­ent objects; blue cir­cles indi­cate inter­sec­tions of “L” type con­tours, red — “T” type (intra-object) and yel­low — “T” type (inter-object); B. Lack of inter­po­si­tion of two trans­par­ent shapes; Green cir­cles indi­cate the inter­sec­tion of “X” type con­tours. Graph­ic design: P.F.
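The three junction types can be told apart by counting the contour segments that meet at a point and checking how many of them continue each other in a straight line. The sketch below is a simplified geometric illustration, not Kanizsa’s procedure; the function names, the representation of a junction as a list of arm directions in degrees, and the 10-degree tolerance are all assumptions made for the example.

```python
def _collinear(a, b, tol=10.0):
    """True if two arm directions (in degrees) point in roughly opposite directions."""
    return abs((a - b) % 360.0 - 180.0) <= tol

def classify_junction(arm_angles, tol=10.0):
    """Label a contour junction as 'L', 'T', 'X' or 'other'.

    arm_angles lists the directions (degrees) of the contour segments
    leaving the junction point.
    """
    arms = [a % 360.0 for a in arm_angles]
    # Count pairs of arms that form one straight, continuous contour.
    straight_pairs = sum(
        _collinear(arms[i], arms[j], tol)
        for i in range(len(arms))
        for j in range(i + 1, len(arms))
    )
    if len(arms) == 2 and straight_pairs == 0:
        return "L"  # a corner: two arms, no straight continuation
    if len(arms) == 3 and straight_pairs == 1:
        return "T"  # one contour ends against a continuous one
    if len(arms) == 4 and straight_pairs == 2:
        return "X"  # two continuous contours cross each other
    return "other"

if __name__ == "__main__":
    print(classify_junction([0, 90]))            # corner of a single shape
    print(classify_junction([0, 180, 90]))       # contour stopping at an edge
    print(classify_junction([0, 180, 90, 270]))  # two contours crossing
```

In a “T” junction the straight pair of arms is usually read as the edge of the nearer, obscuring object, while the single arm that stops there belongs to the obscured one; in an “X” junction both contours continue, which is consistent with the transparency case in Fig. 153 B.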

The issue of inter­po­si­tion is most often con­sid­ered in two contexts:

  • Gestalt prin­ci­ples of per­ceiv­ing the fig­ure and back­ground, which define the rela­tion­ship between the obscur­ing and the obscured, and
  • per­cep­tu­al com­plete­ness of what is obscured and the cog­ni­tive abil­i­ty to recon­struct the unseen part.

To under­stand what essen­tial­ly the phe­nom­e­non of inter­po­si­tion is, you must first remem­ber how the visu­al sys­tem recog­nis­es the shapes of indi­vid­ual things in the visu­al scene. The basis for see­ing each object is to iso­late it from the back­ground and sep­a­rate it from oth­er objects in the scene based on the planes and con­tours that sur­round it.

How­ev­er, while the neur­al mech­a­nisms that allow for the iden­ti­fi­ca­tion of con­tours are well known, we know much less about the neur­al mech­a­nisms that are respon­si­ble for deter­min­ing on which side of the con­tour the object is and on which side is the background.

Back in 1988, David Hubel wrote: “Many peo­ple, includ­ing myself, still have trou­ble accept­ing the idea that the sur­face [lying between the con­tours of seen objects — P.F] is not able to stim­u­late neu­rons in our brain – that our aware­ness of the inte­ri­or as white, black or col­ored, depends only on the cells sen­si­tive to the edges” (Hubel, 1988, p. 87). Gestalt the­o­rists also noticed this dif­fi­cul­ty and pro­posed the con­cept of object’s bor­der own­er­ship, which indi­cates the plane belong­ing to the fig­ure, as opposed to the back­ground plane or oth­er objects (Kof­f­ka, 1935).

Neural basis of interposition

To this day, the neuronal mechanism responsible for assigning a border to the object lying on a specific side of it (border ownership) is not known. The results of research conducted by Hong Zhou, Howard S. Friedman and Rüdiger von der Heydt (2000) on monkeys and by Rüdiger von der Heydt, Tod Macuda and Fangtu Qiu (2005) on humans revealed, however, that at the early stages of cortical processing of visual signals, specifically in area V2, more than 50% of the cells involved in contour coding react more intensively on the inside of the presented figure than on the outside.

Philip O’Herron and Rüdiger von der Heydt (2009; 2011) presented their subjects with two types of stimuli, shown in Fig. 154. The stimuli differ in how clearly a figure is defined inside the circle. In panel A there is a white square whose contours are clearly separated from the grey background of the circle. In panel B, however, it is impossible to tell which part of the circle belongs to the figure and which part is the background. The oval lying at the junction of the planes marks the receptive field of the cell encoding the contour. Red indicates high activity of neurons in the receptive field in area V2, and blue indicates low activity.

Fig. 154. Stimuli used in research on the identification of the border ownership of objects. A: an easily recognizable figure (square) against the background of a dark grey circle; B: a circle divided into two equal parts, without any indication of which part is the figure and which part is the background. Graphic design: P.F. based on O’Herron and von der Heydt (2009)

O’Herron and von der Heydt confirmed the results of earlier studies, according to which cells with receptive fields on the figure side were more active (red) than those on the background side (blue) (Fig. 154 A). They also found that intense cell activity on both sides of the contour in area V2 persists much longer, even for more than one second, when the figure cannot be clearly distinguished within the stimulus, as in Fig. 154 B. The results of the cited studies show that there must be specialised cells in area V2 that, when encoding a contour, also identify the side on which the figure lies, but so far we have no idea what constitutes the basis of their “knowledge”.

What connects Strzemiński, the artists from Chauvet and Leonardo da Vinci?

An inter­est­ing exam­ple of artis­tic vari­a­tion on the space sig­nalled by the con­tours of rec­og­niz­able things and col­or­ful spots are the tem­peras of Władysław Strzemińs­ki (Fig. 155 A and B).

Fig­ure 155 A. Władysław Strzemińs­ki, The Unem­ployed (1934). Muse­um of Art in Łódź, Poland [19.5 x 25.5 cm]
Fig­ure 155 B. Władysław Strzemińs­ki, Lodz land­scape (1932). Muse­um of Art in Łódź, Poland [20 x 24.4 cm]

The space of a city in which some objects obscure others is defined in Fig. 155 B using two types of contours: transparent contours, full of “X” intersections, and contours surrounding colored spots, which connect in a complicated way with each other and with a free contour line, drawn as if independently of them.

The spatiality of the scene depicted in the painting can be read in several ways. On the one hand, some houses obscure each other (interposition), while others are transparent (visible, but without planes). Some of them are indicated only by a contour line, others by colored spots. On the other hand, we can distinguish in the painting a layer formed by a free contour line that lies in front of a layer of colored spots. However, this line does not belong to only one plane. On the contrary, the contours of the windows are much closer than the chimneys on the horizon.

The situation in the painting entitled The Unemployed is even more complicated (Fig. 155 A). Contour lines and colored spots intertwine here completely freely. Nevertheless, we have no doubt that the scene represents a group of people. At the same time, the means used by Strzemiński emphasise its shapelessness, its mobility and the irrelevance of any explicit order of obscuring, features so characteristic of the idea of a crowd.

From the point of view of the task that the visual system performs when exploring the depth of a visual scene, one can find some analogy between Strzemiński’s paintings and the oldest surviving evidence of human painting activity. The photo in Fig. 156 shows a fragment of the drawings from the Chauvet cave. Most likely, they come from the Palaeolithic era, about 30 thousand years ago.

Figure 156. Cave drawings from the Chauvet cave, France, approx. 30,000 B.C.

Looking at these, as well as many other cave drawings, it cannot be clearly stated to what extent the artists who created them consciously applied the principle of interposition. On the one hand, the group of animals in the upper part of the wall is presented in such a way that we have no doubt that the artist used interposition to show the order of the animals in depth. On the other hand, the contours of two groups of animals at the bottom of the wall intertwine, and we are not so sure whether they present a three-dimensional scene. Don’t the cave drawings of animals resemble the effects Strzemiński achieved in his visual experiments? Perhaps this is a poor analogy, but it is possible that the cave drawings, like Strzemiński’s unism variations, are a manifestation of momentous discoveries in the field of presenting what is seen on the plane of the picture.

Another interpretation of these prehistoric works of art is also possible. These, as well as many other cave drawings, resemble sketches, studies of an animal’s head or body, analogous to those that fill the sketchbooks of almost all painters. Although the arrangement of the animal heads drawn by Leonardo da Vinci and by an unknown artist from thousands of years ago may suggest interposition, in fact each of them is autonomous, and their crowding next to each other is motivated by saving space rather than by an attempt to recreate the depth dimension (Fig. 157 A and B). The thing is that we are not sure about the motives for creating cave drawings.

Figure 157 A. Leonardo da Vinci, Study of Horses for The Battle of Anghiari (1503). Royal Collection, Windsor Castle, Windsor, United Kingdom [19.6 x 30.8 cm]
Figure 157 B. Sketches of horses’ and other animals’ heads from Chauvet Cave, France, approx. 30,000 B.C.

Jean Clottes and David Lewis-Williams (2009), David Lewis-Williams (2002) and Michael Winkelman (2002) claim that these are images of imagined objects (hallucinations) that were held in special worship and respect and that the artist, most likely a shaman, saw during a trance. Certainly, they included large animals, such as horses, bison, rhinos or lions. The motives for immortalizing them on the walls of the cave could therefore have little to do with the intention to depict reality, in the sense in which, for example, Bernardo Bellotto (Canaletto) painted Warsaw from different points of view. In any case, this puzzle has not been resolved to this day.

Reconstructing the unseen according to gestaltists

Another important issue related to interposition is the question of how the visual system interprets the relationship between two objects whose retinal image may suggest that one of them obscures the other. Kanizsa (1979) calls the reconstruction of the unseen amodal, emphasising that knowledge of what cannot be seen in an obscured object cannot be verified by any of the senses. Seeing incomplete objects, partly obscured by others, is one of the most common perceptual experiences, and people do not seem to find it difficult to recognize which object is obscuring and which is obscured, or what the obscured object looks like. Accurate reconstruction of the invisible parts of obscured objects is the basis for seeing their order in depth.

Con­sid­er­ing this issue, Rob van Lier, Peter van der Helm and Emanuel Leeuwen­berg (1994) con­duct­ed an inter­est­ing analy­sis ver­i­fy­ing the accu­ra­cy of the basic prin­ci­ples of per­cep­tion for­mu­lat­ed by Gestalt psy­chol­o­gists. One of them is the prin­ci­ple of good con­tin­u­a­tion, accord­ing to which the shape of the obscured part of the fig­ure is defined by extend­ing the vis­i­ble con­tour lines of this fig­ure in their direction.

A con­di­tion for good con­tin­u­a­tion is there­fore a local analy­sis of the most like­ly con­tact points of two fig­ures iden­ti­fied, for instance, on the basis of “T‑intersections.”

Seeing the two figures on the left in Fig. 158 A, people generally have no doubt that the rectangle obscures the square (solution: b), rather than touching an irregular figure along two edges (solution: a). Considering this scene in three dimensions, we are more likely to think that the rectangle is closer to us than the square, and not that the rectangle and the irregular figure lie in the same plane, touching each other at their edges.

Figure 158. A: preferred interpretation of the relationship between the two figures based on the principle of good continuation, solution (b); B: preferred interpretation of the relationship between the two figures based on the principles of similarity and regularity, solution (a), rather than good continuation, solution (b); C: preferred interpretation of the relationship between the two figures based on the principle of good continuation, solution (b), rather than similarity and regularity, solution (a). Graphic design: P.F. based on van Lier, van der Helm and Leeuwenberg (1994)

In contrast to concepts emphasising local analysis of the points where the contours of two figures meet, Gestalt theorists also stress the importance of global principles of figure perception, such as similarity and regularity, e.g. symmetry. As an example of how these perceptual principles operate in discovering the obscured part of one of the figures, Rob van Lier et al. (1994) present the illustration in Fig. 158 B. It turns out that people generally think that they see a cross and a square which, lying in the same plane, touch each other along two edges (solution: a), rather than a square obscuring an irregular figure (solution: b). It is worth noting that solution (b) is based on the principle of good continuation, but the interpretation this time is determined by the similarity and symmetry of the potentially obscured figure.

It turns out, however, that the principles of regularity and similarity cannot be accepted as a satisfactory explanation of the preferred interpretation of the likely shape of the covered part of the figure either. The third example cited by Rob van Lier et al. (1994) illustrates this difficulty perfectly. Looking at the figures on the left in Fig. 158 C, people generally prefer solution (b), which is based on the principle of good continuation, to solution (a), which relies on the principles of similarity and regularity. In this example it is easier to see two less regular and more complex figures lying one on top of the other than two regular and much simpler figures lying next to each other. To sum up, neither the local nor the global principles of figure perception formulated by Gestalt psychologists can be considered sufficient to explain the decisions regarding the preferred shape of the obscured figure.

Perceptual complexity vs interpretation memory complexity

Rob van Lier et al. (1994) sug­gest­ed that per­haps a solu­tion lies in the dis­tinc­tion between per­cep­tu­al com­plex­i­ty and mem­o­ry com­plex­i­ty of inter­pre­ta­tion. The per­cep­tu­al com­plex­i­ty of inter­pre­ta­tion refers to the process of reach­ing the pre­ferred solu­tion to the prob­lem of rela­tions between fig­ures, while the mem­o­ry com­plex­i­ty of inter­pre­ta­tion refers only to the final state, which is pre­cise­ly the solu­tion that is pre­ferred. The process of deter­min­ing whether two fig­ures over­lap and test­ing the rules that jus­ti­fy this solu­tion can be more or less com­pli­cat­ed. Sim­i­lar­ly, the solu­tion itself may be more or less com­plex, but there are indi­ca­tions that sim­pler (more eco­nom­i­cal) solu­tions are pre­ferred above all, regard­less of whether the path to them was based on sim­ple or com­plex prin­ci­ples (Hat­field and Epstein, 1985).

Van Lier et al. (1994) focused main­ly on the analy­sis of per­cep­tu­al com­plex­i­ty. How­ev­er, from the point of view of the prob­lem of inter­po­si­tion, under­stood as an indi­ca­tor of depth, the issue of the mem­o­ry com­plex­i­ty of inter­pre­ta­tion of the rela­tion­ship between the two fig­ures seems much more interesting.

First, although van Lier et al. (1994) introduced this concept and distinguished it from perceptual complexity, they stopped at its general definition. The concept draws on the ideas of Gary Hatfield and William Epstein (1985) and the research of Fred Attneave (1954) on the relationship between complexity/simplicity and probability and redundancy, i.e. the repetition of certain perceptual patterns.

The complexity/simplicity of the visu­al inter­pre­ta­tion of the visu­al scene reflects the like­li­hood or rep­e­ti­tion of a par­tic­u­lar solu­tion. This means that the pre­ferred solu­tions for the rela­tion­ship between the fig­ures shown in Fig. 158 are a man­i­fes­ta­tion of a sim­pler, i.e. more like­ly mem­o­ry inter­pre­ta­tion, refer­ring to pre­vi­ous expe­ri­ences with sim­i­lar fig­ures. In this con­text, it is worth com­ment­ing on the pref­er­ence of solu­tion (b) in Fig. 158 C. Instead of refer­ring to the results of com­plex analy­ses of the per­cep­tu­al com­plex­i­ty of both fig­ures and their poten­tial rela­tion­ships, it is enough to pay atten­tion to the fact that the prob­a­bil­i­ty of the visu­al scene, in which the shapes of the two fig­ures per­fect­ly match each oth­er, as in solu­tion (a), is much low­er than the prob­a­bil­i­ty of the scene, in which two more com­pli­cat­ed shapes just cov­er each other.

Secondly, it is also worth noting that most experiments on predicting the shape of obscured parts of figures are based on the analysis of two-dimensional figures, sometimes familiar ones, like a square, triangle or circle, and sometimes unfamiliar ones, invented solely for the study. In natural life situations, and especially when viewing images, we usually deal with known objects (perhaps except in abstract paintings). This means that the problem of restoring invisible parts of objects should be considered in the context of the concept of shape constancy. According to this principle, a partially obscured object is still the same object, and it can be expected that the parts that are visually accessible are sufficient to evoke its complete representation in memory. This hypothesis is supported by the results of studies showing that viewing partially obscured objects activates exactly the same brain structures on the ventral pathway, especially in the area of the inferior temporal gyrus, as viewing their complete versions (see e.g., Kovács, Sáry, Köteles, Chadaide et al., 2003; Kovács, Vogels and Orban, 1995; Missal, Vogels and Orban, 1997).

Phidias certainly knew about it, but did René Magritte?

Looking at the reliefs on the frieze of the Parthenon, we have no doubt that, when carving them, Phidias knew exactly what interposition was and how to use it to create the illusion of depth in a bas-relief (Fig. 159). Although, with the exception of the foreground characters, all the others are only signalled by larger or smaller fragments, the image is fully understandable and logical. Most likely, this is because the visible fragments are sufficient to elicit complete representations of the figures in the memory of observers. Interposition, therefore, appears as the result of a discrepancy between the complete representation of an object induced in memory on the basis of its part and its incomplete image projected on the retina. This discrepancy is interpreted by the perceptual system as an indicator of depth.

Depict­ing the depth on a flat image using inter­po­si­tion is one of the most obvi­ous and most com­mon­ly used tech­niques in visu­al arts.

Figure 159. Phidias, Riders (approx. 440 B.C.). Fragment of the northern frieze of the Parthenon, Athens, Greece. The British Museum, London, United Kingdom.

For modern painters, however, interposition is also an excuse to experiment. René Magritte painted an Amazon in a way that breaks the rule of interposition and turns it into an unreal phenomenon from the world of dreams (Fig. 160). The Amazon penetrates the third dimension in the same way that the needle of a careless seamstress pierces the canvas, regardless of the order of the vertically arranged threads. As a result, some of them undergo unnatural bends. And although this is no major obstacle to reading the image correctly, such a representation of the visual scene nevertheless forces the viewer to reflect on the nature of space, and especially on its depth dimension.

Figure 160. René Magritte, La Carte Blanche (1965). National Gallery of Art, Washington, DC, USA [81 x 65 cm]

The emi­nent Ital­ian psy­chol­o­gist and artist, Gae­tano Kanizsa (1985) also drew atten­tion to the paint­ing by Magritte and in an arti­cle devot­ed to the rela­tion­ship between vision and think­ing, he pub­lished his, to put it mild­ly, some­what coarse ver­sion of La Carte Blanche (Fig. 161), not­ing in it an anal­o­gy to the cur­va­tures we know — as he said — from the views of woven baskets.

Figure 161. Photo montage by Gaetano Kanizsa, inspired by Magritte’s painting. Graphic design: P.F. based on Kanizsa (1985)
