Lustrous material appearances: Internal and external constraints on triggering conditions for binocular lustre

This open-access article is distributed under a Creative Commons Licence, which permits noncommercial use, distribution, and reproduction, provided the original author(s) and source are credited and no alterations are made.

Abstract

Lustrous surface appearances can be elicited by simple image configurations with no texture or specular highlights, as most prominently illustrated by Helmholtz' demonstration of stereoscopic lustre. Three types of explanatory framework have been proposed for stereoscopic lustre, which attribute the phenomenon to a binocular luminance conflict, to an internalised physical regularity (Helmholtz), or to a disentangling of “essential” and “accidental” attributes in surface representations (Hering). In order to investigate these frameworks, we used haploscopically fused half-images of centre-surround configurations in which the luminances of the test patch were dynamically modulated. Experiment 1 shows that stereoscopic lustre is not specifically tied to situations of a luminance contrast reversal between the eyes. Experiment 2 identifies a novel aspect in the binocular temporal dynamics that provides a physical basis for lustrous appearances, namely the occurrence of a temporal luminance counter-modulation between the eyes. This feature sheds some light on the internal principles underlying a disentangling of “accidental” and “essential” surface attributes. Experiment 3 reveals an asymmetry between a light and a dark reference level for the counter-modulations. This finding again suggests an interpretation in terms of an internalised physical regularity with respect to the dynamics of perceiving illuminated surfaces.

Keywords: perception of material qualities, lustrous appearances, stereoscopic lustre

1. Introduction

The perception of material properties is, almost evidently, among the most important achievements, both functionally and phenomenally, of the visual system. More than shape or colour, material appearances impart to objects their distinctive properties of, say, being soft, wet, smooth, silky, edible, or deformable. By visually perceiving material properties, specific kinds of objects and stuff can be identified and a great variety of object properties can be visually grasped that go far beyond purely visually definable attributes. They pertain, for instance, to “stability,” “tenacity,” “ruggedness,” or to attributes such as “lustrous,” “hard,” “juicy,” “dry” etc. The capacity for visually perceiving material properties is part of our more general perceptual capacity for making causal assignments and for embedding all of our experiences into various kinds of internal causal analyses. The specific kind of dispositional properties and causal ascriptions that are perceptually attainable from the sensory data is subordinated to the type of “perceptual object” that is activated by the sensory data. In the case of “surfaces” (understood as perceptual objects, not as physical ones), these dispositional properties pertain to material qualities, and the causal ascriptions to, e.g., how surfaces will appear under changes in their orientation and location, which haptic experiences will be elicited by them, and how they will behave under various kinds of interactions, both with an agent and with other objects. Attributes pertaining to material qualities are thus intrinsically transmodal in character.

The identification of the principles and mechanisms by which material qualities are brought forth poses a pre-eminent challenge for perception theory. In the history of perceptual psychology, systematic investigations of the visual perception of material qualities have been largely neglected in favour of investigations of seemingly simpler attributes such as shape or colour. This was due not only to a theoretical preoccupation with elementary attributes but also to the difficulties that one encounters in attempts to experimentally vary material appearances in a quantitative and parametric way (e.g., Christie, 1986; Fleming & Bülthoff, 2005; Sève, 1993).

Only recently, a new interest in the visual perception of material appearances has emerged, which originates mainly from problems of rendering. Rendering purposes (e.g., Dorsey & Hanrahan, 2000) also go along with a strong interest in identifying in the 2D visual input crucial parameters by which a given material appearance can be varied in a controlled way. These investigations revealed that material appearances, such as lustre, silk (Koenderink & Pont, 2002), translucency (e.g., Fleming & Bülthoff, 2005) or gloss (e.g., Fleming, Dror, & Adelson, 2003; Wendt, Faul, & Mausfeld, 2008), have exceedingly intricate triggering conditions. They can be invoked by a multiplicity of combinations of specific ranges of image parameters. As expected, some subsets of the triggering conditions, by which a given material appearance can be elicited, can be related to external regularities of the ecological physics of the “corresponding” type of material, i.e. to regularities in the way light interacts with certain types of physical surfaces. Yet, the specific class of triggering conditions for a certain material appearance, i.e. the equivalence classes of input properties that are tied to a certain material appearance, cannot be understood by exclusively focussing on external physical regularities. The triggering conditions for material appearances are given by a rather motley conglomerate of physical conditions, which has about the same degree of physical “naturalness” as, e.g., metameric colour classes have with respect to the wavelength composition of lights. While these triggering conditions partly mirror the great variety of complex optical processes of the interaction between illumination and surface properties, they are also determined by the conceptual forms or internal data types underlying computational processes (cf. Mausfeld, 2010a). 
The triggering conditions for material appearances thus do not merely mirror external physical regularities but are also moulded by internal structural constraints pertaining to the abstract data types of our perceptual systems, or in Gelb's (1929) terms, by the “structural forms,” by which the sensory input is exploited and computationally processed. The complex triggering conditions by which material appearances can be evoked can help us to gain deeper insights into the structure of conceptual forms in which material appearances figure as an internal attribute, and into the nature of the associated internal causal analyses.

The neglect of material appearances has also been due to the fact that traditional colour science has been almost exclusively based on a notion of “pure” colour, according to which colour can be studied detached from the internal coding of, say, space, texture, or from the regularities that govern interactions of light sources with surfaces (Mausfeld, 2010b). Two interesting exceptions, however, to this traditional neglect of material appearances can be found in the classic literature, namely investigations, notably by Helmholtz, of stereoscopic lustre, and Katz's inquiries into modes of appearance.

Katz (1911), following Hering, clearly recognised how intimately “colour” is interwoven with the organisation of “space.” Accordingly, Katz descriptively distinguished different “modes of appearance,” such as surface colours, volume colours, or illumination colours, each of which exhibits distinctive phenomenological characteristics and different coding properties with respect to other perceptual attributes. Katz's classification already captures in nuce basic intuitions underlying investigations of material appearances (cf. Mausfeld, 2003).

1.1. Lustrous appearances under impoverished input conditions

Lustrous appearances elude a definition in terms of a clear-cut perceptual criterion. In this respect, however, they do not differ from other perceptual attributes, such as brightness or hue. In phenomenological studies, subjects describe lustrous appearances as, for instance, “light and dark, somehow seen as if in the same place at the same time,” “a sort of blending or fusion of light and dark,” “a peculiar commingling or sifting-together of dark and light,” or as “a bulky experience of luminous greyish white” (Bixby, 1928). In the case of elementary colour appearances, Helmholtz and von Kries had recourse to purely physical aspects in order to define hue, saturation, and brightness. These surrogates, although useful for colorimetric purposes, are, in the context of perception theory, fraught with considerable problems (cf. Mausfeld, 2003, p. 389ff.). As von Kries (1882, p. 6) rightly noted, such a description of colour appearances in terms of hue, saturation and brightness “does not claim to be a natural one; without much ado we can regard it as a completely arbitrary one. Such a description is, however, a completely rigorous one, since it only refers to objective properties of the light that causes the corresponding appearances.” In the case of lustrous appearances, we also tend to resort to physical aspects rather than to phenomenological ones in order to illustrate what is meant by them. We usually characterise them by the typical causal situations that give rise to them, viz. certain aspects (beyond those pertaining to colour or lightness) of a surface which pertain to its interaction with light (note, however, that for qualification as a lustrous appearance it is irrelevant whether it has been brought forth by physical surfaces, objects on a computer screen, or painted objects on a canvas).
Thus, we refer, as a makeshift, to the kind of material that typically exhibits these kinds of appearances, and correspondingly speak, for instance, of metallic, graphite, silky, or vitreous lustrous appearances. However, the variegated class of appearances that these appearances of “lustrous” material embrace are far from homogeneous and show subtle differences. Whether this gamut of appearances is based on a common core of principles or rather mirrors principles and mechanisms of quite different nature is thus an empirical question. Because lustrous appearances are typically brought forward by specific physical properties of surfaces with respect to their interaction with light, we regard it as a reasonable starting point to assume that our perceptual system is sensitive to these regularities and associates with them a particular visual quality. We will here therefore tentatively proceed on the assumption that the phenomenological class of lustrous appearances results from a common core of principles.

For the endeavour to identify internal factors in the coding of material appearances, it is of particular theoretical interest that lustrous appearances can be elicited by highly impoverished stimulus conditions that, in particular, do not contain cues pertaining to texture or specular highlights. The most prominent among these are displays for stereoscopic lustre. Corresponding observations were first reported by Dove (1850, 1861) and immediately recognised as phenomena of great theoretical importance by Helmholtz, Brücke (1861), Wundt, Kirschmann, Bühler and many others. The phenomenon in point can be easily demonstrated by the well-known stimulus configuration (Figure 1) provided by Helmholtz (1867).

Figure 1. Stereoscopic lustre. The figure is arranged for uncrossed viewing.

Under stereoscopic viewing conditions, the binocular combination of the two line drawings of inverted luminance contrast yields a vivid lustrous appearance. Similar appearances can be produced by a great variety of different highly reduced stimulus configurations (e.g., Anstis, 2000; Katz, 1911; Kiesow, 1920; Metzger, 1932; Pinna, Spillmann, & Ehrenstein, 2002), both under binocular and monocular viewing conditions. Thus, lustrous appearances are not tied to the kind of binocular conflict apparently underlying Helmholtz' stereoscopic lustre. This has also been witnessed by painters' achievements, notably in Dutch Renaissance art (Gombrich, 1976), to evoke lustrous appearances on a canvas (attempts which exhibit interesting similarities to today's rendering problems).

In the classical literature, several studies can be found that attempted to identify critical image parameters for lustrous appearances (Bühler, 1861; Dove, 1850; Helmholtz, 1856, 1867; Kirschmann, 1895; Oppel, 1854; Wundt, 1862; see Harrison, 1945, for a summary of the classical German literature). These studies had already shown that the kind of image information that is exploited by the internal data types in which material qualities figure as an internal attribute is exceedingly variegated and idiosyncratic with respect to the possible physical generation processes, and thus cannot be understood without taking internal structural regularities into account.

1.2. Explanatory frameworks I: Lustrous appearances as a resolution of a neural conflict

Corresponding attempts to explain lustrous appearances by rather simple and allegedly “low-level” neural principles date back to Dove (1850), who attributed the phenomenon of stereoscopic lustre to a luminance discrepancy of the intensity signals between the two eyes. He also attempted to explain monocular lustre in similar terms by his (rather obscure) “accommodation theory,” according to which lustrous appearances were due to an optical accommodation effect caused by the (typically) laminar physical organisation of lustrous surfaces (cf. Rood, 1861). Brewster (1861) objected to the idea that stereoscopic lustre mirrored physical regularities of surfaces and regarded it as an entirely idiosyncratic physiological side effect due to a kind of neural conflict resolution in a situation of binocular rivalry.

Perceptual phenomena that can be described in terms of rivalry are most commonly understood as resulting from some kind of neural conflict (e.g., Anstis, 2000; Burr, Ross, & Morrone, 1986; Pinna, Spillmann, & Ehrenstein, 2002). In their wake, a variety of parametric conditions for stereoscopic lustre have been experimentally identified (e.g., Ludwig, Pieper, & Lachnit, 2007; Pieper & Ludwig, 1999, 2002). According to these accounts, the visual system cannot merge the different luminances or luminance polarities from the two eyes into a stable percept. Instead of resolving the conflict between the two input values by some sort of averaging or by a rivalry in which the appearance alternates in a random fashion between the two signals, the visual system adopts, on such an account, a new type of resolution in-between pure rivalry and homogeneous fusion, which then is perceived as lustre.

Intuitions referring to neural conflicts have, of course, little explanatory value unless the neural codes with respect to which a conflict is postulated are specified. In the case of binocular rivalry, such specifications usually refer to monocular luminance-based codes. The available empirical evidence, however, speaks strongly against neural conflict models of binocular rivalry that are based on simple functions of luminance, and rather suggests “that rivalry was occurring at a more abstract level of image representation than direct monocular signals from the two eyes” (Lehky, 2011). Binocular rivalry seems predominantly to be dissociated from eye-of-origin information and to occur at a level of operations that pertain to “perceptual objects” (for different types of relevant empirical evidence, see e.g., Kiesow, 1920; Kovacs, Papathomas, Yang, & Feher, 1996; Logothetis, Leopold, & Sheinberg, 1996; Ooi & He, 2006; Shimojo & Nakayama, 1990). Furthermore, there is rich empirical evidence indicating that image luminances are exploited in terms of intimately coupled internal data types pertaining to spatial surface attributes and reflection-dependent surface attributes (e.g., Boyaci, Maloney, & Hersh, 2003; Fleming, Torralba, & Adelson, 2004; Ho, Landy, & Maloney, 2008; Turhan, 1937).

1.3. Explanatory frameworks II: Lustrous appearances as reflecting physical regularities

Helmholtz (1867), following Oppel (1854), went beyond ad hoc accounts of lustrous appearances in terms of simple luminance-based neural conflict accounts. He tied the phenomenon to specific physical regularities concerning the relation of surfaces oriented in 3D-space, illumination and observer.

Lustrous appearances, which perceptually vary on some matte-to-glossy dimension, can physically be related to surfaces whose Bidirectional Reflectance Distribution Functions (BRDF) are intermediary cases between a perfectly matte and an ideal specularly reflecting surface. In contrast to a diffuse matte surface, the amount of light reflected from a lustrous surface to the eye(s) of an observer also depends on the viewing direction. Therefore, the two eyes receive, as a rule, different light intensities from a given point of the surface (Figure 2). Furthermore, small changes in observer position or surface orientation can yield large changes in differential binocular input (cf. Beck, 1972; Evans, 1948, p. 170). According to Oppel (1854, p. 54) and Helmholtz (1867, p. 783), this physical regularity provides the basis for “unconscious inferences” of the attribute of lustre.

Figure 2. Lustrous surfaces, as specified by the BRDF, yield different intensities in the two eyes of an observer.
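The dependence of binocular input on viewing direction can be illustrated numerically. The following is a minimal sketch only: it swaps in a Phong-style reflection model as a stand-in for a BRDF lying between the matte and the mirror case, which is not a model used in this article. The eye positions, light direction, coefficients and the function name `reflected_intensity` are all hypothetical values chosen for illustration.

```python
import math

def normalise(v):
    """Scale a 3D vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def reflected_intensity(eye, light_dir, k_diffuse, k_specular, shininess=50.0):
    """Relative intensity sent from a surface point at the origin toward `eye`.

    Assumption: Phong-style model (diffuse term plus specular lobe), used here
    only as an illustrative stand-in for an intermediate BRDF.
    """
    n = (0.0, 0.0, 1.0)                 # surface normal, facing the observer
    l = normalise(light_dir)            # direction from surface toward light
    v = normalise(eye)                  # direction from surface toward the eye
    n_dot_l = dot(n, l)
    diffuse = k_diffuse * max(0.0, n_dot_l)            # view-independent
    r = tuple(2.0 * n_dot_l * nc - lc for nc, lc in zip(n, l))  # mirror direction
    specular = k_specular * max(0.0, dot(r, v)) ** shininess    # view-dependent
    return diffuse + specular

# Hypothetical geometry: eyes 6.5 cm apart, 60 cm in front of the surface point.
LEFT_EYE, RIGHT_EYE = (-3.25, 0.0, 60.0), (3.25, 0.0, 60.0)
LIGHT = (-0.2, 0.0, 1.0)

matte_left = reflected_intensity(LEFT_EYE, LIGHT, k_diffuse=1.0, k_specular=0.0)
matte_right = reflected_intensity(RIGHT_EYE, LIGHT, k_diffuse=1.0, k_specular=0.0)
glossy_left = reflected_intensity(LEFT_EYE, LIGHT, k_diffuse=0.5, k_specular=0.5)
glossy_right = reflected_intensity(RIGHT_EYE, LIGHT, k_diffuse=0.5, k_specular=0.5)
```

Under these assumptions the purely diffuse surface sends the same intensity to both eyes, whereas the surface with a specular component sends more light to the eye lying closer to the mirror direction.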

This Oppel–Helmholtz explanation has, beyond identifying a core physical regularity associated with lustrous surfaces, the advantage that it provides some suggestive structural links to phenomenological regularities associated with lustrous appearances. It is instructive to consider first the corresponding phenomenological regularities and their link to physical regularities for the case of glossy appearances. Because of the reflectance characteristics of a glossy surface—which reflects the incident light to a certain degree in a specular manner—and the distance between both eyes, the positions of the highlights are generally shifted relative to corresponding surface points between the two monocular half-images and thus have a different disparity from that of the surface (Blake & Bülthoff, 1990; Hering, 1879; Kirschmann, 1895; Ruete, 1860; Wundt, 1862; Wendt, Faul, & Mausfeld, 2008).

This highlight disparity regularity, illustrated by Figure 3, apparently has a counterpart in a characteristic aspect of glossy appearances, namely in the phenomenal segmentation of image intensities into two layers one behind the other, one layer pertaining to the level of the reflecting surface, the other to an illumination-dependent component. The latter perceptual component is slightly and somehow indeterminately separated in depth from the first one.

Figure 3. Highlight disparity and differential binocular luminance input. The figure is arranged for uncrossed viewing.

This highlight disparity regularity associated with glossy surfaces obviously bears a close relation to the temporal regularity of the differential binocular input caused by lustrous surfaces. This suggests that the Oppel–Helmholtz account of lustrous appearances could be linked to the phenomenal observation that lustrous appearances, both for monocular and binocular viewing conditions, exhibit a similar kind of phenomenal segmentation into two different (shallow) depth layers. This observation has been extensively reported and discussed in the classical literature on this topic. Bixby (1928, p. 169) emphasised “that the luster lies behind the surface and detached from it” and that “luster is not a surface phenomenon; with the best luster, the surface of the stimulus-object is not seen.” The distinctive feature of lustrous appearances is that a perceptual grasp of an intrinsic surface quality is locally precluded in favour of perceptual aspects of the perceived illumination. This feature serves, by mechanisms that are still poorly understood, as a cue for a global surface attribute, namely lustrous material appearances.

Functionalist computational accounts of lustrous appearances, such as the Oppel–Helmholtz account, are based on internal data types for surface-related perceptual categories and illumination-related perceptual categories. Because of this, they can, in principle, theoretically motivate a separate phenomenal attribute pertaining to an interaction property of these two types of computational objects. Within a neural conflict account, in contrast, it appears entirely unmotivated why a local luminance-related neural conflict should give rise to a perceptual attribute such as lustre, and which kind of differences in the structure of neural activities are mirrored in phenomenal differences and which are not. Furthermore, computational accounts are in principle suited to account for the fact that small variations in image parameters can result in subtle but qualitatively different perceptual qualities of lustre, such as graphitic, metallic, satiny, pearly, silky, velvety, or vitreous. Such differences in perceptual qualities could, within computational frameworks, for instance be accounted for by differences in cue integration mechanisms that regulate the computational segregation of internal surface and illumination attributes.

Helmholtzian approaches more generally tend to focus on computations that are appropriate for a recovery of functionally relevant physical world properties (e.g., by “unconscious inferences,” Bayesian inferences). The abstract data types over which these internal computations are performed are assumed to mirror functionally relevant physical world categories (“isomorphism” of external and internal categories). Such an assumption, however, is more problematic than it might appear from an ordinary perspective. Ample empirical evidence indicates that internal data types (pertaining, e.g., to illumination, surface or shadow) do not match and are not isomorphic to external physical categories. Hence, Helmholtzian approaches suffer from a “type-mismatch” problem (see, inter alia, Ludlow, 2003, p. 150): Internal data types cannot be defined by external physical regularities nor can they inductively be derived from them or from the sensory input.

1.4. Explanatory frameworks III: Lustrous appearances as reflecting internal principles subserving the separation of accidental and essential attributes of surfaces

Hering's intention precisely was to avoid such a type-mismatch problem and to derive constraints on data types from structural regularities of the percept and from investigations of internal causal analyses by which functionally relevant perceptual categories are established. Thus, Hering-type approaches tend to focus on computations that are appropriate for yielding the semantic distinctions and functional achievements that are mirrored in phenomenal categories. The data types over which these computations are performed are defined by certain functionally relevant classes of phenomenally expressed perceptual categories.

While Helmholtzian approaches capture an important physical aspect of the complex triggering conditions for lustrous appearances, they cannot, by themselves, account for a characteristic structural aspect of the corresponding percept, namely its phenomenal depth segmentation. Hering (1879) clearly recognised how intimately colour and its attributes are interwoven with the internal organisation of perceptual space—what Katz later called “marriage of colour and space.” In line with his internalist inclinations, Hering therefore placed his discussion of lustrous appearances entirely within his discussion of the organisation of perceptual space. According to Hering (1879, p. 576), lustrous appearances arise as a consequence of a shallow perceptual depth segmentation of surface qualities by which “essential” and “accidental” brightness- and colour-related qualities of a surface are disentangled. Such a “cleavage of sensation” into shallow depth layers arises, according to Hering, when a “surplus of light” is perceptually associated with the surface. Input situations, by which internally defined permissible ranges for luminance-related parameter values for essential surface attributes would be exceeded, give rise to a separate phenomenal surface attribute, namely lustre. The sensory input pattern is, according to computational Hering-type approaches, internally sliced into perceptual layers, which pertain to conceptual forms of different types, namely abstract data types for surfaces and their attributes, and data types for illuminations and their attributes. The specific interrelation of these types of “perceptual objects” comprises surface attributes, such as lustre, that code relational surface qualities with respect to interactions with light sources.

Although Hering's account of lustrous appearances as part of the organisation of perceptual space remained highly sketchy, it draws attention to the crucial role of internal factors, notably the structure of internal data types and the types of internal causal analyses associated with them. The peculiar triggering conditions for lustrous appearances that have been revealed so far by corresponding experimental investigations cannot be understood without taking these internal factors into account. Hering's internalist perspective complements—rather than being in conflict with—the externalist one of Oppel and Helmholtz and offers fruitful heuristics regarding the structure of internal data types pertaining to surfaces and their attributes.

1.5. Goal of the present study

This study differentially subjects to experimental tests specific proposals pertaining to the first two explanatory frameworks and aims at identifying triggering conditions that support Hering's conjecture. The common ground for differentially testing the above-mentioned explanatory frameworks pertains to lustrous appearances under binocular viewing conditions. We will accordingly confine our investigation to corresponding types of situations. (Note that, by assumption, none of these three frameworks can provide explanations for lustrous appearances in monocular viewing situations. Corresponding statements naturally apply to other perceptual attributes. In the case of depth, for instance, vivid depth impressions can be elicited in monocular viewing situations. Whereas in the case of depth, a great variety of monocular input properties has been identified by which depth impressions can be triggered, much less is currently known about the relevant monocular input properties by which lustrous appearances can be triggered.)

For binocular viewing conditions, all three types of accounts share the idea that situations in which the corresponding points in the two eyes receive different luminances are favourable for yielding lustrous appearances. The accounts of Oppel–Helmholtz and of Hering share the insight that any considerations of binocular luminance differences have to be based on a given internal data type for surfaces. Accordingly, the requirement for lustrous appearances that “each eye of an observer receives different intensities or qualities of light” (Beck, 1972) only makes sense with respect to the qualification that the two intensities are perceptually tied to the same locations of a given surface. On the basis of a corresponding assumption that a specific instance of an internal conceptual form for surface is activated by these stimulus conditions, presumptions of alleged binocular conflicts remain theoretically unmotivated. In the case of Helmholtz' display for stereoscopic lustre (Figure 1), the percept is organised in terms of solid perceptual objects. We can therefore assume that by this display, a common surface representation for the two eyes is activated. In addition, Helmholtz tacitly takes advantage of another property: The two half-images exhibit a luminance contrast reversal with respect to the background. Such a contrast reversal is a much stronger condition than the occurrence of luminance differences between the two eyes that follows from Helmholtz' considerations about physical regularities pertaining to lustrous surfaces. Contrast reversal with respect to a background turned out to be a crucial requirement for lustrous appearances for the stimulus conditions employed by Anstis (2000). Anstis presented test patches of six different luminances on grey surrounds of five luminance levels. He presented pairs of light and dark test patches monocularly by alternating them over time at 16 Hz as well as binocularly by means of a stereoscope.
Using a rating procedure, Anstis found that lustrous appearances were elicited only by pairs of test patches that straddled the surround luminance. In his experiments, the appearance of lustre was strongest when a spatial increment in one eye and a spatial decrement in the other eye were fused in the binocular condition or when a spatial increment was alternated with a spatial decrement in the monocular condition. From this, Anstis (2000, p. 2553) concluded: “Clearly it is the contrast reversal of the spot that makes it appear lustrous, in both the monocular and binocular conditions.”

Since a reversal of contrast polarity of monocular inputs is required neither by the Oppel–Helmholtz-type nor by the Hering-type of explanatory account, our experiments on stereoscopic lustre were specifically designed to investigate whether lustrous appearances indeed require the occurrence of spatial increment–decrement pairings. The data from our Experiment 1 clearly show that a spatial contrast reversal is not necessary for lustrous appearances and constitutes an ancillary rather than a crucial variable. In Experiments 2 and 3, we therefore attempted to identify, for our stereoscopic input configurations, the critical parameters on which lustrous appearances are based. It turned out that lustrous appearances are tied to temporal phases of counter-modulation between the two eyes. This is a novel constraint that does not follow directly from the very general explanatory frameworks of Oppel–Helmholtz and Hering. However, as we will point out in the discussion, it is consonant with them and provides interesting insight into the kind of internal causal analysis by which the visual system disentangles attributes of internal data types pertaining to surfaces and illumination, respectively.

2. Experiments

In order to identify relevant image parameters for lustrous appearances, we employed, in line with the experimental studies mentioned above, highly reduced input configurations that do not contain cues pertaining to texture or specular highlights, and that can be regarded as a kind of minimal stimulus (in an ethological sense) for eliciting lustrous appearances. In the static case, Helmholtz' peculiar stimulus configuration depicted in Figure 1 can be regarded as such a minimal stimulus configuration that is particularly suited for eliciting lustrous appearances. In contrast to Helmholtz, we are particularly interested in dynamic aspects. The displays used in our experiments comprise two binocularly fused half-images, each consisting of a spatially homogeneous test patch whose luminance is temporally modulated. Such stimulus configurations can, in the dynamic case, be regarded as a kind of minimal stimulus for eliciting lustrous appearances.

The two half-images of the centre-surround configurations were presented on a CRT monitor (with a refresh rate of 85 Hz). Observers haploscopically fused these pairs by means of a mirror stereoscope (Screen Scope SA200) to a single test patch seen in a homogeneous surround. The square test patches of both half-images had a side length of 8.6° and were embedded in a common surround of 34° × 17°. All stimuli were achromatic (CIE 1931 chromaticity coordinates: x = 0.299, y = 0.315). While the surround luminance was identical in the two half-images and kept temporally constant, the luminance of the centre patches was independently temporally modulated according to a continuous single-peaked modulation function (in the experiments reported here, a Gaussian was used).

2.1. Experiment 1

2.1.1. Stimuli

In Experiment 1, the temporal intensity courses were temporally shifted between the two eyes by a constant peak separation of 150 msec (Figure 4). The temporal intensity courses of the two test patches were presented for 3,000 msec. The modulation function always peaked at approximately 1,425 msec for the left eye, and at approximately 1,575 msec for the right eye (because of the refresh rate of the monitor, the frames with the highest luminances occurred at time values that slightly differed from these values; for simplicity, however, we will refer to the peaks of the continuous temporal intensity function).

Figure 4. Left: the two monocular temporal modulation functions that were generated according to a Gaussian function. The function that is shown as the left curve was presented to the left eye, the other one to the right eye of the observer. The two Gaussians were temporally shifted with a peak separation of 150 msec. Right: the spatial layout of our haploscopic centre-surround configurations and their appearance at time $t_x$.

The width of the temporal modulation functions (which in the case of the Gaussian can be defined by its standard deviation) was varied in steps of 20 msec from 100 to 400 msec. Three different surround luminances were employed, referred to as the “black,” “grey” and “white” surround condition (with a luminance of 0.1, 42.5 and 85 cd/m², respectively). The luminance of the test patch ranged from 2.0 to 83.0 cd/m², so that for the black (white, respectively) surround condition the fused pairs were always spatial increments (decrements, respectively). This procedure furthermore had the advantage that at least in the “black” surround condition the test patch was always clearly discernible from its surround area.
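The stimulus description above can be made concrete with a small sketch that samples the two monocular luminance courses at the monitor's frame rate. This is an illustration only, not the presentation software used in the experiment: the function and constant names are hypothetical, and two details are assumptions, namely that the Gaussian rises from a 2.0 cd/m² baseline to an 83.0 cd/m² peak and that the first frame starts at t = 0.

```python
import math

REFRESH_HZ = 85.0                          # CRT refresh rate reported in the text
DURATION_S = 3.0                           # one presentation cycle in Experiment 1
L_MIN, L_MAX = 2.0, 83.0                   # test-patch luminance range in cd/m^2
PEAK_LEFT_S, PEAK_RIGHT_S = 1.425, 1.575   # peaks separated by 150 msec

# The 16 widths (standard deviations) from 100 to 400 msec in 20 msec steps.
WIDTHS_S = [round(0.100 + 0.020 * k, 3) for k in range(16)]


def gaussian_luminance(t, peak_s, sigma_s):
    """Luminance in cd/m^2 of a Gaussian pulse on the L_MIN baseline.

    Assumption: the pulse spans the full range from L_MIN to L_MAX.
    """
    return L_MIN + (L_MAX - L_MIN) * math.exp(-0.5 * ((t - peak_s) / sigma_s) ** 2)


def frame_times():
    """Frame onset times for one cycle (assumes frame 0 starts at t = 0)."""
    n_frames = int(round(DURATION_S * REFRESH_HZ))
    return [i / REFRESH_HZ for i in range(n_frames)]


def half_image_courses(sigma_s):
    """Per-frame luminances of the left-eye and right-eye test patch."""
    times = frame_times()
    left = [gaussian_luminance(t, PEAK_LEFT_S, sigma_s) for t in times]
    right = [gaussian_luminance(t, PEAK_RIGHT_S, sigma_s) for t in times]
    return times, left, right
```

Under these assumptions, the brightest frame of each half-image falls slightly off the nominal peak time, which is the discretisation effect mentioned in the description of the stimuli.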

2.1.2. Procedure

Each of the 48 stimulus combinations (16 widths of the temporal Gaussian modulation functions × 3 surround conditions) was repeated 15 times. The entire set of 720 stimuli was presented in random order. The judgemental task of the observers was to indicate whether or not the respective stimulus appeared “lustrous” at any time during the presentation. After each cycle of stimulus presentation, the same stimulus was immediately started anew and the observers were allowed to view as many cycles as they wanted before making a judgement. After observers confirmed their decision by pressing a key, the next stimulus was presented after a 3-sec period of dark adaptation.
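As a check on the trial arithmetic, the design can be written out as a fully crossed, shuffled list. The sketch below is hypothetical: the names are illustrative, and it assumes an unrestricted shuffle of all 720 trials, which is one possible reading of "random order".

```python
import itertools
import random

WIDTHS_MS = list(range(100, 401, 20))                     # 16 Gaussian widths
SURROUNDS = {"black": 0.1, "grey": 42.5, "white": 85.0}   # luminance in cd/m^2
REPETITIONS = 15


def build_trials(seed=None):
    """Return all (width_ms, surround_name) trials in shuffled order.

    Assumption: a single unrestricted shuffle over the whole set.
    """
    rng = random.Random(seed)
    conditions = list(itertools.product(WIDTHS_MS, SURROUNDS))
    trials = conditions * REPETITIONS
    rng.shuffle(trials)
    return trials
```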

2.1.3. Observers

Four observers, all well experienced with psychophysical tasks, took part in our experiments. All had normal or corrected-to-normal visual acuity. One of them was an author of this paper.

2.1.4. Results

The results of the experiment are shown in Figure 5. In each diagram, the relative frequency of perceived lustre is plotted as a function of the width of the temporal Gaussian intensity functions. The red lines indicate the fit of our data by a Gaussian psychometric function.

Figure 5. For all four subjects (rows), the results are shown as psychometric functions, separately displayed for the three surround conditions. Each diagram shows the relative frequencies with which the respective stimulus was perceived as lustrous (as a function of the width of the Gaussian temporal intensity functions).

As the data of Figure 5 clearly indicate, the propensity to perceive lustre strongly depends on the width of the temporal modulation functions, and hence on the speed of the temporal luminance changes. For modulation functions with the smallest width (and thus the steepest slope), test patches were almost always perceived as lustrous. The propensity to perceive lustre then monotonically declines with increasing width of the modulation function. The data also show that lustre is consistently perceived against both the black and the white surround. Its presence thus does not, contrary to Anstis' (2000) claim, depend on a contrast reversal of the luminances with respect to the background. However, the luminance of the surround has a strong effect on the propensity to perceive lustre. The smallest range of stimulus conditions for lustrous appearances is obtained for the black surround, while the grey surround exhibits the largest range, the white surround lying in between. Apparently, the occurrence of spatial increment–decrement pairings between the eyes in the “grey” surround condition increases the tolerance, with respect to lustrous appearances, for shallower gradients of the temporal modulation function. We will address the asymmetry between the black and the white condition in the discussion of Experiment 3.

We want to mention in passing that the qualitative results of Figure 5 do not depend on the parametric form of the modulation function. We obtained similar results with other types of modulation functions (such as a linear zigzag function according to which the luminance alternately increases and decreases by constant differences between a fixed minimum and maximum value). Interestingly, observers furthermore reported subtle phenomenological differences associated with different types of modulation functions, linguistically described as sheen, metallic-silky, or metallic-satiny.

2.2. Experiment 2

Experiment 2 was designed to specifically investigate whether there is a temporal segment of the binocularly combined temporal intensity functions that, if present in isolation, can elicit lustrous appearances. The goal of Experiment 2 thus is to narrow down or to identify image parameters by which lustrous appearances can be triggered.

2.2.1. Stimuli and procedure

The spatial layout of the stimuli employed was the same as in Experiment 1. However, in contrast to the previous experiment, observers did not view, on each trial, full cycles of the modulation functions. Rather, they were shown short time windows of 300 msec duration of the underlying modulation functions (see Figure 6). The width of the two underlying Gaussian modulation functions was kept constant at 150 msec, their peak separation at 200 msec. These parameter values proved to be particularly suited for invoking strong impressions of lustre in previous experiments (Mausfeld & Wendt, 2006). The intensity function for the left eye (red line in Figure 6) peaked at 1.9 sec, the one for the right eye (black line in Figure 6) at 2.1 sec.

Figure 6. Schematic representation of the temporal modulation functions of the two monocular test patches. The red curve displays the modulation function for the left eye and the black curve for the right eye. The brightened area indicates one of the 300 msec time windows presented to the observers. Outside this time window (dark areas in the diagram), the test patches had a constant luminance of 2.0 cd/m².

The luminances of the modulation functions varied between 2.0 and 83.0 cd/m². Each modulation cycle was 4 sec long. We systematically varied, within the entire 4 sec cycle, the temporal position of the 300 msec time window during which the modulated test patch was visible to the observers. The centre of the 300 msec window was varied between temporal locations of 1.55 sec (interval borders of 1.4 and 1.7 sec) and 2.45 sec (interval borders of 2.3 and 2.6 sec) in steps of 50 msec. Outside of this time window, the luminances of the two monocular test patches were set to the luminance of the baseline of the Gaussian modulation functions (2.0 cd/m²). The luminance of the surround was 0.1 cd/m². Each of the 19 conditions was presented 10 times, yielding a total of 190 trials, which were presented in random order. As in Experiment 1, the task of the observers was to judge whether or not the respective stimulus appeared lustrous at any time during the presentation. After each cycle of 4 sec, the same stimulus was immediately presented again, and the observers were allowed to view as many cycles as they wanted before they made their decision. Before the presentation of the next stimulus, a 3-sec period of dark adaptation was presented.
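The windowing procedure can be sketched as follows. The code is an illustration under stated assumptions rather than the original stimulus code: names are hypothetical, the Gaussians are assumed to span the full range from 2.0 to 83.0 cd/m², and the window borders are treated as inclusive.

```python
import math

L_BASE, L_MAX = 2.0, 83.0                 # baseline and peak luminance in cd/m^2
SIGMA_S = 0.150                           # width of both Gaussians
PEAK_LEFT_S, PEAK_RIGHT_S = 1.9, 2.1      # peak separation of 200 msec
WINDOW_S = 0.300                          # duration of the visible segment

# 19 window centres from 1.55 to 2.45 sec in 50 msec steps.
WINDOW_CENTRES_S = [round(1.55 + 0.05 * k, 2) for k in range(19)]


def gaussian_luminance(t, peak_s):
    """Underlying (unwindowed) modulation function in cd/m^2."""
    return L_BASE + (L_MAX - L_BASE) * math.exp(-0.5 * ((t - peak_s) / SIGMA_S) ** 2)


def windowed_luminance(t, centre_s, peak_s):
    """Luminance shown at time t: modulated inside the window, baseline outside.

    Assumption: both window borders belong to the window.
    """
    half = WINDOW_S / 2.0
    if centre_s - half <= t <= centre_s + half:
        return gaussian_luminance(t, peak_s)
    return L_BASE
```

For the window centred at 2.0 sec, `windowed_luminance(2.0, 2.0, PEAK_LEFT_S)` and `windowed_luminance(2.0, 2.0, PEAK_RIGHT_S)` coincide, because the two shifted Gaussians intersect midway between their peaks.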

2.2.3. Results

Figure 7 shows the results of the experiment for four subjects. Each data point represents the relative frequency with which the respective stimulus was perceived as lustrous at any time during the presentation. The abscissa position of each data point represents the temporal mean of the 300 msec time window during which the respective part of the modulation functions was visible. One particular presentation window, extending from 1.85 to 2.15 sec within the virtual modulation cycle of 4 sec, is indicated by the shaded area in the centre of each panel. This time window is centred on the intersection of the two modulation curves at a time position of 2 sec. As can be seen from Figure 7, in this centre presentation window the propensity to perceive the test patch as lustrous is largest (almost 100% for all observers). While presentation windows that are shifted up to positions 50 msec earlier or later than the centre window elicit for three observers the same proportion of lustrous judgements, the appearance of lustre sharply declines for earlier or later presentation windows. Some presentation windows in the initial phase of the cycle also elicit, depending on the observer, a smaller number of lustre judgements.

Figure 7. The diagrams display the data of four subjects. Each data point (blue) represents the relative frequency with which the respective stimulus was perceived as lustrous. The position of the data points on the abscissa represents the temporal mean of the respective time window during which the monocular modulation functions were visible to the observer. Diagrams also show the entire underlying modulation functions (red: left eye; black: right eye).

The data of Figure 7 strongly suggest that, for our viewing conditions, the relevant image parameter for eliciting lustrous appearances is a temporal counter-modulation between the left-eye signal and the right-eye signal. That is, lustrous appearances are strongly tied to precisely that temporal segment of the binocularly combined modulation functions in which one curve follows a temporally decremental path, while the other follows an incremental one such that the two curves intersect. It can also be inferred from Figure 7 that it is counter-modulation rather than the phases of largest differences between the luminances of the two half-images to which lustrous appearances are tied. We will address the potential computational role of this counter-modulation in the discussion.

The results of this experiment would appear theoretically unmotivated within both the luminance-based neural conflict framework and the Oppel–Helmholtz-type framework. It is important to note, however, that the three types of theoretical perspectives discussed in the beginning provide frameworks only and would have to be supplemented with strong additional assumptions in order to allow any quantitative prediction. For want of sufficiently strong and empirically warranted assumptions, we will instead, for didactic purposes only, compare three simple mechanisms which can be related, with different degrees of naturalness, to the three different frameworks. Figure 8 depicts in each column one of the three types of simple mechanisms, each based on its own type of response rule for yielding lustrous appearances. Mechanism A (left) is based on the interocular intensity difference, mechanism B (middle) on the interocular difference in the slope of the temporal intensity change, and mechanism C (right) on counter-modulation. The descriptive models based on these mechanisms are not to be regarded as explanatory models for lustrous appearances, as they do not refer to relevant functional aspects or actual mechanisms of the visual system. Rather, they are only intended to illustrate a simple input-output device that takes as input certain parameters of the temporal luminance profile and yields as output values for lustrous appearances. In order to yield quantitative predictions, input functions and response rules for the three mechanisms have to be specified.

Figure 8. Predictions for positive “lustrous” responses for three simple mechanisms: mechanism A (left, based on the interocular intensity difference), mechanism B (middle, based on the interocular difference in the slope of the temporal intensity change), and mechanism C (right, based on counter-modulation). (See text for details.)

The input is given by the stimuli described above (see Figure 6; hence the input functions are assumed to be continuous and free of noise). Let $I_m$ be the 300 msec time interval with temporal mean $m$ (e.g., $I_{1.6} = [1.45\ \text{sec}, 1.75\ \text{sec}]$), and let $f_l(t)$ and $f_r(t)$ be the temporal intensity functions for the left and right eye, respectively. Correspondingly, let $f_l'(t)$ and $f_r'(t)$ denote the first derivatives of the temporal intensity functions for the left and right eye, respectively. The respective outputs of the three mechanisms can then be defined by the following response rules (which, due to our restrictions on the input function, can be formulated here in a highly simplified way).

Mechanism A (interocular intensity difference): Mechanism A assigns a lustrous appearance to the stimulus in time interval $I_m$ if there exists a time $t$ within $I_m$ for which the following condition holds: $|f_l(t) - f_r(t)| > c$ (where $c$ denotes a critical threshold value).

Mechanism B (interocular slope difference): Mechanism B assigns a lustrous appearance to the stimulus in time interval $I_m$ if there exists a time $t$ within $I_m$ for which the following condition holds: $|f_l'(t) - f_r'(t)| > c$ (where $c$ denotes a critical threshold value).

Mechanism C (counter-modulation): Mechanism C assigns a lustrous appearance to the stimulus in time interval $I_m$ if there exists a time $t$ within $I_m$ for which the following two conditions hold:

(i) $\operatorname{sign}(f_l'(t)) \neq \operatorname{sign}(f_r'(t))$,

(ii) $f_l(t) = f_r(t)$.

On the basis of these response rules and specifications of threshold values, predictions can be derived whether a stimulus, given by a certain time interval of the intensity modulation functions, is assigned a lustrous appearance. These predictions can then be compared to the empirical data obtained by Experiment 2.
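The three response rules can be turned into a small numerical sketch. It is a didactic illustration under stated assumptions, not the procedure used to produce Figure 8: all names are hypothetical, the input functions are the Gaussians of Experiment 2 with an assumed range of 2.0 to 83.0 cd/m², each interval is scanned on a 1 msec grid, thresholds are given as fractions of the maximum of the comparison function over the 4 sec cycle, and equality in rule (ii) is tested up to a small numerical tolerance.

```python
import math

L_BASE, L_MAX = 2.0, 83.0          # assumed luminance range in cd/m^2
SIGMA_S = 0.150
PEAK_L, PEAK_R = 1.9, 2.1          # left-eye and right-eye peak times in sec
WINDOW_S, CYCLE_S = 0.300, 4.0
STEP_S = 0.001                     # grid resolution (assumption)
EPS = 1e-9                         # tolerance for "f_l(t) = f_r(t)"
CENTRES = [round(1.55 + 0.05 * k, 2) for k in range(19)]


def f(t, peak):
    """Temporal intensity function (Gaussian on a constant baseline)."""
    return L_BASE + (L_MAX - L_BASE) * math.exp(-0.5 * ((t - peak) / SIGMA_S) ** 2)


def df(t, peak):
    """Analytic first derivative of f with respect to t."""
    return -(t - peak) / SIGMA_S ** 2 * (f(t, peak) - L_BASE)


def grid(lo, hi):
    """Evenly spaced sample times from lo to hi, both included."""
    n = int(round((hi - lo) / STEP_S))
    return [lo + (hi - lo) * k / n for k in range(n + 1)]


def window(m):
    """Sample times of the interval I_m."""
    return grid(m - WINDOW_S / 2.0, m + WINDOW_S / 2.0)


def intensity_diff(t):
    return abs(f(t, PEAK_L) - f(t, PEAK_R))


def slope_diff(t):
    return abs(df(t, PEAK_L) - df(t, PEAK_R))


MAX_A = max(intensity_diff(t) for t in grid(0.0, CYCLE_S))
MAX_B = max(slope_diff(t) for t in grid(0.0, CYCLE_S))


def mechanism_a(m, fraction):
    """Rule A: |f_l - f_r| exceeds c = fraction * MAX_A somewhere in I_m."""
    return any(intensity_diff(t) > fraction * MAX_A for t in window(m))


def mechanism_b(m, fraction):
    """Rule B: |f_l' - f_r'| exceeds c = fraction * MAX_B somewhere in I_m."""
    return any(slope_diff(t) > fraction * MAX_B for t in window(m))


def mechanism_c(m):
    """Rule C: the curves meet inside I_m while their slopes have opposite sign.

    Opposite sign is tested as a negative product, so zero slopes do not count.
    """
    ts = window(m)
    for t0, t1 in zip(ts, ts[1:]):
        d0 = f(t0, PEAK_L) - f(t0, PEAK_R)
        d1 = f(t1, PEAK_L) - f(t1, PEAK_R)
        if abs(d0) <= EPS:
            t_eq = t0
        elif abs(d1) <= EPS:
            t_eq = t1
        elif d0 * d1 < 0:
            t_eq = 0.5 * (t0 + t1)   # crossing lies between two samples
        else:
            continue
        if df(t_eq, PEAK_L) * df(t_eq, PEAK_R) < 0:
            return True
    return False
```

Under these assumptions, `[m for m in CENTRES if mechanism_c(m)]` contains exactly the interval means whose 300 msec window includes the intersection of the two curves at 2.0 sec, whereas `mechanism_a` and `mechanism_b` respond over a range that widens as the threshold fraction is lowered.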

We will illustrate these predictions by means of two further specifications. Firstly, we will select a stimulus defined by the input functions in the 300 msec interval with a temporal mean of 1.6 sec in order to exemplify the relation between the function for comparing left- and right-eye input underlying each mechanism (top row of Figure 8) and the predictions of that mechanism (bottom row). This interval is denoted by $I_{1.6}$ and marked by dashed vertical lines in the top row of Figure 8. The predictions for this interval are indicated by a dashed vertical line at abscissa value 1.6 in the bottom row of Figure 8. Secondly, we will, in the case of mechanisms A and B, derive predictions for three different values for the critical threshold $c$. These values, which are again chosen for illustration only, are denoted by $c_{20}$, $c_{50}$ and $c_{80}$, and correspond, respectively, to 20%, 50% and 80% of the maximal value of the underlying parametric function for comparing left- and right-eye input (i.e. $|f_l(t) - f_r(t)|$ in the case of mechanism A and $|f_l'(t) - f_r'(t)|$ in the case of mechanism B). In the top row of Figure 8, these threshold values are represented by the green, yellow and red horizontal line, respectively.

On the basis of the response rules and the specification of threshold values, predictions can be made regarding the time intervals that give rise to lustrous appearances. The bottom row of Figure 8 shows these predictions for the three mechanisms (left: A, middle: B, right: C). Each data point (filled circle) marks the response resulting from one of the 19 time intervals of 300 msec stimulation used in Experiment 2, where the position of the data point on the abscissa represents the temporal mean of the respective time interval (points are connected by coloured lines for visual convenience only). For mechanisms A and B, each of the three threshold values depicted in the top row yields an associated response function (coloured accordingly). To make these three response functions visually distinguishable, they have been slightly shifted vertically so that they do not locally coincide.

As an example, we can now look at the predictions made for the 300 msec interval with a temporal mean of 1.6 sec (i.e. $I_{1.6}$). For the case of mechanism A (first column), the parametric function for comparing left- and right-eye input (top row, black curve) exceeds, within this interval $I_{1.6}$, the threshold values $c_{20}$ (green line) and $c_{50}$ (yellow line), but not $c_{80}$ (red line). Therefore, the response for lustrous appearance (bottom row) plotted at time 1.6 sec is “Yes” for the green and yellow response function, but “No” for the red one. In the case of mechanism B (second column), the parametric function for comparing left- and right-eye input (black curve) exceeds, within $I_{1.6}$, only the threshold value $c_{20}$ (green line) and therefore only the green response function (bottom row) represents a lustrous appearance (“Yes”) at time 1.6 sec. Note that mechanism C (third column) predicts that no lustrous appearances are associated with time interval $I_{1.6}$; rather, lustrous appearances are associated with any time interval that includes the moment of counter-modulation at time 2.0 sec at which both response rule conditions are satisfied. This is the case for the seven time intervals with temporal means from 1.85 to 2.15 sec.

By comparing the experimental data of Figure 7 with the bottom row of Figure 8, it can be seen that the empirical results of the four subjects in Experiment 2 conform best with the response function of mechanism C (based on counter-modulation). The response functions of mechanism A (based on the interocular intensity difference) and of mechanism B (based on the interocular difference in the slope of the temporal intensity change) are substantially broader; these mechanisms predict lustrous appearances also for early and late time intervals for which the subjects in Experiment 2 do not or only rarely report lustrous appearances. This is the case even if a high (50%) or an implausibly extreme (80%) threshold criterion is assumed. The comparative quantitative analysis on the basis of simple descriptive models for three types of mechanisms speaks in favour of the idea that, for our viewing situations, counter-modulation is the relevant parameter for eliciting lustrous appearances.

2.3. Experiment 3

In Experiment 2, counter-modulation was identified, for our stimulus configurations, as a relevant input parameter for eliciting an impression of lustre. Counter-modulation refers to pairing a temporal increment in one eye with a temporal decrement in the other. This, however, leaves the modulation functions into which counter-modulations are embedded unspecified. In particular, the entrance and exit levels into which the counter-modulation phase is embedded are unspecified. In Experiment 2, the counter-modulation phase was embedded into modulation curves that have their baselines at the lowest luminances. Accordingly, in each 300 msec presentation window (including those that contain the counter-modulation segment), the temporal luminance changes of the test patches are incremental, with respect to the modulation baseline, for both half-images. Now, almost the same course of counter-modulation could be obtained by two modulation curves that have their baselines at the highest luminances (see Figure 9). In that case, in each presentation window the luminance changes of the test patches would be decremental, with respect to the modulation baseline, for both half-images.

Figure 9. The same course of counter-modulation can result from pairs of modulation functions that have their baselines at the lowest or the highest luminances, respectively. For modulation functions with baseline at lowest luminances, the counter-modulation phase starts with an increment with respect to the baseline; for modulation functions with baseline at highest luminances, the counter-modulation phase starts with a decrement.

Experiment 3 was designed to test whether counter-modulation by itself can be regarded as the relevant input parameter for lustrous appearances, independently of the values of the levels with respect to which counter-modulations are defined. For this purpose, we designed stimulus configurations in which exactly identical phases of counter-modulation are embedded into pairs of modulation functions whose baselines are systematically varied between lowest and highest luminances.

2.3.1. Stimuli

We used the same stimulus configurations as in the previous experiments. However, the 3,000 msec modulation function of Experiment 3 (see Figure 10) was not simply a Gaussian function but rather consisted of three segments. The inner 150 msec counter-modulation segment was kept identical for all modulation functions: For the left eye, it consisted of a linearly increasing modulation function which started at 25% and ended at 75% of the maximum available luminance of our monitor (85.0 cd/m²). For the right eye, it similarly consisted of a linearly decreasing modulation function which started at 75% and ended at 25% of the maximum luminance.

Figure 10. Three examples of pairs of monocular temporal modulation functions as used in the experiment. The figure displays three different levels for the entrance and exit luminances into which the counter-modulation phase is embedded. The middle phase of counter-modulation is exactly identical for all three levels. The continuous lines represent the modulation functions for the left eye, the dashed lines those for the right eye.

The two outer segments of the modulation functions, which define the entrance and exit level, respectively, consisted of constant modulation functions (of a temporal duration of approximately 800 msec). In order to ensure a smooth transition between the constant outer segments and the counter-modulation segment, we had to interpolate the modulation function between the inner points of the two constant outer segments and the entrance and exit points of the counter-modulation phase. In order to connect these points, we used segments of 625 msec of that Gaussian function with a temporal width of 150 msec that yields an entire modulation function that is closest to a Gaussian function.

We omit the technical details for the specification of this function because the specific parametric form of this interpolation segment is, in the context of this experiment, theoretically arbitrary.

In our experiment, we employed 10 different luminance values for the entrance and exit levels, which ranged, in steps of 10%, between 5% and 95% of the maximum luminance of our monitor (85.0 cd/m²). Additionally, we used two different luminances for the surround of our stimuli, referred to as the “black” (0.1 cd/m²) and the “white” (85.0 cd/m²) surround condition.
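The segment structure of these modulation functions can be sketched as a piecewise function. The sketch is hypothetical in several respects: the names are illustrative, the constant outer segments are taken to last exactly 800 msec, and, since the parametric form of the interpolation segments is not given in the text, a raised-cosine ramp is used purely as a placeholder. Only the linear 150 msec counter-modulation segment follows the description directly.

```python
import math

L_MONITOR_MAX = 85.0                           # maximum luminance in cd/m^2
T_CONST, T_RAMP, T_COUNTER = 0.800, 0.625, 0.150
T1 = T_CONST                                   # end of the constant entrance segment
T2 = T1 + T_RAMP                               # start of the counter-modulation
T3 = T2 + T_COUNTER                            # end of the counter-modulation
T4 = T3 + T_RAMP                               # start of the constant exit segment

# 10 entrance/exit levels from 5% to 95% of the maximum luminance.
ENTRANCE_LEVELS = [round(0.05 + 0.10 * k, 2) for k in range(10)]


def ramp(a, b, x):
    """Placeholder interpolation from a to b for x in [0, 1].

    Assumption: a raised cosine stands in for the unspecified
    Gaussian-based interpolation segment of the original stimuli.
    """
    return a + (b - a) * 0.5 * (1.0 - math.cos(math.pi * x))


def modulation(t, level, eye):
    """Luminance in cd/m^2 at time t for entrance/exit level `level` (0..1)."""
    start, end = (0.25, 0.75) if eye == "left" else (0.75, 0.25)
    if t < T1 or t >= T4:
        frac = level                                   # constant outer segments
    elif t < T2:
        frac = ramp(level, start, (t - T1) / T_RAMP)   # entrance interpolation
    elif t < T3:
        frac = start + (end - start) * (t - T2) / T_COUNTER  # linear counter-modulation
    else:
        frac = ramp(end, level, (t - T3) / T_RAMP)     # exit interpolation
    return frac * L_MONITOR_MAX
```

Within the counter-modulation segment the returned luminance does not depend on `level`, which mirrors the design requirement that this phase be identical across all entrance and exit levels. The levels strictly between 25% and 75% of the maximum luminance are 0.35, 0.45, 0.55 and 0.65.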

2.3.2. Procedure

The entire set of 200 stimuli, consisting of 10 replications for each of the 20 stimulus combinations (10 different baseline positions × 2 surround conditions), was presented in random order. The task of the observers was to rate the amount to which the respective stimulus appeared “lustrous” at any time during the presentation. The observers made their judgements by adjusting the length of a continuous horizontally oriented bar presented below the centre patches such that its relative length indicated the amount of perceived lustre. As in the first experiment, the same stimulus was immediately repeated after each cycle of 3 sec, and the observers were allowed to view as many cycles as they wanted before making a judgement. After observers confirmed their decision by pressing a key, the next stimulus was presented after a 3-sec period of dark adaptation.

2.3.3. Observers

Two observers, both highly experienced with psychophysical tasks, took part in this experiment. Both had normal or corrected-to-normal visual acuity. One of them was an author of this paper.

2.3.4. Results

The results of the two subjects are shown in Figure 11, separately displayed for the two surround conditions. The relevant data in these displays, indicated as horizontal red bars, are the strengths of the lustrous appearance as a function of the luminance of the entrance/exit levels of the modulation functions. For convenience, these data are attached to the specific modulation functions that produced them.

Figure 11. For each of the two subjects (rows) and surround conditions (i.e. the “black” and “white” surround condition, columns), the mean ratings are shown as horizontal red bars as a function of the luminance levels of the entrance/exit phase of the modulation functions, which are also shown in the diagrams. Error bars refer to SEM.

The data of Experiment 3 show that, for identical phases of counter-modulation, there is a strong dependence between the intensity level of the baseline of the temporal modulation functions and the degree of perceived lustre: The amount of perceived lustre decreases monotonically with the luminance of the baseline. Note that there are four levels, namely 0.35, 0.45, 0.55 and 0.65 (expressed as fractions of the maximum luminance), in which a temporal increment–decrement pairing occurs outside the fixed time window of the counter-modulation. Interestingly, the amount of perceived lustre is quite low, both for the black and for the white surround.

According to the data of Experiment 3, counter-modulation is only exploited as a cue for lustre in the context of a dynamic situation in which the initial luminance modulation is temporally incremental for both eyes (before entering the phase of counter-modulation). This asymmetry of the role of counter-modulation with respect to a light and a dark level into which it is embedded can be tentatively related to an ecological situation by which it could be causally generated, namely binocularly viewing a rotating lustrous surface that is illuminated from a fixed direction (like a dynamic version of the situation depicted in Figure 2). In such a situation, the specular part of the reflectance characteristics of a lustrous surface (which physically lies between a diffuse matte and a perfectly specular one) is visually effective with respect to an observer only for a small segment of the entire range of viewing angles. This specular part of the reflectance characteristics is incremental in luminance compared to the diffuse part that is visually effective for the remaining range of viewing angles. A continuous sequence of viewing angles would yield (when considering a small aperture within the field of vision) a temporal pattern of binocular luminances with a short lightening-up with respect to a darker baseline, very similar to the modulation functions with dark entrance/exit levels for which the amount of perceived lustre is highest in Experiment 3.

3. Discussion

The perception of material qualities poses an important challenge for perception theory. Corresponding phenomena are particularly suited for exposing the explanatory gap between the information available in the sensory input, on the one hand, and the meaningful categories that characterise the output of the perceptual system, on the other hand. They accordingly direct our theoretical focus to the kind of internal structure that we have to assume in order to account for the explanatory gap. We are obviously endowed with a specific perceptual capacity by which we can visually attain aspects of perceptual objects that pertain to their “hidden” dispositional powers and propensities. Due to this capacity, we can visually grasp an abundant variety of properties of objects that go far beyond purely visual attributes. The capacity to visually attain dispositional properties of objects is part of our more general perceptual capacity for making causal assignments and for embedding all of our experiences into various kinds of internal causal analyses. The principles underlying the perception of material qualities will thus not only mirror physical regularities pertaining to surfaces and their relations with light. Rather, they will as much be determined by the structure of the conceptual forms or abstract data types of the perceptual system, and by the types of causal analyses that are attached to each data type (see Mausfeld, 2010a, for a more detailed account). In order to gain insights into the structure of the conceptual forms in which material appearances figure as an internal attribute, we have to identify the complex triggering conditions by which they can be evoked. The more idiosyncratic with respect to external physical regularities the triggering conditions turn out to be, the more likely will they shed some light on the structure of the conceptual forms or abstract data types to which these attributes belong.

By our experiments, we wanted to assess the theoretical fruitfulness of the explanatory frameworks mentioned in the introduction. To this end, we investigated, for the case of stereoscopic lustre, diagnostically suitable input properties that are used by the visual system as a physical triggering basis for eliciting the surface attribute of lustre. Our results show that—contrary to Anstis' (2000) findings—stereoscopic lustre cannot simply be accounted for by a binocular luminance conflict.

Furthermore, our data also speak against a simple Helmholtzian type of explanation in terms of an internalised physical regularity with respect to a differential binocular input. Although the type of triggering conditions that proved to be crucial in our experiments, viz. counter-modulation, is consonant with the kind of physical considerations that underlie a Helmholtzian approach, it cannot be derived from it. Unless augmented by strong additional assumptions, a Helmholtzian approach does not sufficiently constrain the kind of diagnostic triggering properties actually used by the visual system. It cannot explain or theoretically motivate the selection of precisely those physical invariants that yield counter-modulation in certain viewing situations, from the infinitely many potential mathematical invariants that can be associated with such situations. In contrast, within a Hering-type framework, counter-modulations appear as a computational feature that is particularly suited for certain internal causal analyses pertaining to the segregation of accidental and essential properties of surfaces.

From a computational perspective, luminance changes that differ between the eyes in magnitude only but have the same direction are, in situations as depicted in Figure 2, less useful as diagnostic properties on which a segregation of accidental and essential attributes of surfaces can be based. If corresponding mechanisms were based on such quantitative changes alone, they would tend to be less robust and stable. In contrast, qualitative dynamic features, such as the sign changes in the case of counter-modulation, provide more stable and reliable diagnostic properties for the corresponding computational task. By taking advantage of the binocular counter-modulation of luminance, the visual system could, in suitable viewing situations, parse the input into two surface-related causal components, an accidental relational surface quality pertaining to an interaction with a light source, and an essential property pertaining to intrinsic surface attributes.

The physical triggering basis for lustrous appearances consists, in our situations, of structural features of specific segments of the binocular temporal modulation curves. Interestingly, we can find instructive similarities with respect to the auditory perception of material qualities. Corresponding experiments (e.g., Klatzky, Pai, & Krotkov, 2000; Carello, Wagman, & Turvey, 2005) show that in this domain, too, relatively elementary features act as a triggering basis for assigning complex material properties to the “auditory objects” involved (such as the attack portion of the signal for the perceived hardness of the objects; Freed, 1990). Because perceptual material qualities, both in the auditory and the visual domain, are intrinsically transmodal in character, the perception of material qualities will likely rest on deeper and more abstract computational principles, whose elucidation promises to shed more theoretical light on core principles of perception.

Our results call attention to the complex internal constraints underlying the visual system's computations. Perceptual psychology has predominantly focused on external constraints that can be derived from analyses of relevant physical regularities and has gained, from corresponding empirical and theoretical investigations, a wealth of interesting findings. Much less, by contrast, is known about the internal constraints that result from the structural properties of the conceptual forms or abstract data types on which any kind of information processing is by definition based and which determine the conceptual apparatus and the semantic categories of the visual system. These data types cannot simply be derived from “corresponding” physical categories. Rather, their structure has to be identified by appropriate experimental investigations. Such investigations, however, are impeded by the naïve realist convictions that are deeply built into our commonsense conception of perception. Because of such convictions, we are inclined to identify input and output categories, and hence to erroneously use categories of the output of the perceptual system for a description of its input. In fact, however, the perceptual objects and their attributes that constitute the output of the perceptual system can only be expressed by a logical language that is strictly more powerful than the logical language by which the physico-geometric properties of the input can be expressed. Consequently, the internal structure underlying perceptual organisation in terms of perceptual objects such as surfaces and their associated attributes, such as lustre, cannot be derived, by whatever kind of general inductive machinery, from the sensory input. The structural form of the data types on which perceptual computations are based does not simply mirror and is not solely moulded by external physical regularities.
Rather, the form of internal data types is essentially co-determined by constraints that derive from internal functional and computational requirements (Mausfeld, 2010a, 2013). The conceptual forms or abstract data types of the perceptual system must not only be adequate with respect to the external physical world (however one understands such a requirement), they must also be computationally adequate in the sense that they have to fit into the entire computational architecture of the perceptual system and subsequent systems that take advantage of it. Not surprisingly then, internal data types for, say, surfaces and their proprietary attributes exhibit coding properties that appear idiosyncratic when solely viewed from considerations of external physical regularities. It therefore remains a core task of perception theory to better understand the internal data types by which perceptual objects, such as surfaces, and their attributes are coded.

Acknowledgments

This research was supported by grant MA 1025/10-2 from Deutsche Forschungsgemeinschaft (DFG). We thank Franz Faul and two anonymous reviewers for valuable comments.

Contributor Information

Rainer Mausfeld, Department of Psychology, Christian-Albrechts-University Kiel, 24098 Kiel, Germany; e-mail: mausfeld@psychologie.uni-kiel.de.

Gunnar Wendt, Department of Psychology, Christian-Albrechts-University Kiel, 24098 Kiel, Germany; e-mail: gunwendt@psychologie.uni-kiel.de.

Jürgen Golz, Department of Psychology, Christian-Albrechts-University Kiel, 24098 Kiel, Germany; e-mail: golz@psychologie.uni-kiel.de.

References