Can mathematics explain learning?


Have you ever caught a whiff of a particular coffee and, in an instant, been transported back to a specific cafe where you first enjoyed that very aroma? It’s as if the smell alone, without seeing the cup, tasting the coffee, or hearing the cafe’s sounds, was enough to unlock an entire memory. This fascinating phenomenon highlights the way our brains retrieve memories through cues, aligning closely with ideas proposed by the psychologist William K. Estes in his stimulus sampling theory.

Clark Hull and other learning theorists of the mid-20th century viewed learning as a process of forming direct stimulus-response associations. Take, for example, a pigeon trained to peck at a yellow light to receive food. According to Hull, this training would create a direct bond between seeing the yellow light (stimulus) and performing the pecking action (response). Over time, whenever the yellow light appeared, the pigeon would automatically peck for food.

Estes challenged this view, suggesting that what we consider a single stimulus like a yellow light is actually composed of many smaller elements. When a pigeon sees a yellow light, it does not perceive it as a singular, uniform entity but as a collection of features (e.g., brightness, hue, location). Estes proposed that during each exposure to the yellow light, only a random subset of these features is noticed or “sampled.” With repeated training, enough elements become linked to the response, increasing the likelihood of the correct behavior.
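Estes' sampling idea is simple enough to simulate. The sketch below is a toy model, not code from the paper: the function name and parameters are illustrative, and it assumes that on each reinforced trial every sampled element becomes linked to the response. The fraction of linked ("conditioned") elements then serves as the model's response probability.

```python
import random

def simulate_sampling(total_elements=100, sample_size=10, trials=50, seed=0):
    """Toy simulation of stimulus sampling: each trial samples a random
    subset of stimulus elements, and reinforcement conditions every
    sampled element. Returns the fraction of conditioned elements after
    each trial, which the theory treats as the response probability."""
    rng = random.Random(seed)
    conditioned = set()
    curve = []
    for _ in range(trials):
        sampled = rng.sample(range(total_elements), sample_size)
        conditioned.update(sampled)  # reinforcement links the sampled elements
        curve.append(len(conditioned) / total_elements)
    return curve

curve = simulate_sampling()
print(f"after  1 trial:  p = {curve[0]:.2f}")
print(f"after 10 trials: p = {curve[9]:.2f}")
print(f"after 50 trials: p = {curve[-1]:.2f}")
```

Running this shows exactly the pattern Estes predicted: early trials condition many new elements, while later trials mostly resample elements that are already linked, so gains slow down.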

In his 1950 paper, "Toward a Statistical Theory of Learning," Estes introduced a probabilistic approach to learning. The probability equation in Estes' paper is:

p = 1 - (1 - p_0) * e^(-qT)

This equation describes how the probability of a response changes over the course of learning trials. Let's break down its components:


p: The probability of the response occurring at any given time

p_0: The initial probability of the response before learning begins

q: The ratio of sampled stimulus elements to total stimulus elements (s/S)

T: The number of learning trials that have occurred

e: The mathematical constant (approximately 2.71828)


The equation shows that the probability of the response (p) starts at the initial value (p_0) and approaches 1 as the number of trials (T) increases. The rate of this increase is determined by q, which represents how much of the stimulus environment is sampled on each trial.

Key features of this equation:

  1. As T increases, the term e^(-qT) becomes smaller, causing p to increase.

  2. The rate of increase is faster when q is larger, meaning more of the stimulus environment is sampled per trial.

  3. If p_0 is very small (close to 0), the equation simplifies to p ≈ 1 - e^(-qT).

  4. As T approaches infinity, p approaches 1, regardless of the initial probability.

This equation captures the negatively accelerated learning curve often observed in experimental data, where learning is rapid at first and then slows down as performance approaches its maximum.
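The curve can be computed directly from the equation above. In this minimal sketch, the values of p_0 and q are chosen arbitrarily for illustration:

```python
import math

def estes_p(T, p0=0.05, q=0.1):
    """Response probability after T trials under the equation
    p = 1 - (1 - p0) * e^(-qT)."""
    return 1 - (1 - p0) * math.exp(-q * T)

for T in (0, 5, 10, 20, 40, 80):
    print(f"T = {T:3d}  p = {estes_p(T):.3f}")
```

At T = 0 the formula returns p_0, and the printed values rise quickly at first and then level off toward 1, which is the negatively accelerated shape described above.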

Sadly, despite its intuitive appeal and mathematical rigor, interest in purely behavioural and statistical models like stimulus sampling theory waned as cognitive learning theories emphasizing concepts like schemas and mental models emerged.

References

Estes, W. K. (1950). Toward a statistical theory of learning. Psychological Review, 57(2), 94–107. https://doi.org/10.1037/h0058559
