There are stimuli that capture attention automatically. This guide describes those stimuli (and the underlying neuroscience).
Capturing attention used to be easy.
How it was:
How it is today:
Today, it’s pretty tough.
So I read hundreds of journal articles to answer the question: What captures attention?
Everything in this guide is a stimulus that captures attention automatically.
Why do they capture attention? Because of the following three factors.
We can’t see everything, so we use selective attention: we consciously perceive only a fraction of the stimuli that reach our senses (Moran & Desimone, 1985).
In fact, that’s the mechanism behind subconscious influence. Our eyes perceive more stimuli than we can process. Many stimuli enter our brain without being detected consciously — yet they’re still in our brain. Thus, they influence our perception and behavior.
In order to survive, our ancestors needed to see life-threatening stimuli.
…reproductive potential of individuals, therefore, was predicated on the ability to efficiently locate critically important events in the surroundings. (Öhman, Flykt, & Esteves, 2001, p. 466)
And that’s what happened. Our ancestors developed brain regions that monitored the surrounding environment for critical stimuli:
…there should be systems that incidentally scan the environment for opportunities and dangers; when there are sufficient cues that a more pressing adaptive problem is at hand—an angry antagonist, a stalking predator, a mating opportunity—this should trigger an interrupt circuit on volitional attention (Cosmides & Tooby, 2013, p. 205)
Our brain alerted us whenever it detected a threat.
Today, our brain still alerts us to threats.
But here’s the funny thing. We developed these mechanisms millions of years ago. Stimuli that were considered “life-threatening” are less severe today.
Consider vehicles and animals.
Today, vehicles threaten our survival more than animals. But we’re wired to notice animals more than vehicles.
We are more likely to fear events and situations that provided threats to the survival of our ancestors, such as potentially deadly predators, heights, and wide open spaces, than to fear the most frequently encountered potentially deadly objects in our contemporary environment (Öhman & Mineka, 2001, p. 483)
Humans might detect “vehicle” features thousands of years into the future, but hopefully we’ll be teleporting by then.
Here’s the point: We’re wired to notice stimuli that helped our ancestors survive. Even today. Even with stimuli that seem harmless. If you want to capture attention, you need to display stimuli that mattered to the survival of our ancestors.
Color might be the most salient dimension (Milosavljevic & Cerf, 2008).
In particular, females are more likely to notice red stimuli because ancestral females were foragers. They needed to detect red stimuli among green plants (Regan et al., 2001).
Want people to notice your YouTube video? Look at surrounding thumbnails. Which color are they? Choose a contrasting color that stands out.
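If you want a rough starting point, here is a minimal sketch of that idea. It assumes you have already sampled one dominant RGB color from each surrounding thumbnail (the sampling step is not shown), and it treats "contrasting" as "opposite on the color wheel," which is only one of several ways to define contrast. The function name `contrasting_color` is hypothetical.

```python
import colorsys
import math


def contrasting_color(neighbor_rgbs):
    """Return an RGB color whose hue sits opposite the neighbors' average hue.

    neighbor_rgbs: list of (r, g, b) tuples with values from 0 to 255.
    Assumption: hue contrast is a good-enough stand-in for "stands out".
    """
    # Hue is circular, so average it as a direction on the color wheel.
    # Weighting by saturation lets grey-ish thumbnails count for less.
    x = y = 0.0
    for r, g, b in neighbor_rgbs:
        h, s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        x += s * math.cos(2 * math.pi * h)
        y += s * math.sin(2 * math.pi * h)

    mean_hue = (math.atan2(y, x) / (2 * math.pi)) % 1.0
    opposite_hue = (mean_hue + 0.5) % 1.0  # half a turn around the wheel

    # Use full saturation and brightness so the result is vivid.
    r, g, b = colorsys.hsv_to_rgb(opposite_hue, 1.0, 1.0)
    return round(r * 255), round(g * 255), round(b * 255)


# Mostly red neighbors suggest a cyan thumbnail.
print(contrasting_color([(255, 0, 0), (200, 30, 30)]))
```

Treat the output as a suggestion, not a rule. If the surrounding thumbnails are all grey, the sketch has no hue to work with and simply falls back to cyan.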
We also notice misalignment (Treisman & Gormican, 1988).
Want people to notice your Facebook post? Add white rectangles at the top and bottom so that it appears tilted.
Contrasting size captures attention (Huang & Pashler, 2005), especially with lengths and numbers (Treisman & Gormican, 1988).
Submitting an article to Reddit or Hacker News? Check the length of recent submissions — are they short or long?
- Short? Write a long headline.
- Long? Write a short headline.
A contrasting size will capture more attention.
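The check above can be sketched in a few lines. This assumes you have copied the recent headlines into a list by hand, and the 10-word cutoff between "short" and "long" is an arbitrary assumption of mine, not a number from the research. The function name `suggest_headline_length` is hypothetical.

```python
from statistics import median


def suggest_headline_length(recent_headlines, cutoff_words=10):
    """Suggest 'long' if recent headlines are short, and 'short' if they are long.

    cutoff_words is an assumed threshold; tune it for the site you post on.
    """
    # Use the median word count so one unusual headline doesn't skew the result.
    typical_length = median(len(headline.split()) for headline in recent_headlines)
    return "long" if typical_length < cutoff_words else "short"


recent = ["Show HN: My app", "Rust 1.80 released", "Why SQLite"]
print(suggest_headline_length(recent))  # these are short, so it suggests "long"
```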
Motion onsets are changes from stillness to movement (Abrams & Christ, 2003).
Want website visitors to notice your button? Try adding a motion onset, like pulsing.
Looming motion occurs when stimuli get larger:
…looming objects are more likely than receding objects to require an immediate reaction, we speculated that the potential behavioral urgency of a stimulus might contribute to whether or not it captures attention. (Franconeri & Simons, 2005, p. 962)
Perhaps you could start a video by zooming inward — this looming motion is more likely to capture and sustain attention.
Animate motion is unpredictable motion (Pratt, Radulescu, Guo, & Abrams, 2010). If a predator attacked without warning, we needed to be prepared – otherwise, we died.
Motion doesn’t need literal movement. Images capture more attention when they depict motion (Cian, Krishna, & Elder, 2015).
Designing an app thumbnail? Add motion.
Designing a traffic sign? Add motion.
People can find a “V” faster than a “Λ” shape:
In that study, researchers argued that we’re more likely to notice a V-shape because it resembles the eyebrows of an angry person (Larson, Aronoff, & Stearns, 2007). Supposedly, this ability helped us survive.
Perhaps…but I’m skeptical. It took me dozens of photos to capture the downward shape in my eyebrows above. And there are many theoretical issues with the “universality” of emotions (which is beyond the scope of this guide).
Instead, I suspect a more plausible explanation: motion capacity.
A V-shape can easily move: it balances on a single point, so it tilts from side to side. However, a Λ-shape remains stable. Thus, we’re more likely to notice stimuli that possess the capacity for motion. This ability helped our ancestors survive.
Finally, humans are wired to detect motion of our species (Troje, 2008).
…the right pSTS, revealed an enhanced response to human motion relative to dog motion. This finding demonstrates that the pSTS response is sensitive to the social relevance of a biological motion stimulus. (Kaiser, Shiffrar, & Pelphrey, 2012, p. 1)
Biological motion requires natural body movements. For example, newly hatched chicks prefer natural body movements of a hen, rather than an artificially rotating hen (Vallortigara, Regolin, & Marconato, 2005).
Humans are the same. We notice biological motion:
Images of people activate a designated region of our brain, called the superior temporal sulcus (STS; Allison, Puce, & McCarthy, 2000).
In particular, faces activate the fusiform gyrus (Puce et al., 1996).
In other studies, researchers found that people detect changes in faces more easily than in other objects (e.g., clothes; Ro, Russell, & Lavie, 2001).
However, faces need to be upright (Eastwood, Smilek, & Merikle, 2003). Thanks to the face inversion effect, we’re slower to detect inverted faces (Epstein et al., 2006).
Also, here’s a question. What makes a face…well…a face? When does our brain stop recognizing a face?
Turns out, our brain looks for underlying geometric patterns:
…our first study indicated that the overall geometric configuration provided by the facial features, rather than individual features, was how a culture defined the emotional representation. (Aronoff, 2006, p. 85)
Ironically, schematic faces can be more attention-grabbing than realistic faces because they are built with geometric shapes.
Our brain can also detect the human body:
…a distinct cortical region in humans that responds selectively to images of the human body, as compared with a wide range of control stimuli. This region was found in the lateral occipitotemporal cortex (Downing, Jiang, Shuman, & Kanwisher, 2001, p. 2470)
In one study, blobs captured more attention when they resembled a human body (Downing et al., 2004).
However, we allocate more attention when faces and bodies are present (Bindemann et al., 2010).
Finally, our brain has regions that detect individual body parts (Peelen & Downing, 2007).
For example, researchers found a direct relationship between brain activation and hand realism: Activation was greater with realistic hands (Desimone et al., 1984).
If you want to go viral, you just need cute cats.
Seriously. Our ancestors needed to detect animals for survival:
Information about non-human animals was of critical importance to our foraging ancestors. Non-human animals were predators on humans; food when they strayed close enough to be worth pursuing; dangers when surprised or threatened by virtue of their venom, horns, claws, mass, strength, or propensity to charge (New, Cosmides, & Tooby, 2007, p. 16598)
They developed brain regions that detected animals in their periphery. And modern humans inherited those mechanisms. Today, animals capture attention.
Your brain looks for geometric patterns:
The monitoring system responsible appears to be category driven, that is, it is automatically activated by any target the visual recognition system has categorized as an animal. (Cosmides & Tooby, 2013, p. 206)
As you’ll see later, some animals capture more attention than others.
Eye gaze captures attention automatically.
Sure, it helped us locate objects and people (Emery, 2000). But there’s another reason behind this trait: social dominance.
Every society, including animal societies, has a dominance hierarchy (Chance, 1967). Some creatures are more important than others. In order to survive, our ancestors needed to understand their position in this hierarchy. And they needed to identify the most dominant creature.
How? They relied on social attention.
Everyone in a society looks at the most dominant creature more often.
Ancestors who failed to notice these gazes (and thus identify the most dominant creature) would have picked a fight with the wrong beast. And they died.
In particular, we developed two mechanisms:
- We developed the ability to detect eyes more easily. Gaze following became “hard-wired” in our brain via the superior temporal sulcus, amygdala, and orbitofrontal cortex (Emery, 2000).
- Our eyes became more salient. Indeed, “the physical structure of the eye may have evolved in such a way that eye direction is particularly easy for our visual systems to perceive.” (Langton, Watt, & Bruce, 2000, p. 52)
Bodies can imply the direction of gaze, too.
This effect is additive with eye gaze (Langton & Bruce, 2000). Incorporate as many spatial cues as possible.
Pointing captures attention, too.
Researchers compared multiple gestures. Turns out, an isolated index finger captured the most attention (Ariga & Watanabe, 2009).
What’s important about the index finger?
My guess: It’s the optimal combination of ease and accuracy.
The index finger has only one adjacent finger, so we can extend this finger faster than other fingers.
The pinky also has a single adjacent finger, but the index finger is longer (and thus more accurate). It’s the best finger for pointing.
Fast forward to today: parents teach their kids about the world by pointing. Enough exposures will instill an automatic response. When we see a pointing gesture, we instinctively look.
If that explanation is correct, then other spatial cues (e.g., arrows) should capture attention, too.
Indeed, arrows capture attention as well (Ristic & Kingstone, 2006).
Spatial words capture attention (Hommel, Pratt, Colzato, & Godijn, 2001).
Don’t ask people to submit the yellow form. Some people are colorblind. Instead, ask them to submit the yellow form below the instructions.
Emotion has two dimensions: arousal and valence (Barrett & Russell, 1999):
- Arousal: Degree of activation
- Valence: Degree of pleasantness
High arousal emotions capture attention (Anderson, 2005).
For example, below are random words. Don’t read them. Just mentally say the color of the text:
Turns out, we’re slower to name a color if the word is emotional (e.g., fear) because those words capture more attention (Algom, Chajut, & Lev, 2004).
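If you ever run this task yourself, the slowdown is usually summarized as a simple difference in reaction times. Here is a minimal sketch, assuming you have recorded color-naming times (in milliseconds) for emotional and neutral words. The function name `stroop_interference` and the numbers in the usage line are made up for illustration; they are not data from the cited study.

```python
from statistics import mean


def stroop_interference(emotional_rts_ms, neutral_rts_ms):
    """Mean color-naming time for emotional words minus neutral words, in ms.

    A positive value means people were slower on emotional words,
    which is the pattern you'd expect if those words capture attention.
    """
    return mean(emotional_rts_ms) - mean(neutral_rts_ms)


# Illustrative, invented reaction times (not real data).
print(stroop_interference([700, 720], [650, 670]))
```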
Threats are particularly attention-grabbing. If our brain detects a threat, it triggers a defense mechanism before we consciously notice the threat (Öhman & Mineka, 2001).
And that’s great. Imagine if we stopped to evaluate threats:
Now, your attention system is based on conditions that existed millions of years ago. There’s a reason why so many people are afraid of snakes and other reptiles, even though we rarely see them today:
…the predatory defense system has its evolutionary origin in a prototypical fear of reptiles in early mammals who were targets for predation by the then dominant dinosaurs. (Öhman & Mineka, 2001, p. 486)
Your brain doesn’t detect the snake itself — it detects the curvilinear shape (LoBue, 2014).
Same with spiders:
…the reflexive capture of attention and awareness by spiders does not even require their categorization as animals. Performance was often comparable between identifiable spiders and stimuli which technically conformed to the spider template but that were otherwise categorically ambiguous (rectilinear spiders; New & German, 2015, p. 21)
Our ancestors were more likely to reproduce when they found a mating partner. Sexual stimuli are hard-wired into our attention system (Most et al., 2007).
Taboo words are more attention-grabbing than emotional words (Mathewson, Arnell, & Mansfield, 2008).
Some speakers (e.g., Tony Robbins) sustain the audience’s attention with profanity.
Infants look at novel patterns more than familiar patterns (Fantz, 1964). Why? Our ancestors were more likely to survive if they could detect novel stimuli:
…novel popout would appear to have a great deal of survival value because it would allow organisms to quickly perceive and prepare to deal with novel intrusions into their familiar surroundings. (Johnston et al., 1990, p. 3)
Perhaps you could try the pique technique: Researchers received more money when they asked for an unusual amount (e.g., 37 cents), rather than a standard amount (e.g., 25 cents, 50 cents; Santos, Leve, & Pratkanis, 1994).
Novelty prevents a mindless refusal by forcing people to evaluate the request.
But any novelty can work. Some websites show a popup as people are leaving:
Perhaps you can make it more novel:
You probably experienced the cocktail party effect (Moray, 1959).
You could be engulfed in a conversation. But if someone nearby mentions your name, your attention system slaps you in the face.
…automatic attentional capture ensures that self-related information is not missed and it is effectively encoded when present in one’s nearby environment (Alexopoulos, Muller, Ric, & Marendaz, 2012, p. 777)
Hearing our name activates the medial prefrontal cortex (Perrin et al., 2005). Babies develop that ability at roughly 4.5 months (Mandel, Jusczyk, & Pisoni, 1995).
It also happens with subliminal exposures to our written name (Alexopoulos et al., 2012).
Personalization can be a powerful marketing tool, but be cautious — it can also be creepy:
Participants reported being more likely to notice ads with their photo, holiday destination, and name, but also increasing levels of discomfort with increasing personalization. (Malheiros et al., 2012, p. 1)
Researchers don’t have a name for it. Maybe we could call it the how-the-f*ck-did-they-know-that effect.
Your face is as powerful as your name (Tacikowski & Nowicka, 2010).
A complex bilateral network, involving frontal, parietal and occipital areas, appears to be associated with self-face recognition, with a particularly high implication of the right hemisphere. (Devue & Brédart, 2011, p. 2)
Do you sell clothing online? Perhaps you could create an interactive fitting room. Let users upload their picture to see how the clothing looks on them.
People are more likely to notice stimuli when they don’t have an active goal. Their cognitive load is lower, which leaves spare room for attention (Cartwright-Finch & Lavie, 2007).
For example, shoppers are less likely to notice a banner ad when searching for specific products (Resnick & Albert, 2014). Shoppers who are browsing are more likely to notice advertisements.
To capture attention, advertise in contexts with low cognitive load.
When you search for a blue stimulus, you’re less likely to notice red stimuli (see Baluch & Itti, 2011).
Want people to notice your stimulus? Make it similar to whatever they are monitoring.
Clever advertisers will hire a celebrity for a commercial, and then air this commercial during TV shows that contain this celebrity.
During a commercial break, viewers are subconsciously monitoring for the actor’s voice to determine when the show has returned. Hearing their voice in a commercial will snap their attention to the TV.