The N-Back task has played a central role in cognitive training research over the past 20 years, particularly in the "brain training" industry. Interest took off with a study published by Jaeggi et al. in 2008, which suggested that N-Back training could improve fluid intelligence (a major aspect of IQ) and raised great expectations in both the scientific community and the general public. This finding fueled the rise of a massive "brain training" industry, but subsequent rigorous replication studies, and meta-analyses in particular, produced skeptical results. Eventually, authoritative institutions such as the Stanford Center on Longevity issued statements declaring that "there is no scientific consensus that brain training broadly improves cognitive function," leaving general users confused and asking: "Is N-Back training ultimately ineffective too?"
This review aims to answer this question from multiple perspectives based on scientific evidence. It rejects simple dichotomies and clearly distinguishes between the reliable aspects of N-Back training effects and their limitations.
The N-Back task is designed to place high demands on working memory. Participants must remember continuously presented stimuli and judge whether the current stimulus matches the one presented "N steps back." This simple rule encapsulates demands on the major components of working memory.
In particular, the Dual N-Back, in which visual and auditory stimuli are processed simultaneously, also places strong demands on Divided Attention in addition to these elements. This cognitive complexity is why it has been adopted in many studies.
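The matching rule described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the procedure of any particular study: the function names `nback_targets` and `dual_nback_targets` and the example sequences are hypothetical, and the dual version simply assumes that the visual and auditory streams are judged independently on each trial.

```python
from typing import Hashable, List, Sequence, Tuple


def nback_targets(stimuli: Sequence[Hashable], n: int) -> List[bool]:
    """For each trial, report whether it matches the stimulus n steps back."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # The first n trials have nothing n steps back, so they are never targets.
    return [i >= n and stimuli[i] == stimuli[i - n] for i in range(len(stimuli))]


def dual_nback_targets(
    positions: Sequence[Hashable], sounds: Sequence[Hashable], n: int
) -> List[Tuple[bool, bool]]:
    """Dual N-Back (assumed form): judge the two streams independently per trial."""
    if len(positions) != len(sounds):
        raise ValueError("both streams must have the same number of trials")
    return list(zip(nback_targets(positions, n), nback_targets(sounds, n)))


if __name__ == "__main__":
    letters = ["C", "A", "C", "B", "C", "B"]  # hypothetical 2-back sequence
    print(nback_targets(letters, 2))
    # [False, False, True, False, True, True]
```

Because every trial is both a probe and a future comparison item, the participant has to keep replacing the contents of memory on each step, which is where the demand on working memory comes from.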
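In evaluating the effectiveness of N-Back training, it is essential to clearly define what "effect" means. This review classifies effects into four levels commonly used in the scientific literature: the training effect on the practiced task itself (Level 1), near transfer to untrained working memory tasks (Level 2), far transfer to broader abilities such as fluid intelligence (Level 3), and ecological transfer to everyday functioning (Level 4).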
Much of the criticism that "brain training doesn't work" is directed at Levels 3 and 4—far transfer and ecological transfer—and does not deny all effects of N-Back training. This review will examine each level of these effects to clarify the boundaries of N-Back training effectiveness.
The expectation that N-Back training would improve general and broad cognitive abilities like fluid intelligence, rather than being mere memory training, rested on a clear theoretical rationale and neuroscientific foundations. This section explains the sources of these expectations.
The most compelling rationale for the hypothesis that N-Back training causes far transfer is the Neural Overlap Hypothesis. This hypothesis is based on the idea that performing N-Back tasks and fluid intelligence tasks depends on the same neural networks in the brain.
Functional MRI (fMRI) and other brain imaging studies have shown that both kinds of task strongly activate a distributed set of brain regions known as the Fronto-Parietal Network. The main brain regions that constitute this network are:
According to this hypothesis, if N-Back training strengthens the functional efficiency or connectivity of this fronto-parietal network, its effects should also extend to fluid intelligence tasks that use the same neural circuits.
In fact, changes in the brain (neuroplasticity) brought about by N-Back training have been reported from both functional and structural perspectives.
At the molecular level, working memory function is associated with dopamine transmission efficiency in the prefrontal cortex, and findings have accumulated suggesting that genetic factors such as COMT gene polymorphisms may partly explain individual differences in training effects (so-called "responders" and "non-responders"). This is an important perspective for explaining why the same training produces different effects in different people.
However, two competing hypotheses exist regarding the essence of effects brought about by these brain changes.
Many recent meta-analyses and rigorous studies tend to support the latter, the Strategy Learning Hypothesis. That is, what participants acquire is likely not an across-the-board enhancement of cognitive abilities applicable in all situations, but skills specific to particular tasks. This theoretical conflict is at the heart of the far transfer controversy and is essential for understanding the evolution of evidence detailed in the next section.
The biggest point of contention in N-Back research is whether training improves fluid intelligence (Gf)—the ability to solve new problems and think abstractly—the "far transfer" effect. This section traces the evolution of academic debate from Jaeggi et al.'s initial groundbreaking research to the latest meta-analyses and clarifies the current scientific consensus.
The study by Jaeggi et al. published in 2008 shocked the cognitive science community. They reported that participants who underwent dual N-Back training for several weeks significantly improved their scores on fluid intelligence tests they had not trained on. In particular, the demonstration of a "dose-response relationship"—that more training days led to greater intelligence improvement—was seen as enhancing the reliability of the results and became the catalyst for the brain training boom.
However, subsequent verification revealed several methodological issues lurking in this initial optimism.
From the 2010s onward, multiple meta-analysis studies were conducted to integrate the variability of individual study results and clarify overall trends. The evolution of these studies is a good example of how scientific consensus is formed. As research methodology became more rigorous, observed effect sizes systematically converged toward zero. Early replication studies introduced active control groups that performed simple games to control for expectation effects. Later meta-analyses used statistical methods to correct for publication bias, where positive results are more likely to be published.
| Meta-Analysis Study | Main Result (Transfer to Fluid Intelligence, Gf) | Implication |
| --- | --- | --- |
| Melby-Lervåg & Hulme (2013, 2016) | Effects are very small or not significant. | Evidence for far transfer is lacking. |
| Au et al. (2015) | Small but significant effect (g ≈ 0.24). | Suggested the possibility of far transfer, but was criticized because the choice of control group affected the result. |
| Soveri et al. (2017) | Multilevel analysis finds only a very small effect on Gf. | Reinforces the view that near transfer exists but far transfer is negligible. |
| Sala & Gobet (2019) | Second-order meta-analysis finds the effect on Gf is essentially zero. | After integrating existing meta-analyses and correcting for publication bias and other artifacts, the effect disappears. Concludes that far transfer is absent across the field. |
| Latest Meta-Analysis (2024) | No significant effect on Gf detected, and WM improvement is not associated with IQ changes. | Reconfirms the absence of far transfer. |
As the evolution of these meta-analysis studies shows, academic consensus has shifted from initial optimism to a more skeptical and rigorous view. Based on this analysis, the current scientific consensus can be summarized as follows:
N-Back training does not reliably improve fluid intelligence in healthy adults.
However, this does not mean that N-Back training is completely worthless. While effects on the distant goal of fluid intelligence have not been supported, effects on "closer" domains, namely working memory itself, are well established. The next section looks in detail at this reliable effect, "near transfer."
In contrast to the negative conclusions regarding far transfer, it is widely recognized scientifically that N-Back training has reliable effects on specific cognitive domains, namely working memory itself. This "Near Transfer" is the area where the most robust evidence exists for discussing the effectiveness of N-Back training.
Consistent N-Back training improves performance on other, untrained working memory tasks, but effect sizes need careful interpretation. According to the latest meta-analysis (2024), the magnitude of the effect depends heavily on how similar the test task is to the trained task. For tasks very similar to the trained N-Back task, very large effects of SMD ≈ 1.15 are observed, but these are hard to distinguish from practice effects. For working memory tasks in general whose format differs from N-Back, the effect size is SMD ≈ 0.18, which is statistically significant but modest. This pattern suggests that N-Back training specifically strengthens certain core functions of working memory, particularly the Updating function of constantly replacing information.
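To make the SMD figures above concrete, the following sketch shows one common way a standardized mean difference is computed: Cohen's d with a pooled standard deviation, plus the small-sample correction that yields Hedges' g. It is an assumption that this matches the estimator used in the cited meta-analysis, which may differ (for example, in how pre-test scores are handled). The function names and the gain scores are hypothetical and chosen only so the arithmetic is easy to follow.

```python
import math
from statistics import mean, variance


def cohens_d(treatment, control):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(treatment), len(control)
    pooled_var = (
        (n1 - 1) * variance(treatment) + (n2 - 1) * variance(control)
    ) / (n1 + n2 - 2)
    return (mean(treatment) - mean(control)) / math.sqrt(pooled_var)


def hedges_g(treatment, control):
    """Cohen's d multiplied by the usual small-sample bias correction."""
    n1, n2 = len(treatment), len(control)
    correction = 1 - 3 / (4 * (n1 + n2) - 9)
    return cohens_d(treatment, control) * correction


if __name__ == "__main__":
    # Hypothetical gain scores (post minus pre), not data from any study.
    trained_gains = [2, 4, 6]
    control_gains = [1, 3, 5]
    print(round(cohens_d(trained_gains, control_gains), 2))  # 0.5
    print(round(hedges_g(trained_gains, control_gains), 2))  # 0.4
```

Read this way, an SMD of about 0.18 means the trained group's average sits roughly a fifth of a standard deviation above the control group's, whereas 1.15 is more than a full standard deviation.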
The effects of N-Back training are not limited to information retention and updating. Many studies have also shown improvement in Interference Control: the ability to focus on the correct information during task performance without being misled by distracting information, especially lure stimuli (non-target stimuli that resemble targets). This is related to the ability to stay focused on the task at hand in daily life without being distracted by irrelevant thoughts or external stimuli, and may therefore have practical significance.
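As an illustration of what a lure is, the sketch below labels each trial of a sequence as a target, a lure, or a plain non-target. It assumes one common operational definition, in which a lure is a non-target that matches the stimulus n-1 or n+1 steps back. The source does not specify a definition, so treat this rule, the function name `classify_trials`, and the example sequence as hypothetical.

```python
from typing import Hashable, List, Sequence


def classify_trials(stimuli: Sequence[Hashable], n: int) -> List[str]:
    """Label each trial as 'target', 'lure', or 'non-target' for an n-back task."""
    if n < 1:
        raise ValueError("n must be at least 1")
    labels = []
    for i, current in enumerate(stimuli):
        if i >= n and current == stimuli[i - n]:
            labels.append("target")
        # Assumed lure rule: a match at n-1 or n+1 steps back, but not at n.
        elif any(
            i - k >= 0 and current == stimuli[i - k]
            for k in (n - 1, n + 1)
            if k >= 1
        ):
            labels.append("lure")
        else:
            labels.append("non-target")
    return labels


if __name__ == "__main__":
    sequence = ["B", "A", "A", "C", "A", "D"]  # hypothetical 2-back sequence
    print(classify_trials(sequence, 2))
    # ['non-target', 'non-target', 'lure', 'non-target', 'target', 'non-target']
```

The third trial is the interesting one: the "A" repeats the immediately preceding item, so it feels familiar, yet it does not match the item two steps back. Rejecting it correctly is the kind of response that Interference Control measures are meant to capture.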
Why does near transfer occur while far transfer is less likely? One interesting hypothesis proposed to answer this question is the Gating Model of Transfer. According to this model, near transfer (improvement in working memory ability) acts as a "gate": it must be achieved through training first, as a prerequisite, before far transfer to more distant tasks can occur. One reason far transfer was not consistently observed in past studies may be that many participants had not adapted to the training well enough to pass through this "gate."
In this way, N-Back training has reliable effects in strengthening the specific cognitive function of working memory. The next section explores what this established near transfer effect means in specific clinical applications such as ADHD.
The value of N-Back training should be evaluated not only in terms of IQ improvement in healthy general adults but also from the perspective of application in populations with specific challenges. This section focuses on research targeting people with ADHD (Attention Deficit Hyperactivity Disorder) and older adults, and considers its clinical and applied value from multiple perspectives.
Because one of the core impairments in ADHD is weakness in working memory and executive function, N-Back training has been theoretically expected to be an effective intervention.
Many studies have confirmed that N-Back training improves working memory test performance in children and adults with ADHD. However, when interpreting evidence regarding improvement in ADHD symptoms, it is essential to understand the "blinding paradox," where results differ dramatically depending on who evaluates the effects.
This remarkable discrepancy, where the effect size is reduced by more than half, strongly suggests that much of the observed improvement may reflect the expectations of the people doing the rating (a placebo-like effect) rather than the effect of the treatment itself.
Compared to pharmacotherapy such as stimulant medications, the effect size of N-Back training is limited. Therefore, the general view among experts is that it should be positioned as one of the adjunctive interventions rather than as an alternative to pharmacotherapy. Additionally, given the characteristics of ADHD, continuing monotonous training can be difficult, so "gamification" incorporating rewards and storylines is an essential element for maintaining motivation.
Maintaining cognitive function is an important issue in an aging society, but here too the position of N-Back training needs to be carefully evaluated.
In older adults as well, performance improvement (training effect) through N-Back task practice is clearly observed. However, compared to younger people, the extent to which effects "transfer" to other tasks tends to be reported as more limited.
For the broad goal of cognitive function maintenance in older adults, evidence has been accumulating that Physical Exercise is likely superior to N-Back training. Aerobic exercise and strength training have clear physiological bases, such as promoting the secretion of Brain-Derived Neurotrophic Factor (BDNF), and meta-analyses have shown consistent improvement effects on broad cognitive functions including memory and executive function.
One of the approaches attracting the most attention in recent years is "dual-task training," which combines exercise with cognitive tasks. For example, interventions such as performing N-Back tasks while riding an exercise bike have been reported to produce larger effects than either activity alone, and may be particularly effective for improving executive function and memory.
As we have seen, the effects of N-Back training are greatly influenced not only by the target population but also by study design. This finding provides important implications for correctly evaluating the value of N-Back and drawing conclusions.
This review has examined evidence on the effects of N-Back training from diverse perspectives, including its hierarchical nature, neural mechanisms, and clinical applications. Here, we integrate these findings and draw final conclusions about the true value and limitations of the N-Back task.
What this analysis highlights once again is the wide gap between the exaggerated effects promised by commercial "brain training" (e.g., improved IQ, brain rejuvenation) and the limited effects that have been scientifically demonstrated (e.g., improvement in specific working memory skills). Statements from institutions such as the Stanford Center on Longevity should be understood not as a rejection of the N-Back task itself, but as a warning to consumers against forming excessive expectations, and spending time and money, on the basis of those promises.
The simplistic view that "if it doesn't raise IQ, it's meaningless" misses the essential value of N-Back training. Its value should be redefined not as a "universal intelligence enhancement tool," but as a "specific drill for honing the important skill of working memory." The ability to temporarily hold and manipulate information is the foundation of intellectual productivity in today's knowledge-intensive society. If N-Back training increases mental persistence when processing complex information and resistance to interfering information, it may have beneficial effects in real life even if not directly reflected in IQ scores.
Based on the above analysis, we present the following three recommendations for individuals and professionals considering N-Back training.
The final answer to the question posed at the beginning of this review—"Is N-Back training ineffective too?"—is "A conditional 'No' (there are effects)."
Its effects certainly exist in "near transfer," which improves performance on working memory tasks similar to the trained task. However, the current scientific consensus is that it is not a "magic bullet" that raises general intelligence (IQ) and dramatically improves performance in all everyday situations.
The N-Back task is not a "cure-all for training the brain." It is a "specific drill for making certain cognitive circuits more efficient." This understanding is the first step toward properly utilizing its value.