Most nursing students encounter the evidence pyramid early — systematic reviews and meta-analyses at the top, expert opinion at the bottom, with RCTs, cohort studies, and case-control studies stacked in between. The pyramid gets memorized, reproduced in a paper's literature review section, and then often forgotten as soon as the actual writing begins. That's a missed opportunity, because levels of evidence aren't just a classification exercise — they're a tool for making decisions about which sources should carry the most weight in your argument, how to frame conflicting findings, and how to justify your capstone's evidence base to a committee that will absolutely notice if your strongest claims rest on your weakest sources. This guide goes through the major evidence hierarchies used in nursing programs, what each level actually means in practice, and — the part that matters most for your writing — how to use levels of evidence as an organizing principle rather than a one-time citation. If you're building a literature review and want help organizing sources by evidence strength, our writers can help structure that synthesis from the start.
Why Levels of Evidence Exist
The core idea behind any evidence hierarchy is straightforward: different study designs are more or less vulnerable to bias, and that vulnerability affects how much you should trust their conclusions. A well-conducted randomized controlled trial controls for confounding variables through randomization in a way that an observational cohort study simply cannot — not because cohort researchers are less careful, but because the study design itself doesn't eliminate the possibility that some other factor (not the intervention) explains the difference between groups. A systematic review that pools multiple RCTs reduces the risk that any single study's quirks (an unusual sample, a site-specific factor) drive the overall conclusion.
This is why evidence hierarchies place review/synthesis designs at the top, experimental designs (RCTs) next, observational designs in the middle, and expert opinion or case reports at the bottom — the ordering reflects how much each design type, on average, controls for alternative explanations of a finding. The qualifier "on average" matters: a poorly conducted RCT can be less trustworthy than a well-conducted cohort study, which is why critical appraisal of individual studies remains necessary even after you've identified a study's evidence level. The level tells you the design's ceiling; appraisal tells you how close to that ceiling a specific study actually gets.
Common Evidence Hierarchy (Adapted from Johns Hopkins & Similar Models)
| Level | Study Types | What It Tells You |
|---|---|---|
| Level I | Systematic reviews and meta-analyses of RCTs; evidence-based clinical practice guidelines based on systematic reviews | Highest confidence — synthesizes multiple experimental studies, reducing the influence of any single study's idiosyncrasies |
| Level II | Well-designed randomized controlled trials (RCTs) | Strong causal evidence — randomization controls for most confounding, though a single trial may have limited generalizability |
| Level III | Quasi-experimental studies (controlled trials without randomization) | Useful causal evidence with a higher risk of confounding than a true RCT, often used when randomization isn't feasible or ethical |
| Level IV | Case-control and cohort studies | Observational evidence — can identify associations and risk factors but cannot establish causation as confidently as experimental designs |
| Level V | Systematic reviews of descriptive and qualitative studies | Synthesizes non-experimental evidence — valuable for understanding experience, context, and "why" questions |
| Level VI | Single descriptive or qualitative studies | Provides depth and context but findings from a single study are not generalizable on their own |
| Level VII | Expert opinion, committee reports, consensus statements without systematic review | Lowest evidence level — useful when little research evidence exists, but reflects judgment rather than data |
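If you track your sources in a spreadsheet or script, the table above can be written as a simple lookup. This is a minimal Python sketch, not part of any official hierarchy: the design labels and the `evidence_level` function name are assumptions made for illustration, and the levels are copied from the table.

```python
# Levels copied from the table above. The design labels are hypothetical
# shorthand chosen for this sketch, not an official vocabulary.
LEVEL_BY_DESIGN = {
    "systematic review of rcts": "I",
    "meta-analysis of rcts": "I",
    "guideline based on systematic reviews": "I",
    "randomized controlled trial": "II",
    "quasi-experimental study": "III",
    "case-control study": "IV",
    "cohort study": "IV",
    "systematic review of descriptive or qualitative studies": "V",
    "single descriptive study": "VI",
    "single qualitative study": "VI",
    "expert opinion": "VII",
    "committee report": "VII",
    "consensus statement": "VII",
}


def evidence_level(design: str) -> str:
    """Return the table's level (e.g. "Level IV") for a study design label."""
    key = design.strip().lower()
    if key not in LEVEL_BY_DESIGN:
        # Designs the table does not list are rejected rather than guessed.
        raise ValueError(f"Design not in the table: {design!r}")
    return f"Level {LEVEL_BY_DESIGN[key]}"


if __name__ == "__main__":
    for design in ("Cohort study", "Randomized controlled trial", "Expert opinion"):
        print(design, "->", evidence_level(design))
```

A lookup like this records only the design's ceiling. It says nothing about how well a particular study was conducted or how relevant it is to your population, so appraisal is still required.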
The Pyramid Isn't the Only Model — and Pyramids Have Limits
The classic pyramid (or the numbered Levels I–VII model shown above, used by Johns Hopkins and several other nursing-specific frameworks) is the version most BSN and MSN programs introduce first, and it's a reasonable starting point. Other models exist and serve slightly different purposes, and they are worth knowing, especially if your program references them.
GRADE
The GRADE approach, common in systematic reviews and clinical guideline development, doesn't just rank study designs — it rates the overall certainty of evidence for a specific outcome (high, moderate, low, very low), taking into account not just design but also risk of bias, inconsistency across studies, indirectness, imprecision, and publication bias. A body of RCT evidence can be downgraded to "low certainty" under GRADE if the trials had serious limitations, while a body of observational evidence can occasionally be upgraded if the effect size is very large. GRADE is less about a single study's design and more about the overall confidence in a conclusion across all available evidence on a specific question.
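The logic of downgrading and upgrading can be illustrated with a deliberately simplified Python sketch. Real GRADE ratings are structured judgments made per outcome, not arithmetic. The function name `grade_certainty`, the one-step-per-domain rule, and the single-step upgrade are all assumptions made for illustration. The starting points (randomized evidence at "high", observational evidence at "low") follow the usual GRADE convention.

```python
from typing import Iterable

CERTAINTY = ["very low", "low", "moderate", "high"]

# The five downgrading considerations named in the paragraph above.
DOWNGRADE_DOMAINS = frozenset({
    "risk of bias",
    "inconsistency",
    "indirectness",
    "imprecision",
    "publication bias",
})


def grade_certainty(
    randomized: bool,
    serious_concerns: Iterable[str] = (),
    very_large_effect: bool = False,
) -> str:
    """Rough certainty rating for a body of evidence on one outcome."""
    concerns = set(serious_concerns)
    unknown = concerns - DOWNGRADE_DOMAINS
    if unknown:
        raise ValueError(f"Unknown domain(s): {sorted(unknown)}")

    # Randomized evidence starts at "high", observational at "low".
    score = 3 if randomized else 1
    # Simplification: one step down per domain with serious concerns.
    score -= len(concerns)
    # Simplification: a very large effect moves observational evidence up one step.
    if very_large_effect and not randomized:
        score += 1
    # Clamp to the four available ratings.
    return CERTAINTY[max(0, min(score, 3))]


if __name__ == "__main__":
    print(grade_certainty(True, {"risk of bias", "imprecision"}))
    print(grade_certainty(False, very_large_effect=True))
```

The first call models a body of RCTs with two serious limitations; the second models observational evidence with a very large effect. Under these assumptions the RCT evidence ends up rated below the observational evidence, which is the point of the paragraph above: design alone does not settle certainty.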
The "5S" or "evolving pyramid" models
Some newer models reorganize the pyramid around how pre-digested the evidence is — raw studies at the bottom, syntheses (systematic reviews) above that, synopses (structured summaries of syntheses) above that, summaries (evidence-based guidelines/textbooks that integrate multiple syntheses) near the top, and systems (point-of-care decision support tools) at the very top. This model is less about study design and more about information-seeking efficiency — the idea being that a busy clinician should look for pre-filtered, synthesized sources before searching for individual studies.
What this means for your writing
If your program specifies a hierarchy (most do), use that one consistently and cite it. If you reference GRADE or another model in passing, briefly note how it relates to the primary hierarchy your program uses, rather than switching between systems without explanation — mixing hierarchies without acknowledgment is a common source of confusion in literature review sections.
Using Levels of Evidence to Organize a Literature Review
- Group your sources by theme first (as covered in literature review guidance generally), not by evidence level — level is a quality dimension within each theme, not the primary organizing structure
- Within each theme, identify which sources represent the highest level of evidence available — if a systematic review exists on your topic, it should generally anchor that theme's discussion
- When higher-level evidence is sparse or absent for your specific topic, say so explicitly — "evidence on this specific intervention in this population is currently limited to Level IV cohort studies" is a legitimate and useful observation, not an admission of failure
- When sources at different levels agree, note that the agreement across designs strengthens confidence — convergent findings from an RCT and a cohort study are more convincing together than either alone
- When sources at different levels disagree, generally weight the higher-level evidence more heavily in your synthesis, but consider whether the lower-level study offers context (population differences, setting) that explains the discrepancy
- In your discussion chapter, connect your own project's evidence level to how your findings should be interpreted and what a logical next step (a larger study, a controlled trial) might look like; most capstone projects are small-scale and often lack a control group, so they sit lower on the hierarchy (Level IV, V, or VI)
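The first three steps in the list above (group by theme, find the strongest source in each theme, flag themes with no higher-level evidence) can be sketched in Python for anyone managing a long source list. Everything named here is hypothetical: the `Source` fields, the `organize` and `gap_note` functions, the placeholder citations, and the choice to treat Levels I and II as "higher-level" are assumptions for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass

ROMAN = ["I", "II", "III", "IV", "V", "VI", "VII"]


@dataclass
class Source:
    citation: str  # placeholder label, not a real reference
    theme: str
    level: int     # 1 = Level I ... 7 = Level VII


def organize(sources):
    """Group by theme first, then order each theme from highest level down."""
    by_theme = defaultdict(list)
    for src in sources:
        by_theme[src.theme].append(src)
    return {
        theme: sorted(items, key=lambda s: s.level)
        for theme, items in by_theme.items()
    }


def gap_note(theme, ordered, threshold=2):
    """Return a sentence flagging a theme with no Level I-II source, else None."""
    best = ordered[0].level  # first item is the highest level available
    if best > threshold:
        return (
            f"Evidence on '{theme}' is currently limited to "
            f"Level {ROMAN[best - 1]} and below."
        )
    return None


if __name__ == "__main__":
    sources = [
        Source("Source A", "theme 1", 4),
        Source("Source B", "theme 1", 1),
        Source("Source C", "theme 2", 6),
        Source("Source D", "theme 2", 4),
    ]
    for theme, ordered in organize(sources).items():
        print(theme, "anchor:", ordered[0].citation, "|", gap_note(theme, ordered))
```

Theme stays the primary structure and level only orders sources within it, which mirrors the first bullet. The sort does not weigh quality or relevance, so the "anchor" it picks is a starting point for your judgment, not a replacement for it.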
Where Capstone Projects Typically Fall — and Why That's Fine
- Most capstone and DNP projects are Level IV (cohort-style pre-post designs), Level VI (descriptive/qualitative), or quality-improvement designs that don't map neatly onto the traditional research hierarchy at all
- This is appropriate and expected — capstones are scoped for feasibility within an academic timeline, not to produce Level I evidence, and committees evaluate them accordingly
- The key writing move is being explicit about this in your discussion chapter — acknowledging your project's evidence level as a limitation shows methodological awareness, rather than letting a committee member point it out first
- Framing your project as contributing to a larger evidence base (e.g., "this project's findings, while limited to a single site and small sample, are consistent with the Level II evidence reviewed in chapter two, and support further investigation at a larger scale") connects your work to the hierarchy productively
- If your project used a stronger design than typical — a controlled comparison, a larger multi-site sample — that's worth highlighting as a relative strength, since it's genuinely less common at the capstone level
Common Confusions: Evidence Level vs. Study Quality vs. Relevance
One of the most persistent sources of confusion is treating "evidence level," "study quality," and "relevance to my topic" as the same thing, when they're three separate dimensions that all factor into how you use a source. A Level I systematic review (high evidence level) might still be of limited relevance if it addresses a population very different from yours. A Level IV cohort study (lower evidence level) might be extremely high quality — large sample, long follow-up, careful confounder adjustment — and highly relevant to your specific population, making it more useful for your literature review than a higher-level but less relevant systematic review.
This is the connection point to critical appraisal: evidence level sets a design's theoretical ceiling for trustworthiness, appraisal evaluates how well a specific study reaches that ceiling, and relevance evaluates whether the study's population, setting, and intervention match your purposes closely enough to matter. A strong literature review weighs all three together rather than defaulting to "use the highest level available" as a mechanical rule. A committee reading your literature review can usually tell when sources were selected purely by evidence level rather than by genuine relevance — the discussion tends to feel disconnected from the project's actual clinical context.
If you're trying to balance these three dimensions across a large source list and it's starting to feel unwieldy, working with a writer who's done this kind of synthesis repeatedly can help you triage quickly — identifying which sources genuinely anchor your argument versus which provide supporting or contextual detail.
Common Mistakes to Avoid
- Citing the evidence pyramid once in the literature review's introduction and then never referencing evidence levels again when discussing individual sources.
- Treating evidence level as the only factor that matters, ignoring study quality (appraisal) and relevance to your specific population or setting.
- Apologizing for or hiding a capstone's own lower evidence level instead of acknowledging it directly as an expected limitation of project-scale work.
- Mixing hierarchy models (e.g., referencing both the numbered Level I–VII system and GRADE) without explaining how they relate, creating confusion for the reader.
- Assuming a systematic review is automatically the strongest source for every topic, even when no systematic review specific to your population or intervention actually exists.
- Organizing a literature review by evidence level as the primary structure, rather than by theme — resulting in a list-like review instead of a synthesized one.
- Failing to note when higher-level evidence on a specific topic simply doesn't exist, instead silently relying on lower-level sources without comment.
- Assuming a Level IV or V source is automatically weak — some of the most relevant, well-executed sources for a capstone's specific population may sit at these levels.
Ready to Start?
Send us your topic and source list, and we'll help you organize your literature review around evidence strength and relevance together — so your strongest claims rest on your strongest sources, and any gaps are addressed directly.
Levels of Evidence in Nursing Research: FAQ
The Johns Hopkins Evidence-Based Practice model is one widely used version of the evidence hierarchy in nursing, but it's not the only one: other models (GRADE, the 5S model, various organization-specific hierarchies) exist and emphasize slightly different things. Use whichever your program specifies; the core ordering logic (reviews and RCTs above observational studies, observational studies above expert opinion) is broadly consistent across them.
You don't necessarily need to state the evidence level of every single source. For the sources that anchor your key claims, though, and especially in a synthesis paragraph where you're weighing conflicting or converging findings, noting evidence level helps justify why you're weighting one source more heavily than another.
Finding only lower-level evidence is common for newer or narrower topics, and it's worth stating directly: "current evidence on [topic] is limited primarily to descriptive and qualitative studies, indicating a gap that experimental research has not yet addressed." This framing can also support the rationale for your own project.
Programs generally don't penalize a capstone for having a lower evidence level than a published RCT — that's expected given the scope and timeline. What matters is whether you demonstrate awareness of your project's evidence level and its implications, typically in the discussion/limitations section.
A qualitative study can be a better source than an RCT, depending on your research question. If your topic involves understanding experiences, perceptions, or barriers (questions an RCT isn't designed to answer), a well-conducted qualitative study may be more directly relevant, even though it sits lower on the traditional hierarchy. See our mixed-methods guide for how both types of evidence can work together.
Evidence-based clinical practice guidelines that are themselves based on systematic reviews are often placed at Level I, since they represent a synthesis of high-level evidence translated into practice recommendations. However, the quality of a guideline depends on the rigor of the systematic review(s) underlying it — worth checking rather than assuming.
You can reference GRADE alongside your program's primary hierarchy, especially if you encountered GRADE in a source you're citing (many systematic reviews report GRADE certainty ratings directly), but briefly explain the relationship rather than assuming the reader will translate between systems automatically.
Our writers can help classify your sources by evidence level, assess their quality and relevance together, and organize your literature review so the strongest evidence anchors each theme, with gaps addressed directly rather than glossed over.