next.js · May 20, 2026 · 8 min read

Streaming and Suspense in Next.js — How Progressive Loading Actually Works

Suspense boundaries and streaming let your pages load fast even when data is slow. Here's how to use them without breaking everything.

Robert Seghedi

Co-founder, peal.dev

We had a dashboard page that took 3.2 seconds to load. Not because the server was slow — the main content was ready in 200ms. It was one analytics widget making a heavy database query that held up the entire page. Every user stared at a blank screen waiting for a chart that was below the fold anyway. Classic mistake. The fix was Suspense and streaming, and it cut the perceived load time to under 400ms without touching the database query at all.

That's the core promise of streaming in Next.js: stop making users wait for your slowest thing before they can see anything. Ship the fast parts immediately, stream the slow parts as they become ready. It sounds obvious when you say it out loud, but most apps don't do it.

What Streaming Actually Does

Traditional SSR works like this: request comes in, server runs all your data fetching, renders the full HTML, sends it down the wire. The browser gets everything at once. With streaming, the server sends HTML in chunks as it becomes ready. The browser can start rendering and even showing content while the rest of the response is still being generated.

React's Suspense is the mechanism that makes this work in practice. You wrap a component in a Suspense boundary, give it a fallback (usually a skeleton), and React will render the fallback immediately while waiting for the component to resolve. With Next.js App Router, this works end-to-end: the fallback HTML is sent in the initial chunk, and the actual content is streamed in when it's ready.

Streaming doesn't make slow things faster. It makes fast things immediately visible while slow things load in the background. That distinction matters a lot.
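To make the chunk-by-chunk idea concrete, here's a framework-free sketch of what "send the shell first, stream the rest" means. It is a toy model, not how Next.js or React implement it: `Section`, `delay`, `streamPage`, and the `<template data-for>` chunk format are all names we made up for illustration, and `write` stands in for whatever pushes bytes to the client.

```typescript
// Toy model of streamed rendering. All names here are hypothetical.
type Section = { id: string; load: () => Promise<string> };

// Resolves with `value` after `ms` milliseconds (stand-in for a slow query).
const delay = <T>(ms: number, value: T): Promise<T> =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

async function streamPage(
  sections: Section[],
  write: (chunk: string) => void,
): Promise<void> {
  // First chunk: the shell with one placeholder per section,
  // written before any data has resolved.
  write(sections.map((s) => `<div id="${s.id}">loading...</div>`).join(""));

  // Start every loader right away and write each result when it finishes,
  // in whatever order that happens to be.
  await Promise.all(
    sections.map(async (s) => {
      const html = await s.load();
      write(`<template data-for="${s.id}">${html}</template>`);
    }),
  );
}

// Usage: the slow "chart" section delays neither the shell nor "stats".
streamPage(
  [
    { id: "chart", load: () => delay(50, "<p>chart</p>") },
    { id: "stats", load: () => delay(5, "<p>stats</p>") },
  ],
  (chunk) => console.log(chunk),
);
```

The shell is written first, then the stats chunk, then the chart chunk. The chart still takes as long as it always did, but nothing else waits for it.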

The Basic Setup

In the App Router, you get streaming for free by using async Server Components with Suspense boundaries. Here's the pattern we use for almost every data-heavy page:

// app/dashboard/page.tsx
import { Suspense } from 'react'
import { UserStats } from './user-stats'
import { RecentActivity } from './recent-activity'
import { AnalyticsChart } from './analytics-chart'
import { StatsSkeleton, ActivitySkeleton, ChartSkeleton } from './skeletons'

export default function DashboardPage() {
  return (
    <div className="grid grid-cols-3 gap-6">
      {/* Fast query — probably cached */}
      <Suspense fallback={<StatsSkeleton />}>
        <UserStats />
      </Suspense>

      {/* Medium query */}
      <Suspense fallback={<ActivitySkeleton />}>
        <RecentActivity />
      </Suspense>

      {/* Slow query — heavy aggregation */}
      <Suspense fallback={<ChartSkeleton />}>
        <AnalyticsChart />
      </Suspense>
    </div>
  )
}

Each of those components is an async Server Component that fetches its own data. They run in parallel, and each one streams in as soon as it's done. The page shell renders immediately, skeletons appear for everything, and then components pop in as their data arrives.

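// app/dashboard/analytics-chart.tsx
// Assumes Drizzle ORM and date-fns, which is what the query syntax suggests
import { gte, sql } from 'drizzle-orm'
import { subDays } from 'date-fns'
// Assumed project paths: point these at your own db client, schema, and chart
import { db } from '@/lib/db'
import { events } from '@/lib/schema'
import { LineChart } from '@/components/line-chart'

// Exported so page.tsx can import it by name
export async function AnalyticsChart() {
  // This query takes 2 seconds, but it no longer blocks the page
  const data = await db
    // Select only the grouped column and an aggregate; a bare select()
    // combined with GROUP BY is rejected by most SQL databases
    .select({
      day: sql<string>`DATE(${events.createdAt})`,
      count: sql<number>`count(*)`.mapWith(Number),
    })
    .from(events)
    .where(gte(events.createdAt, subDays(new Date(), 30)))
    .groupBy(sql`DATE(${events.createdAt})`)

  return (
    <div className="rounded-lg border p-4">
      <h3 className="font-semibold mb-4">Last 30 Days</h3>
      <LineChart data={data} />
    </div>
  )
}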

The loading.tsx File — Automatic Suspense

Next.js has a shortcut: if you create a loading.tsx file in a route folder, it automatically wraps the entire page in a Suspense boundary with that file as the fallback. It's the right tool when the whole page is loading, like navigating to a new route.

// app/dashboard/loading.tsx
export default function DashboardLoading() {
  return (
    <div className="grid grid-cols-3 gap-6">
      <div className="h-32 rounded-lg bg-muted animate-pulse" />
      <div className="h-32 rounded-lg bg-muted animate-pulse" />
      <div className="h-64 col-span-full rounded-lg bg-muted animate-pulse" />
    </div>
  )
}

The difference between loading.tsx and explicit Suspense boundaries is important. loading.tsx covers the whole page — useful for initial navigation. Explicit Suspense boundaries give you granular control — useful when parts of a page load at different speeds. We use both: loading.tsx for route-level loading states, and explicit Suspense for components that are slower than the rest.

Parallel vs Sequential Fetching — The Mistake Everyone Makes

Suspense doesn't magically parallelize your data fetching. If your components await data sequentially, they're still sequential. This is the most common mistake we see, and we've made it ourselves.

// BAD: sequential fetching — total time = A + B + C
async function SlowDashboard() {
  const user = await getUser()        // 100ms
  const stats = await getStats(user.id)  // 300ms
  const activity = await getActivity(user.id)  // 200ms
  // Total: 600ms before anything renders
  return <div>...</div>
}

// GOOD: parallel fetching. Total time = getUserId + max(stats, activity)
async function FastDashboard() {
  const userId = await getUserId()  // 100ms, needed first, and that's fine
  
  // Start both dependent fetches at the same time
  const [stats, activity] = await Promise.all([
    getStats(userId),    // 300ms
    getActivity(userId)  // 200ms
  ])
  // Total: 100ms + max(300ms, 200ms) = 400ms
  return <div>...</div>
}

// BEST: let Suspense boundaries parallelize across components
// Each component fetches independently — they all start at the same time
async function StatsComponent() {
  const stats = await getStats()  // runs in parallel with other components
  return <StatsUI stats={stats} />
}

async function ActivityComponent() {
  const activity = await getActivity()  // runs in parallel
  return <ActivityUI activity={activity} />
}

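When sibling Server Components each sit in their own Suspense boundary, React starts rendering all of them in the same pass, so their fetches begin at roughly the same time. That's why the component-per-data-source pattern works so well: you get parallelization almost for free. The exception is nesting. A child component can't start fetching until the parent that renders it has finished its own awaits, so an async parent wrapping an async child is still a waterfall.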
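If you want to see the difference rather than take the comments on faith, log when each fetch starts and ends. The sketch below does that with fake fetches. `fakeFetch`, `sequential`, and `parallel` are hypothetical helpers, and the delays are arbitrary stand-ins for real queries.

```typescript
// Hypothetical stand-in for a data fetch: records its start and end in `log`.
function fakeFetch(log: string[], name: string, ms: number): Promise<string> {
  log.push(`start ${name}`);
  return new Promise((resolve) =>
    setTimeout(() => {
      log.push(`end ${name}`);
      resolve(name);
    }, ms),
  );
}

// Each await finishes before the next fetch is even started.
async function sequential(log: string[]): Promise<string[]> {
  const stats = await fakeFetch(log, "stats", 30);
  const activity = await fakeFetch(log, "activity", 20);
  return [stats, activity];
}

// Both fetches are started before either one is awaited.
async function parallel(log: string[]): Promise<string[]> {
  return Promise.all([
    fakeFetch(log, "stats", 30),
    fakeFetch(log, "activity", 20),
  ]);
}

async function main(): Promise<void> {
  const seqLog: string[] = [];
  await sequential(seqLog);
  console.log("sequential:", seqLog.join(" -> "));

  const parLog: string[] = [];
  await parallel(parLog);
  console.log("parallel:  ", parLog.join(" -> "));
}

main();
```

In the sequential log, "start activity" only appears after "end stats". In the parallel log, both starts come before either end, which is the same thing sibling Suspense-wrapped components give you without writing Promise.all yourself.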

Streaming with Client Components — Where It Gets Tricky

Server Components stream naturally. Client Components need a little more thought. If you need to stream data into a Client Component, the cleanest pattern is wrapping it in a Server Component that fetches the data, then passing it down as props.

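// user-profile-server.tsx (file names in this block are illustrative)
// getUserProfile, UserProfile, updateProfile, and ProfileSkeleton are your own code, not shown
// Server Component wrapper: handles async data
import { UserProfileClient } from './user-profile-client'

export async function UserProfileServer() {
  const profile = await getUserProfile()
  // Pass resolved data to Client Component
  return <UserProfileClient initialData={profile} />
}

// user-profile-client.tsx
// Client Component: uses data from server, can add interactivity.
// 'use client' has to sit at the top of its own file, above the imports.
'use client'
import { useState } from 'react'

export function UserProfileClient({ initialData }: { initialData: UserProfile }) {
  const [profile, setProfile] = useState(initialData)
  
  // Now you can add real-time updates, optimistic updates, etc.
  return (
    <div>
      <h2>{profile.name}</h2>
      <button onClick={() => updateProfile(profile)}>Edit</button>
    </div>
  )
}

// page.tsx
// In the page, wrap the server component in Suspense
import { Suspense } from 'react'
import { UserProfileServer } from './user-profile-server'

export default function Page() {
  return (
    <Suspense fallback={<ProfileSkeleton />}>
      <UserProfileServer />
    </Suspense>
  )
}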

This pattern — Server Component fetches, Client Component renders — is the bread and butter of App Router development. The client component gets pre-populated with data from the server, so there's no client-side loading flash, but you still get full interactivity.

Error Boundaries Next to Suspense Boundaries

Here's something we learned after a few production incidents: always pair Suspense boundaries with error boundaries. If a streamed component throws, and you don't have an error boundary, the error will bubble up and potentially crash your whole page. Not great.

// app/dashboard/error.tsx
'use client'

export default function DashboardError({
  error,
  reset,
}: {
  error: Error & { digest?: string }
  reset: () => void
}) {
  return (
    <div className="rounded-lg border border-destructive/50 p-4">
      <p className="text-sm text-muted-foreground">
        Failed to load this section
      </p>
      <button
        onClick={reset}
        className="mt-2 text-sm underline"
      >
        Try again
      </button>
    </div>
  )
}

The error.tsx file in a route folder becomes an error boundary that wraps the page. For more granular control, like having a chart fail without taking out the stats, you want error boundaries at the same level as your Suspense boundaries. React doesn't ship a ready-made ErrorBoundary component, so either write a small class component that implements getDerivedStateFromError, or use the react-error-boundary package.

Suspense without an error boundary is like deploying without error monitoring. You're optimistic in a way that will eventually bite you.
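The isolation idea is easy to model without React. To be clear, this is an analogy and not React's error boundary mechanism: `renderSection` and `renderDashboard` are hypothetical names, and a plain try/catch plays the role the boundary plays in a real component tree.

```typescript
type Rendered = { id: string; html: string };

// Hypothetical per-section wrapper: a failure in `load` is swapped for
// `errorFallback` instead of rejecting the whole page render.
async function renderSection(
  id: string,
  load: () => Promise<string>,
  errorFallback: string,
): Promise<Rendered> {
  try {
    return { id, html: await load() };
  } catch {
    return { id, html: errorFallback };
  }
}

async function renderDashboard(): Promise<Rendered[]> {
  return Promise.all([
    // Healthy section
    renderSection("stats", async () => "<p>42 users</p>", "<p>Stats unavailable</p>"),
    // Failing section: simulated query error
    renderSection(
      "chart",
      async () => {
        throw new Error("query timed out");
      },
      "<p>Failed to load this section</p>",
    ),
  ]);
}

renderDashboard().then((sections) => console.log(sections));
```

Without the per-section catch, the rejected chart promise would reject the whole Promise.all and take the stats down with it. That's the same failure mode as a page with Suspense boundaries and no error boundaries.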

When Not to Use Streaming

Streaming and Suspense aren't always the right answer. A few cases where we've kept the page blocking on its data:

  • Authentication checks: if the user isn't logged in, you want to redirect before rendering anything. Once the first chunk has been sent, the response status is already committed, so a real HTTP redirect is no longer possible. Don't Suspense-wrap your auth check.
  • Critical above-the-fold content: if a user needs to see data to understand the page at all, a skeleton isn't helpful. Await it in the page itself so it arrives with the first chunk.
  • Pages with only one data source: if your whole page depends on one query and it's fast (under 200ms), just fetch it. Suspense adds complexity for no gain.
  • SEO-critical content: while streaming does work with crawlers, if a page's entire value is its content (like a blog post), just render it fully on the server.
  • Small datasets: don't stream a list of 10 items. The overhead isn't worth it.

The rule we follow: use Suspense when you have genuinely independent pieces of data that load at different speeds, and the page is usable without all of them. A dashboard with stats, a feed, and a chart is the perfect use case. A product detail page with a title, description, and price — just fetch it all.

Measuring the Impact

The metrics to watch when you add streaming are Time to First Byte (TTFB), First Contentful Paint (FCP), and Largest Contentful Paint (LCP). Streaming typically improves FCP dramatically — users see something much faster. LCP depends on whether your LCP element is in a Suspense boundary or not.

For the dashboard we mentioned at the start: TTFB stayed roughly the same (it's the first byte, so same), FCP dropped from 3.2s to 0.4s, and LCP improved because the main content area — which was fast — could now render without waiting for the chart. The chart still takes 2 seconds to fully load, but users are already reading and clicking before it appears.

One thing worth tracking: when you add Suspense boundaries, watch for layout shift. If your skeleton isn't the same size as the real content, you'll get CLS (Cumulative Layout Shift) when the real component loads in. Measure this in Chrome DevTools or Vercel Analytics. The fix is making your skeletons match the expected dimensions of the real content — not pixel-perfect, just roughly right.
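For a feel of how much a mismatched skeleton costs, here's a back-of-the-envelope model of one layout shift. A shift's score is its impact fraction times its distance fraction. The sketch simplifies heavily: it assumes a single full-width element pushed straight down, and `ShiftInput`, `layoutShiftScore`, and the pixel numbers are ours, picked for illustration. Treat it as a sanity check next to real measurements, not a substitute for them.

```typescript
// Simplified model: one full-width element is pushed down by `shiftY` px.
interface ShiftInput {
  viewportWidth: number; // px
  viewportHeight: number; // px
  top: number; // element's top edge before the shift, px from viewport top
  height: number; // element height, px
  shiftY: number; // how far the element moves down, px
}

// Clamp the span [start, end] to the visible range [0, limit].
function visibleSpan(start: number, end: number, limit: number): [number, number] {
  return [Math.max(0, Math.min(start, limit)), Math.max(0, Math.min(end, limit))];
}

function layoutShiftScore(input: ShiftInput): number {
  const { viewportWidth, viewportHeight, top, height, shiftY } = input;
  if (shiftY === 0) return 0; // nothing moved, nothing to score

  const [b0, b1] = visibleSpan(top, top + height, viewportHeight);
  const [a0, a1] = visibleSpan(top + shiftY, top + height + shiftY, viewportHeight);

  // Union of the visible area before and after the shift
  const overlap = Math.max(0, Math.min(b1, a1) - Math.max(b0, a0));
  const union = b1 - b0 + (a1 - a0) - overlap;

  // Full width, so the area ratio equals the height ratio
  const impactFraction = union / viewportHeight;
  const distanceFraction = Math.abs(shiftY) / Math.max(viewportWidth, viewportHeight);
  return impactFraction * distanceFraction;
}

// A 128px skeleton replaced by 320px of content pushes the card below it
// down by 192px. A skeleton that matches the content pushes nothing.
const base = { viewportWidth: 1280, viewportHeight: 800, top: 200, height: 300 };
console.log("mismatched:", layoutShiftScore({ ...base, shiftY: 192 }));
console.log("matched:   ", layoutShiftScore({ ...base, shiftY: 0 }));
```

With those made-up numbers, the mismatched skeleton scores roughly 0.092 from a single shift, which nearly uses up the 0.1 budget usually quoted as "good" CLS. The matched one scores 0.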

Putting It Together

Most of the peal.dev templates we build ship with this pattern baked in. Dashboard templates especially — there's no good reason a dashboard should block on its slowest query. The structure is always the same: page component as a layout with Suspense boundaries, each major section as its own async Server Component, skeletons that roughly match the content shape, and error boundaries at every level that matters.

If you're starting fresh: don't try to add Suspense everywhere at once. Pick your slowest page, identify the slowest component on that page, wrap it in a Suspense boundary with a decent skeleton, and measure. You'll probably see enough improvement in that one change to justify doing it everywhere else.

The mental model shift is the hard part. Stop thinking about pages as atomic units that load all at once. Think about them as independent pieces that load as fast as their data allows. Once that clicks, Suspense stops feeling like a React API and starts feeling like the obvious way to build interfaces.
