By RukeRee

AI Infrastructure Study Series

Day 2: HBM vs DRAM vs SSD

Understanding the memory hierarchy of AI infrastructure — and why the next bottleneck often moves from compute to memory and data movement.

Summary

AI infrastructure is not just a compute story. It is also a memory story. HBM sits closest to the accelerator and delivers extreme bandwidth, DRAM acts as the larger system memory layer, and SSD provides persistent storage for model files, checkpoints, and overflow data. As AI workloads scale, bottlenecks often move from raw compute to memory hierarchy and data movement.

1) Why This Matters

Faster chips alone do not solve the full AI problem. A model can only run efficiently if data reaches the compute engine quickly enough. That is why modern AI systems are built around a memory hierarchy rather than a single memory pool.

For investors, this matters because the next bottleneck in AI infrastructure is often not the processor itself, but the system that feeds it: high-bandwidth memory, system DRAM, storage, packaging, and the software stack that moves data across those layers.

2) One-Sentence Definitions

Memory Layer | Simple Definition | Core Strength
HBM | High-bandwidth memory placed very close to the accelerator for extremely fast data movement. | Speed + bandwidth
DRAM | The larger system memory layer used by servers and CPUs to buffer, stage, and manage data. | Capacity + flexibility
SSD | Persistent flash storage used for model files, checkpoints, datasets, and overflow tiers in AI systems. | Scale + persistence

3) A Simple Analogy

The easiest way to understand this is to imagine a work desk.

HBM = the tools on your desk that you can reach instantly

DRAM = the shelf next to your desk where you keep more materials nearby

SSD = the storage room where the larger files and long-term materials are kept

4) What Each Memory Layer Actually Does in AI

HBM: The Fastest Working Memory

HBM is designed for bandwidth-heavy AI workloads. It sits close to the GPU or accelerator and is built to feed the compute engine as quickly as possible. In large-scale training and inference, that matters because the model cannot stay efficient if memory throughput falls behind the rate of computation.
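To make the compute-versus-bandwidth tradeoff concrete, here is a minimal roofline-style sketch in Python. Every number in it (peak FLOPS, HBM bandwidth, the arithmetic intensities) is an illustrative assumption, not a spec for any particular chip.

```python
# Back-of-envelope roofline check: is a workload compute-bound or memory-bound?
# All numbers below are illustrative assumptions, not real chip specs.

PEAK_FLOPS = 1.0e15        # assumed accelerator peak: 1 PFLOP/s
HBM_BANDWIDTH = 3.0e12     # assumed HBM bandwidth: 3 TB/s

def achievable_flops(arithmetic_intensity_flops_per_byte: float) -> float:
    """Roofline model: achievable throughput is capped by either
    peak compute or (bandwidth x arithmetic intensity)."""
    memory_bound_ceiling = HBM_BANDWIDTH * arithmetic_intensity_flops_per_byte
    return min(PEAK_FLOPS, memory_bound_ceiling)

for intensity in [1, 10, 100, 1000]:  # FLOPs performed per byte moved
    flops = achievable_flops(intensity)
    pct = 100 * flops / PEAK_FLOPS
    print(f"intensity {intensity:>5} FLOP/B -> {flops:.2e} FLOP/s "
          f"({pct:.0f}% of peak)")
```

With these assumed numbers, the crossover sits around 333 FLOPs per byte: below it, adding HBM bandwidth helps more than adding compute, which is the whole point of putting HBM next to the accelerator.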

DRAM: The System Buffer

DRAM is not as fast as HBM, but it is more scalable as a general-purpose server memory layer. It acts as a staging area for model loading, buffering, and system-level coordination. In practical terms, this means DRAM often carries data that is too large, too cold, or too expensive to keep in HBM all the time.

SSD: The Capacity Layer

SSD is not working memory in the same way as HBM or DRAM, but it is still essential. Model weights, checkpoints, datasets, and long-tail inference data often begin or end their lives in storage. As AI systems scale, SSD becomes part of the performance conversation because loading and moving large assets quickly is no longer optional.
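As a rough illustration of why storage throughput enters the performance conversation, here is a sketch estimating how long it takes to load model weights from storage. The model size and drive speeds are assumptions chosen for round numbers.

```python
# Rough estimate: time to load model weights from storage into memory.
# Sizes and speeds are illustrative assumptions.

MODEL_SIZE_GB = 140  # e.g., a ~70B-parameter model at 2 bytes per weight

drives = {
    "SATA SSD (~0.5 GB/s)": 0.5,
    "NVMe SSD (~7 GB/s)": 7.0,
    "8x NVMe striped (~50 GB/s)": 50.0,
}

for name, gb_per_s in drives.items():
    seconds = MODEL_SIZE_GB / gb_per_s
    print(f"{name}: {seconds:.0f} s to load {MODEL_SIZE_GB} GB")
```

At fleet scale, the difference between a five-minute and a three-second model load is the difference between storage being invisible and storage being part of the serving architecture.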

5) Where the Bottleneck Shows Up

Training

In training, the main challenge is feeding massive amounts of data and parameters into compute units fast enough. HBM becomes critical here because large models create enormous memory bandwidth demands, and slow movement can leave expensive accelerators underutilized.

Inference

In inference, the challenge shifts toward latency, cost, and memory tiering. As models handle longer context windows and more requests, some data must move between GPU memory, CPU memory, and storage. That makes the memory hierarchy itself part of the inference architecture.
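One concrete driver of that tiering is the key-value (KV) cache, which grows with context length. The sketch below estimates KV-cache size for a hypothetical transformer shape and picks a tier; the model dimensions and capacity budgets are invented round numbers, not any real system's figures.

```python
# Estimate KV-cache size for a decoder-style transformer and choose a tier.
# Model shape and tier capacities are hypothetical round numbers.

def kv_cache_gb(layers, kv_heads, head_dim, context_len, batch, bytes_per_elem=2):
    # 2x for keys and values
    elems = 2 * layers * kv_heads * head_dim * context_len * batch
    return elems * bytes_per_elem / 1e9

GPU_HBM_FREE_GB = 30    # assumed HBM left over after weights
CPU_DRAM_FREE_GB = 256  # assumed host memory budget

for ctx in [8_000, 128_000, 1_000_000]:
    size = kv_cache_gb(layers=80, kv_heads=8, head_dim=128, context_len=ctx, batch=1)
    if size <= GPU_HBM_FREE_GB:
        tier = "fits in HBM"
    elif size <= CPU_DRAM_FREE_GB:
        tier = "spills to DRAM"
    else:
        tier = "spills to SSD"
    print(f"context {ctx:>9,}: KV cache ~{size:.1f} GB -> {tier}")
```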

This is why the phrase “memory bottleneck” matters so much in AI. Compute can improve, but if memory and data movement do not improve with it, system efficiency breaks down.

6) So Which One Is Better?

The better question is not which memory is “best,” but which layer is right for the job.

  • HBM: Best when extreme bandwidth and proximity to the accelerator matter most.
  • DRAM: Best when the system needs a larger, more flexible working layer.
  • SSD: Best when persistence, scale, and lower-cost capacity matter more than raw speed.

7) Why Investors Should Care

AI is not just a race for faster chips. It is also a race to solve the memory hierarchy. That means value can accrue not only to accelerator vendors, but also to memory suppliers, storage providers, advanced packaging players, and system software companies that make tiered memory practical at scale.

If Day 1 explains who performs the work, Day 2 explains what makes that work sustainable. The faster the model, the more valuable efficient memory and data movement become.

8) What I Learned Today

  • HBM is the fastest and most bandwidth-focused memory layer in modern AI systems.
  • DRAM acts as the larger system memory layer that supports staging, buffering, and coordination.
  • SSD is not just cheap storage — it increasingly matters for model loading, persistence, and overflow data in large AI systems.

9) One Question I’m Still Thinking About

If compute keeps improving faster than memory economics, does the real AI bottleneck eventually shift from chips to data movement and storage architecture?

10) What Comes Next

In Day 3, I’ll move from memory to networking and study NVLink vs InfiniBand vs Ethernet. Once data moves inside a chip, the next question is how it moves across many chips and many servers.

Continue the AI Infrastructure Study Series

This series is designed to make the AI stack easier to follow — one layer at a time, from compute and memory to networking, packaging, and inference economics.

Next: Day 3 — NVLink vs InfiniBand vs Ethernet
Stock Market Updates

Weekly Market Recap (March 23–27, 2026)

U.S. equities tumbled again as Wall Street’s alarm over the Iran war deepened, sending the S&P 500 to a fifth straight weekly loss and dragging major indexes toward correction territory.

Markets increasingly treated oil as the dominant macro variable, with traders focusing less on political headlines and more on the physical reality of tanker traffic, troop movements, and the shrinking buffer of available supply.

Index Performance (Weekly)

Index | Weekly Change
S&P 500 | −3.22%
Nasdaq | −4.55%
Dow Jones | −2.25%

Sector Snapshot (1-Week)

Basic Materials +5.41%
Energy +4.73%
Utilities +2.41%
Consumer Defensive +1.26%
Industrials −0.72%
Healthcare −0.73%
Real Estate −0.89%
Consumer Cyclical −1.29%
Financial −1.48%
Technology −3.08%
Communication Services −6.45%

The Score — What Drove the Market

  • Wall Street’s alarm rose: Investors increasingly traded as if the worst economic pain from the Iran war still lies ahead, pushing the S&P 500 to a fifth straight weekly loss.
  • Oil became the singular variable: Brent crude closed above $112 as the effective closure of the Strait of Hormuz tightened supply expectations and kept energy markets on edge.
  • Headlines lost influence: Traders began discounting political messaging and focused more on troop movements, tanker traffic, and physical supply constraints.
  • Fear trade intensified: Demand for bearish equity options jumped, inflation expectations rose, and bets on Fed rate cuts were pulled back as higher oil threatened growth.
  • Consumers turned more cautious: March sentiment worsened as higher energy prices and stock-market weakness weighed more heavily on household outlooks.

Key Takeaway

The market is no longer pricing a short disruption. It is increasingly preparing for a sustained oil shock, tighter financial conditions, and a slower path for rate relief. Energy and hard-asset exposure remain the clearest winners, while growth-heavy sectors continue to bear the brunt of rising macro stress.

Week ended March 27, 2026. Data based on provided figures.

AI Infrastructure Study Series

Day 1: GPU vs ASIC vs CPU

Understanding the compute layer of AI infrastructure — and why investors should not look at GPUs alone.

Summary

AI infrastructure starts with compute, but not all chips play the same role. GPUs are the most flexible and dominant accelerators for large-scale AI workloads, ASICs are purpose-built chips designed for specific tasks with stronger efficiency in narrow use cases, and CPUs remain the control layer that keeps the overall system running. The real lesson for investors is simple: AI is not a one-chip story. It is a stack.

1) Why This Matters

Many people think AI infrastructure begins and ends with Nvidia. That is understandable, because GPUs sit at the center of today’s AI boom. But to really understand where value is created, it helps to step back and look at the broader compute layer. GPUs, ASICs, and CPUs each play different roles, and the balance between them shapes cost, performance, and long-term competitive advantage.

This is why Day 1 starts here. Before studying memory, networking, packaging, or inference economics, it is important to understand the basic job of each chip.

2) One-Sentence Definitions

Chip | Simple Definition | Core Strength
GPU | A highly parallel processor built to accelerate massive computations, especially in AI training and inference. | Flexibility + scale
ASIC | A purpose-built chip optimized for a specific workload or model type. | Efficiency in narrow tasks
CPU | A general-purpose processor that manages the system and supports broader computing tasks around AI workloads. | Control + orchestration

3) A Simple Analogy

The easiest way to understand this is to imagine a factory.

CPU = the factory manager

GPU = the large, flexible production line

ASIC = the specialized machine built to do one task extremely well

4) What Each Chip Actually Does in AI

GPU: The Main Workhorse

In today’s AI market, the GPU is the dominant general-purpose accelerator. It is powerful enough for large-scale training and still flexible enough for a wide range of inference workloads. That flexibility matters. When models change quickly, or when developers want one common platform across many applications, GPUs are usually the default choice.

ASIC: The Specialized Competitor

ASICs matter because the largest cloud companies do not want to rely forever on the same economics as everyone else. If a hyperscaler can design a chip for a narrower workload and run that workload more efficiently, it can lower cost and improve internal control. That is where products like TPU, Trainium, and Inferentia become important.
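The hyperscaler ASIC logic is essentially a fixed-cost-versus-unit-cost tradeoff. Here is a toy break-even sketch; every number in it (design cost, per-unit costs) is an invented assumption, since real figures are not public.

```python
# Back-of-envelope: when does a custom ASIC's fixed design cost pay off?
# All figures are invented, illustrative assumptions.

ASIC_NRE = 500e6              # assumed one-time design/tape-out cost
GPU_COST_PER_UNIT = 30_000    # assumed cost per merchant GPU
ASIC_COST_PER_UNIT = 12_000   # assumed cost per equivalent unit of ASIC work

def total_cost(n_units: int, per_unit: float, fixed: float = 0.0) -> float:
    return fixed + n_units * per_unit

# Break-even volume: fixed cost / per-unit savings
break_even = ASIC_NRE / (GPU_COST_PER_UNIT - ASIC_COST_PER_UNIT)
print(f"break-even at ~{break_even:,.0f} equivalent units")

for n in [10_000, 50_000, 100_000]:
    gpu = total_cost(n, GPU_COST_PER_UNIT)
    asic = total_cost(n, ASIC_COST_PER_UNIT, fixed=ASIC_NRE)
    cheaper = "ASIC" if asic < gpu else "GPU"
    print(f"{n:>7,} units: GPU ${gpu/1e9:.2f}B vs ASIC ${asic/1e9:.2f}B -> {cheaper}")
```

The shape of the answer is what matters: below some deployment volume, buying merchant GPUs wins; above it, the custom chip's lower unit cost overwhelms the fixed design cost. That is why only the largest clouds build ASICs.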

CPU: Still the System Coordinator

The CPU is often underestimated in AI discussions. But AI servers are not just accelerators plugged into empty boxes. Someone still has to manage data movement, system control, orchestration, scheduling, and many surrounding software tasks. In that sense, the CPU remains the coordinator of the broader machine.

5) Training vs Inference

Training

Training is the process of teaching a model. The system repeatedly processes data, compares predictions with targets, and updates model weights. This usually demands the highest raw compute power and is where GPUs have become especially dominant.

Inference

Inference is the process of using a trained model to answer new inputs. Here, cost efficiency, latency, and throughput become more important. This is where GPUs still matter, but ASICs and CPUs can play a larger role depending on the use case.

This difference is important because not every AI dollar goes to the same part of the stack. Training rewards raw compute leadership. Inference often rewards efficiency, system design, and cost control.
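To make the training/inference distinction concrete, here is a minimal sketch in plain NumPy: training repeatedly updates weights from data, while inference is a single cheap forward pass with frozen weights. The toy model and data are invented purely for illustration.

```python
import numpy as np

# Toy linear model: y = x @ w. Training updates w; inference only reads it.
rng = np.random.default_rng(0)
x = rng.normal(size=(256, 4))              # toy inputs
true_w = np.array([1.0, -2.0, 0.5, 3.0])
y = x @ true_w + 0.01 * rng.normal(size=256)

# --- Training: iterate over data, compute gradients, update weights ---
w = np.zeros(4)
lr = 0.1
for step in range(200):
    pred = x @ w
    grad = 2 * x.T @ (pred - y) / len(x)   # gradient of mean squared error
    w -= lr * grad                          # the expensive, repeated part

# --- Inference: one forward pass with frozen weights ---
new_input = np.array([[0.2, 0.1, -0.4, 1.0]])
print("prediction:", new_input @ w)
print("learned weights:", np.round(w, 2))
```

Even in this toy, the cost asymmetry is visible: training touches every example many times, while inference touches the weights once per request. That is why the two stages reward different hardware.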

6) So Which One Is Better?

The wrong answer is to say one chip is simply “the best.” The better answer is that each one wins under different conditions.

  • GPU: Best when flexibility, ecosystem, and broad workload support matter.
  • ASIC: Best when scale and specialization justify building around a narrow workload.
  • CPU: Best when general-purpose control, orchestration, and support functions are the priority.

7) Why Investors Should Care

The real investment takeaway is that AI infrastructure should not be viewed as a single-product story. GPUs capture the broadest demand, but ASICs reveal how hyperscalers try to improve their own economics, and CPUs remain essential because AI systems still need a host layer to function efficiently.

In other words, AI is not just about who sells the fastest chip. It is also about who controls the platform, who captures the system economics, and where the bottleneck moves next.

8) What I Learned Today

  • GPU is the most flexible and widely used AI accelerator today.
  • ASIC is not automatically better than GPU — it becomes attractive when specialization and efficiency matter more.
  • CPU is still critical because AI infrastructure is a full system, not just a box full of accelerators.

9) One Question I’m Still Thinking About

If GPUs are so dominant, why are hyperscalers still spending heavily to build their own ASICs?

10) What Comes Next

In Day 2, I’ll move from compute to memory and study HBM vs DRAM vs SSD. That is where the story gets even more interesting, because raw compute power means less if data cannot move fast enough.

Follow the AI Infrastructure Study Series

I’m documenting this series to better understand how the AI stack really works — from compute and memory to networking, packaging, and inference economics.

Next: Day 2 — HBM vs DRAM vs SSD
Stock Market Updates

Weekly Market Recap (March 16–20, 2026)

U.S. equities fell for a fourth straight week as the deepening Middle East conflict kept oil prices elevated and pushed investors toward a more defensive posture.

Rising crude, a stronger dollar, firmer Treasury yields, and growing discussion of a possible Fed hike combined to pressure nearly every major sector outside of energy.

Index Performance (Weekly)

Index | Weekly Change
S&P 500 | −2.88%
Nasdaq | −3.25%
Dow Jones | −2.92%

Sector Snapshot (1-Week)

Energy +3.23%
Financial −0.30%
Communication Services −1.70%
Technology −1.80%
Industrials −1.91%
Healthcare −2.80%
Consumer Cyclical −3.18%
Real Estate −3.98%
Consumer Defensive −4.67%
Utilities −4.90%
Basic Materials −7.58%

The Score — What Drove the Market

  • Fourth straight weekly decline: All three major U.S. indexes extended their losing streak as the market adjusted to a longer and more economically consequential conflict.
  • Oil shock intensifies: Brent crude rose above $112, extending a massive monthly gain and reinforcing fears that energy costs will keep inflation elevated.
  • Fed outlook shifts: Traders sharply reduced expectations for rate cuts and even began pricing in the possibility of a rate hike later this year.
  • Rates and dollar pressure: Treasury yields climbed alongside the U.S. dollar, tightening financial conditions and weighing on most equity sectors.
  • Credit stress concern: Redemption pauses and markdowns at some private-credit firms added to worries that higher energy prices could spill into broader financial instability.

Key Takeaway

The market is no longer treating the conflict as a brief geopolitical shock. With oil prices still climbing and rate-cut hopes fading, investors are increasingly pricing a stagflationary backdrop in which energy remains the main hedge and most other sectors stay under pressure.

Week ended March 20, 2026. Data based on provided figures.

AI Infrastructure Study

The AI Infrastructure Map: How HBM, GPUs, ASICs, Networking, and Data Centers All Fit Together

A beginner-friendly guide to the full AI stack — from compute chips and memory to cloud monetization.

Why this matters: Most people talk about AI as if it were just one chip or one company. But AI is not a single product. It is a full infrastructure chain. The real story starts with demand from cloud providers and AI applications, then flows through software, compute, memory, networking, manufacturing, and finally into power-hungry data centers. If you want to understand the AI boom deeply, you need to understand the whole system — not just one layer.

1) The Simplest Way to Think About AI Infrastructure

The easiest mistake is to look at AI infrastructure as a list of buzzwords: HBM, GPU, ASIC, networking, foundry, cloud. A better way is to think of it as a workflow. AI demand starts at the top, with hyperscalers, model builders, and enterprise customers who want to train models or run inference. That demand then moves through the software stack, into compute chips, into memory and networking, into systems and manufacturing, and finally into physical data centers powered by electricity and cooling.

In other words, AI is a chain. If one link is weak, the whole system slows down. A powerful chip is useless without enough memory bandwidth. Fast chips are wasted if networking is weak. Great chip designs mean little if advanced packaging is constrained. And none of it matters if AI spending cannot eventually turn into real monetization.
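The "weakest link" intuition can be expressed as a tiny calculation: end-to-end throughput of a chained pipeline is capped by its slowest stage. The stage names and numbers below are invented purely to illustrate the point.

```python
# End-to-end throughput of a chained pipeline is the minimum of its stages.
# Stage throughputs are invented, illustrative numbers (relative units).

stages = {
    "storage -> DRAM": 10.0,
    "DRAM -> HBM": 25.0,
    "HBM -> compute": 40.0,
    "chip-to-chip network": 8.0,
}

bottleneck = min(stages, key=stages.get)
print(f"system throughput: {stages[bottleneck]} (limited by '{bottleneck}')")

# Doubling a non-bottleneck stage changes nothing; fixing the bottleneck does.
stages["chip-to-chip network"] *= 2
bottleneck = min(stages, key=stages.get)
print(f"after upgrading the network: {stages[bottleneck]} "
      f"(new limit: '{bottleneck}')")
```

Notice that after the network upgrade, the bottleneck simply moves to storage. That is the pattern this whole article describes: value migrates to whichever layer is currently the minimum.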

2) The Whole Workflow at a Glance

1. End Demand / Monetization

Cloud AI, enterprise AI, copilots, agents, ads, AI applications

This is where the economic justification for AI spending must come from.

2. Software Stack

CUDA, Neuron, TPU software, compilers, orchestration

Hardware only matters if developers can actually use it efficiently.

3. Compute Layer

GPU / ASIC / CPU

This is the "brain" that performs AI training and inference.

4A. Memory

HBM / DRAM / SSD

Feeds data into the AI chip at very high speed.

4B. Networking

NVLink / InfiniBand / Ethernet

Connects many chips, servers, and racks into one system.

5. System Layer

Boards / servers / racks / clusters

Customers do not buy chips alone. They buy complete working systems.

6. Foundry & Advanced Packaging

Wafer fabrication / packaging / HBM integration / testing

Even the best design is meaningless if it cannot be manufactured at scale.

7. Physical Infrastructure

Data centers / power / cooling / optics / cables

As AI clusters grow, power and cooling can become the next big bottleneck.

3) Start with Compute: GPU, ASIC, and CPU

The compute layer is where AI work actually happens. The most familiar product is the GPU. NVIDIA's Blackwell platform is a good example of why the market increasingly treats AI as infrastructure instead of just chips: Blackwell systems combine GPUs, Grace CPUs, NVLink interconnect, and networking like Quantum-X800 InfiniBand and Spectrum-X800 Ethernet into one tightly integrated platform.

ASICs are different. They are custom AI accelerators built for more specific workloads or internal cloud use. Google's TPU, AWS Trainium2, and Microsoft Maia are the most important examples. AWS says Trainium2 offers up to 4x the performance of first-generation Trainium and 30–40% better price-performance than certain GPU-based EC2 instances, while Google says its Trillium TPU doubled HBM capacity and bandwidth versus TPU v5e and significantly improved efficiency.

The CPU is still critical too. It handles orchestration, scheduling, control logic, and many non-accelerator tasks inside the system. In modern AI infrastructure, CPU, accelerator, and networking are increasingly designed together rather than as separate pieces.

4) Why HBM Matters So Much

HBM stands for High Bandwidth Memory. It is one of the most important parts of the AI story because a powerful AI accelerator is often limited not only by raw compute, but by how fast data can move in and out of the chip. That is why HBM is attached close to the accelerator package: it feeds the chip much faster than ordinary memory can.
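A common way to see why this matters is the memory-bound decode estimate: if every generated token requires reading roughly all model weights once, tokens per second is approximately bandwidth divided by model size. The sketch below applies that rule of thumb; all figures are illustrative assumptions, not product specs.

```python
# Rough upper bound on decode speed when generation is memory-bandwidth-bound:
# each new token reads (approximately) all model weights once.
# All figures are illustrative assumptions, not product specs.

MODEL_BYTES = 140e9  # ~70B parameters at 2 bytes each

memory_tiers_gb_per_s = {
    "HBM-class (~3000 GB/s)": 3000,
    "DDR DRAM (~300 GB/s)": 300,
    "NVMe SSD (~7 GB/s)": 7,
}

for name, gbps in memory_tiers_gb_per_s.items():
    tokens_per_s = (gbps * 1e9) / MODEL_BYTES
    print(f"{name}: ~{tokens_per_s:.1f} tokens/s upper bound")
```

Under these assumptions, the same model serves tens of tokens per second from HBM, a couple from DRAM, and essentially nothing from SSD. The memory tier, not the compute, sets the ceiling.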

This is also why SK hynix, Samsung, and Micron matter so much. SK hynix has explicitly positioned HBM3E and HBM4 as core drivers of the AI memory cycle in 2026, and in March 2026 it highlighted HBM4's much higher bandwidth and improved power efficiency versus the prior generation.

The relationship is simple: better AI chips pull more HBM demand. If NVIDIA, AMD, Google, AWS, and Microsoft all want more powerful systems, they also need more advanced memory and more sophisticated packaging to make the system work.

5) Networking Is Not Optional — It Is the System

One of the biggest beginner mistakes is to think AI performance comes mostly from one chip. In reality, large AI models depend on many chips communicating quickly. That makes networking a first-class layer of the AI stack.

At the tightest level, there is scale-up networking: connecting chips within a node or rack. NVIDIA uses NVLink for this. At the broader level, there is scale-out networking: connecting servers and racks across the data center, often with InfiniBand or Ethernet fabrics. NVIDIA's Blackwell platform explicitly combines GPU compute with 800Gb/s networking, because fast chips without fast communication create new bottlenecks.
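To see why interconnect bandwidth becomes a first-class concern, here is a sketch of the standard ring all-reduce time estimate, 2(N−1)/N × (buffer size ÷ link bandwidth), applied at assumed scale-up and scale-out speeds. The link speeds and gradient size are illustrative, not vendor specs.

```python
# Ring all-reduce lower-bound time: each of N workers sends/receives
# 2*(N-1)/N of the gradient buffer over its link.
# Link speeds and buffer size are illustrative assumptions.

def allreduce_seconds(buffer_bytes: float, n_workers: int,
                      link_gb_per_s: float) -> float:
    traffic = 2 * (n_workers - 1) / n_workers * buffer_bytes
    return traffic / (link_gb_per_s * 1e9)

GRADIENT_BYTES = 28e9  # e.g., 14B parameters at 2 bytes each

for name, gbps in [("scale-up link (~900 GB/s)", 900),
                   ("scale-out fabric (~100 GB/s)", 100)]:
    t = allreduce_seconds(GRADIENT_BYTES, n_workers=8, link_gb_per_s=gbps)
    print(f"{name}: ~{t*1000:.0f} ms per all-reduce")
```

If that synchronization happens every training step, the gap between tens and hundreds of milliseconds is the difference between busy accelerators and idle ones, which is why networking is sold as part of the platform.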

This is why companies like Broadcom, Marvell, and Arista matter even though they do not sell flagship merchant GPUs. AI is moving toward a world where the cluster matters as much as the chip.

6) Foundry and Packaging: The Hidden Bottleneck

A strong chip design is not the same thing as real supply. Advanced AI chips need leading-edge wafer manufacturing, advanced packaging, chip-to-chip integration, and memory stacking. That is why TSMC and Samsung are strategically important even when they are not the most visible names to retail investors.

This is especially true for HBM-based systems. HBM only creates value if it can be integrated successfully with the accelerator package at scale. In practice, that means manufacturing capacity and packaging capacity can be just as important as chip demand.

7) AI Is Also a Data Center Story

AI infrastructure does not stop at semiconductors. As clusters become denser, physical deployment becomes more important: racks need more power, more cooling, more optical connectivity, and more careful system design. This is why AI increasingly looks like an infrastructure buildout rather than a pure semiconductor cycle.

Microsoft's Maia messaging, for example, frames AI infrastructure from silicon to software to systems, which reflects how hyperscalers now think about the stack: not as isolated components, but as an integrated platform that must work end-to-end.

8) Major Companies by Layer

Layer | Key Companies | Why They Matter
GPU | NVIDIA, AMD | They supply the main merchant AI accelerators.
ASIC | Google, AWS, Microsoft | They build custom chips to improve performance, efficiency, and cost inside their own clouds.
HBM / Memory | SK hynix, Samsung, Micron | They supply the memory bandwidth that modern AI chips depend on.
Foundry / Packaging | TSMC, Samsung | They turn advanced designs into real physical products at scale.
Networking | NVIDIA, Broadcom, Marvell, Arista | They connect chips, servers, and racks into large AI clusters.
Systems / Servers | NVIDIA systems, Dell, HPE, Supermicro, ODMs | They integrate components into complete AI infrastructure.
Cloud / Demand | AWS, Azure, Google Cloud, Meta | They decide capex levels and must eventually turn AI infrastructure into revenue.

9) The Most Important Relationships

HBM ↔ GPU / ASIC

Stronger AI chips need more memory bandwidth, so compute demand often pulls HBM demand with it.

GPU / ASIC ↔ Networking

Fast chips alone do not create a fast cluster. Communication speed matters just as much.

Design ↔ Packaging

A winning chip design still loses if packaging capacity is constrained.

Capex ↔ Monetization

The full AI stack only remains healthy if spending eventually turns into durable revenue and cash flow.

10) What Beginners Should Study First

If you are trying to build a mental map of this industry, do not study every company at once. Start with the architecture. First, understand the difference between GPU, ASIC, and CPU. Then learn why HBM matters more than ordinary memory for AI. After that, study scale-up versus scale-out networking, then move to foundry and advanced packaging, and finally to data center power and cooling.

Once those layers make sense, the company map becomes much easier. You stop seeing AI as a list of ticker symbols and start seeing it as a system with bottlenecks, pricing power, and shifting winners as the market moves from training-heavy demand toward a more inference-heavy future.

Final takeaway: The AI boom is not just about one company building the best chip. It is about how demand flows through a chain: software, compute, memory, networking, packaging, systems, and physical infrastructure. The more you understand how those layers depend on each other, the better you can understand both the technology and the investment map.

Sources: NVIDIA Blackwell platform and networking details; AWS Trainium2 / Trn2; Google Trillium TPU; SK hynix HBM market outlook and HBM4 product updates.

Stock Market Updates

Weekly Market Recap (March 9–13, 2026)

U.S. equities declined sharply as escalating conflict in the Middle East sent oil prices surging and reignited fears of inflation and global economic disruption.

A de facto closure of the Strait of Hormuz pushed crude prices sharply higher, triggering a broad selloff in economically sensitive sectors while energy stocks stood out as the market’s primary hedge against geopolitical risk.

Index Performance (Weekly)

Index | Weekly Change
S&P 500 | −2.06%
Nasdaq | −1.59%
Dow Jones | −2.87%

Sector Snapshot (1-Week)

Energy +1.43%
Technology −1.42%
Communication Services −1.72%
Real Estate −2.26%
Utilities −2.53%
Consumer Cyclical −2.72%
Financial −3.13%
Healthcare −4.62%
Industrials −4.69%
Consumer Defensive −5.19%
Basic Materials −9.58%

The Score — What Drove the Market

  • Middle East Conflict: Escalating war involving Iran and disruptions around the Strait of Hormuz sparked fears of global supply shocks.
  • Oil Shock: U.S. crude surged more than 25% in five sessions, marking the largest jump since 2020 and lifting energy stocks.
  • Stagflation Concerns: Rising energy costs increased fears that inflation could rebound while economic growth slows.
  • Rate Expectations: Treasury yields climbed as traders dialed back expectations for Federal Reserve rate cuts.
  • AI Policy Risk: Reports that the U.S. could restrict global AI-chip exports weighed on semiconductor sentiment.

Key Takeaway

Markets were reminded how quickly geopolitical shocks can ripple through inflation expectations, interest-rate outlooks, and equity valuations. With oil acting as the dominant macro driver, investors increasingly view energy exposure as the primary hedge against geopolitical risk.

Week ended March 13, 2026. Data based on provided figures.
