What is DeepSeek’s Engram?

Posted on January 14, 2026

DeepSeek has released a technical paper describing “Engram,” a conditional-memory technique that lets AI models query a lookup table of information held in system memory. By committing sequences of data to static memory, Engram reportedly achieves higher performance on long-context queries. The approach reduces how much a model must reason its way back to knowledge it could simply look up, freeing GPUs to focus on more complex tasks. It also improves performance while reducing the industry’s heavy reliance on scarce High-Bandwidth Memory (HBM).

The paper details how N-grams (contiguous sequences of N tokens) are integrated into the model’s neural network, where they act as keys into a queryable memory bank. Engram allows models to simply “remember” facts rather than reason them out, a process that is far more computationally expensive. Released on the company’s GitHub page, Engram aims to curb reliance on GPU memory by committing a knowledge library to more common system memory, including memory attached over interconnects such as CXL, so that static memory can be held separately from an LLM’s compute.
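To make the idea of an N-gram keyed memory bank concrete, here is a minimal sketch in Python. It is a toy illustration only, not DeepSeek’s implementation: the names `extract_ngrams` and `NGramMemory` are hypothetical, and the stored values are plain strings, whereas a real model would store learned vectors.

```python
from typing import Dict, List, Optional, Tuple

NGram = Tuple[str, ...]


def extract_ngrams(tokens: List[str], n: int) -> List[NGram]:
    """Return every contiguous run of n tokens, in order of appearance."""
    if n <= 0:
        raise ValueError("n must be positive")
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


class NGramMemory:
    """Toy lookup table keyed by N-grams (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._table: Dict[NGram, str] = {}

    def write(self, ngram: NGram, value: str) -> None:
        self._table[ngram] = value

    def read(self, ngram: NGram) -> Optional[str]:
        # A miss returns None: the caller falls back to ordinary computation.
        return self._table.get(ngram)


memory = NGramMemory()
memory.write(("capital", "of", "France"), "Paris")

tokens = "the capital of France is".split()
hits = {
    gram: memory.read(gram)
    for gram in extract_ngrams(tokens, 3)
    if memory.read(gram) is not None
}
print(hits)  # {('capital', 'of', 'France'): 'Paris'}
```

The point of the sketch is the cost profile: a read is a single dictionary lookup, regardless of how much work it originally took to learn the stored value.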

As detailed in the paper, an Engram-based model scaled to nearly 27 billion parameters can outperform a standard Mixture of Experts (MoE) model in long-context evaluations. Standard MoE models rely on “conditional computation,” which forces the model to reconstruct the same pieces of static knowledge every time they are referenced. Engram eliminates this computational waste by asking, in effect, “Do I already have this data?” and looking the answer up. This avoids what the paper describes as “expensive runtime reconstruction of a static lookup table,” saving valuable sequential depth for higher-level reasoning.

Engram is distinct from solutions like Nvidia’s KVCache, which offloads context data to NVMe storage. While KVCache acts as short-term memory for recent conversation history, akin to keeping handwritten notes, Engram acts as a persistent record, akin to a whole encyclopedia. Tokenizer compression reduces the effective vocabulary size, and “Multi-Head Hashing” allows rapid lookup of stored entries. Context-aware gating then helps ensure that distinct concepts (like “Universal” vs. “Universal Studios”) are not confused at retrieval time.
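The hashing and gating steps can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper’s exact formulation: the function names (`head_slot`, `lookup`, `gate`), the table sizes, the random table contents, and the sigmoid-of-dot-product gate are all choices made here for the example. Each head hashes the same N-gram with a different salt, so two N-grams that collide in one head are unlikely to collide in every head.

```python
import hashlib
import math
import random
from typing import List, Sequence

NUM_HEADS, TABLE_SIZE, DIM = 4, 64, 2  # hypothetical toy sizes


def head_slot(ngram: Sequence[str], head: int, table_size: int) -> int:
    """Deterministically map (N-gram, head index) to a slot in that head's table."""
    key = f"{head}\x1f" + "\x1f".join(ngram)
    digest = hashlib.blake2b(key.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big") % table_size


def lookup(ngram: Sequence[str], tables: List[List[List[float]]]) -> List[float]:
    """Concatenate the vectors the N-gram hashes to, one vector per head."""
    out: List[float] = []
    for head, table in enumerate(tables):
        out.extend(table[head_slot(ngram, head, len(table))])
    return out


def gate(hidden: Sequence[float], memory: Sequence[float]) -> float:
    """Scalar in (0, 1): how well the retrieved memory agrees with the context."""
    score = sum(h * m for h, m in zip(hidden, memory)) / math.sqrt(len(hidden))
    return 1.0 / (1.0 + math.exp(-score))


# Stand-in for learned embedding tables: random values with a fixed seed.
rng = random.Random(0)
tables = [
    [[rng.uniform(-1.0, 1.0) for _ in range(DIM)] for _ in range(TABLE_SIZE)]
    for _ in range(NUM_HEADS)
]

memory = lookup(("Universal", "Studios"), tables)
agreeing_context = list(memory)               # context aligned with the memory
conflicting_context = [-m for m in memory]    # context pointing the other way

print(gate(agreeing_context, memory) > 0.5)     # True: memory is let through
print(gate(conflicting_context, memory) < 0.5)  # True: memory is suppressed
```

In this sketch the gate multiplies the retrieved vector before it is added back into the model, so a memory that does not fit the current context contributes little even if the hash lookup returned it.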

DeepSeek also explored the optimal balance between memory and compute, finding a “U-curve” in which allocating roughly 20–25% of the sparse parameter budget to Engram yields the best performance. In an experiment dubbed the “Infinite Memory Regime,” they found that performance scales linearly with memory size even when the compute budget is fixed. This implies that future AI improvements may not be bound solely by compute power, but could also come from expanding long-term “Engram” memory banks using standard DRAM within data centers.
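As a quick worked example of what a 20–25% allocation means in practice, the sketch below splits a sparse parameter budget between Engram memory and MoE experts. The 20-billion-parameter budget is a hypothetical round number chosen for the arithmetic, not a figure from the paper, and `split_sparse_budget` is an illustrative helper.

```python
from typing import Tuple


def split_sparse_budget(sparse_params: float, engram_fraction: float) -> Tuple[float, float]:
    """Split a sparse parameter budget into (engram_params, expert_params)."""
    if not 0.0 <= engram_fraction <= 1.0:
        raise ValueError("engram_fraction must be between 0 and 1")
    engram = sparse_params * engram_fraction
    experts = sparse_params - engram
    return engram, experts


HYPOTHETICAL_SPARSE_BUDGET = 20e9  # assumption for illustration only

for fraction in (0.20, 0.25):
    engram, experts = split_sparse_budget(HYPOTHETICAL_SPARSE_BUDGET, fraction)
    print(f"{fraction:.0%} to memory: {engram / 1e9:.1f}B memory, {experts / 1e9:.1f}B experts")
# 20% to memory: 4.0B memory, 16.0B experts
# 25% to memory: 5.0B memory, 15.0B experts
```

The “U-curve” finding says that moving much further in either direction, toward all experts or toward all memory, gave worse results than this middle range.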

The performance results highlight the potential of this architecture. In head-to-head testing, an Engram-27B model surpassed a standard 27B MoE model by 3.4 to 4 points on knowledge-intensive tasks and showed a large gain in “Needle in a Haystack” long-context accuracy, scoring 97% compared to the MoE’s 84.2%. With DeepSeek describing conditional memory as an “indispensable modeling primitive,” industry observers suggest this technology could be central to the rumored DeepSeek V4, potentially shifting hardware demand from HBM to standard system DRAM.

source: https://github.com/deepseek-ai/Engram