Tutorial emka

What is DeepSeek’s Engram?

Posted on January 14, 2026

DeepSeek has released a technical paper describing “Engram,” a conditional-memory technique that lets AI models query a store of information held in system memory. By committing recurring sequences of data to static memory, Engram improves performance on long-context queries. Because the model can look up stored knowledge instead of reasoning its way back to it, GPUs are freed to focus on more complex tasks. The method also reduces the industry’s heavy reliance on scarce High-Bandwidth Memory (HBM).

The paper details how N-grams (short contiguous sequences of tokens) are integrated into the model’s neural networks as a queryable memory bank. Engram lets a model simply “remember” facts rather than reason them out, a process that is far more computationally expensive. Released on the company’s GitHub page, Engram aims to curb the reliance on GPU memory by committing a knowledge library to more common system memory, including memory attached over interconnect standards such as CXL, so that static memory can be held separately from an LLM’s compute.
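
To make the N-gram idea concrete, here is a minimal Python sketch of a lookup-style memory. It is an illustration only and not DeepSeek's implementation: the `memory` table, its entries, and the `ngrams` and `recall` helpers are hypothetical names chosen for this example, and a real system would store learned embedding vectors rather than text labels.

```python
from typing import Dict, List, Tuple

def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return every contiguous run of n tokens, in order of appearance."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Hypothetical memory bank: N-gram key -> stored value.
# (Assumption: plain strings stand in for learned embedding vectors.)
memory: Dict[Tuple[str, ...], str] = {
    ("alexander", "the", "great"): "entity: Macedonian king",
    ("universal", "studios"): "entity: film company",
}

def recall(tokens: List[str], max_n: int = 3) -> List[Tuple[Tuple[str, ...], str]]:
    """Look up every 2-gram up to max_n-gram of the input in the memory bank."""
    hits = []
    for n in range(2, max_n + 1):
        for gram in ngrams(tokens, n):
            if gram in memory:  # a dictionary lookup, not a recomputation
                hits.append((gram, memory[gram]))
    return hits

print(recall(["visit", "universal", "studios", "today"]))
```

The point of the sketch is that retrieval is a constant-time table lookup per N-gram, which is the kind of work that can be served from ordinary system memory instead of being recomputed on the GPU.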

As detailed in the paper, an Engram-based model scaled to nearly 27 billion parameters can outperform a standard Mixture of Experts (MoE) model in long-context training. Standard MoE models rely on “conditional computation,” which forces the model to reconstruct the same pieces of data every time they are referenced. Engram avoids this computational waste by first asking, “Do I already have this data?” This sidesteps what the paper describes as “expensive runtime reconstruction of a static lookup table,” saving valuable sequential depth for higher-level reasoning.

Engram is distinct from solutions like Nvidia’s KVCache, which offloads context data to NVMe storage. KVCache is a short-term mechanism for remembering recent conversation history, akin to keeping handwritten notes, whereas Engram acts as a persistent record, akin to a whole encyclopedia. Through tokenizer compression and “Multi-Head Hashing,” Engram reduces the effective vocabulary size and speeds up lookups, while context-aware gating helps keep distinct concepts (like “Universal” vs. “Universal Studios”) from being confused with one another.
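
The hashing and gating ideas can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's code: `NUM_HEADS`, `TABLE_SIZE`, the SHA-256 based `slot` function, and the dot-product `gate` are all hypothetical choices made for this example.

```python
import hashlib
import math
from typing import List, Sequence

NUM_HEADS = 4       # assumed number of independent hash heads
TABLE_SIZE = 1024   # assumed number of slots in each head's table

def slot(ngram: Sequence[int], head: int) -> int:
    """Map an N-gram of token ids to a slot in one head's table."""
    # Mixing the head index into the key gives each head a different hash.
    key = f"{head}:" + ",".join(str(t) for t in ngram)
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % TABLE_SIZE

def slots(ngram: Sequence[int]) -> List[int]:
    """One slot per head; two N-grams are confused only if every head collides."""
    return [slot(ngram, head) for head in range(NUM_HEADS)]

def gate(hidden: Sequence[float], retrieved: Sequence[float]) -> float:
    """Context-aware gate in [0, 1]: how much of the retrieved memory to admit.

    Assumption: a scaled dot product between the current hidden state and the
    retrieved vector, squashed by a sigmoid (written with tanh for stability).
    """
    score = sum(h * r for h, r in zip(hidden, retrieved)) / math.sqrt(len(hidden))
    return 0.5 * (1.0 + math.tanh(score / 2.0))

print(slots((101, 202)))                      # same input always gives the same slots
print(gate([1.0, 0.5], [1.0, 0.5]))           # memory agrees with context: gate above 0.5
print(gate([1.0, 0.5], [-1.0, -0.5]))         # memory conflicts with context: gate below 0.5
```

Using several heads means a single hash collision does not corrupt a lookup on its own, and the gate lets the surrounding context suppress a retrieved entry that does not fit, which is the role the article attributes to context-aware gating.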

DeepSeek also explored the optimal balance between memory and compute, discovering a “U-curve” where allocating roughly 20–25% of the sparse parameter budget to Engram yields the best performance. In an experiment dubbed the “Infinite Memory Regime,” they found that performance scales linearly with memory size even when the compute budget is fixed. This implies that future AI improvements may not be solely bound by compute power, but could be achieved by expanding long-term “Engram” memory banks using standard DRAM within data centers.
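
As a rough illustration of the budget split described above, the arithmetic can be sketched in a few lines of Python. The function names, the 20-billion-parameter budget, and the 2 bytes per parameter are assumptions made for this example; only the 20–25% range comes from the article.

```python
from typing import Tuple

def split_sparse_budget(total_sparse_params: int, engram_fraction: float) -> Tuple[int, int]:
    """Split a sparse parameter budget into (engram_params, expert_params)."""
    if not 0.0 <= engram_fraction <= 1.0:
        raise ValueError("engram_fraction must be between 0 and 1")
    engram = round(total_sparse_params * engram_fraction)
    return engram, total_sparse_params - engram

def dram_gib(params: int, bytes_per_param: int = 2) -> float:
    """Memory needed to hold `params` parameters, in GiB.

    Assumption: 2 bytes per parameter (a 16-bit format).
    """
    return params * bytes_per_param / 2**30

BUDGET = 20_000_000_000  # hypothetical sparse budget, not a figure from the paper

for fraction in (0.20, 0.25):
    engram, experts = split_sparse_budget(BUDGET, fraction)
    print(f"{fraction:.0%}: engram={engram:,} experts={experts:,} "
          f"engram table ~{dram_gib(engram):.1f} GiB")
```

Because the Engram share is a lookup table rather than compute, the memory it needs can in principle sit in host DRAM, which is why the split matters for hardware planning.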

The performance results highlight the potential of this architecture. In head-to-head testing, an Engram-27B model surpassed a standard 27B MoE model by 3.4 to 4 points on knowledge-intensive tasks and improved sharply on “Needle in a Haystack” long-context accuracy, scoring 97% compared to the MoE’s 84.2%. With DeepSeek describing conditional memory as an “indispensable modeling primitive,” industry observers suggest this technology could be central to the rumored DeepSeek V4, potentially shifting hardware demand from HBM to standard system DRAM.

source: https://github.com/deepseek-ai/Engram
