What is DeepSeek’s Engram?

Posted on January 14, 2026

DeepSeek has released a new technical paper detailing “Engram,” a conditional-memory technique that gives AI models a queryable database of information held in system memory. By committing sequences of data to static memory, Engram achieves demonstrably higher performance on long-context queries. The approach eases a model’s reliance on runtime reasoning, freeing GPUs for more complex tasks. Crucially, it improves performance while reducing the industry’s heavy dependence on scarce High-Bandwidth Memory (HBM).

The paper details how N-grams (statistical sequences of words) are integrated into the model’s neural network, effectively forming a queryable memory bank. Engram lets models simply “remember” facts rather than reason them out, which is far more computationally expensive. Released on the company’s GitHub page, Engram aims to curb reliance on scarce GPU memory by committing a knowledge library to commodity system memory, attached via standards such as CXL, so that static memory can be held separately from an LLM’s compute.
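
The paper’s actual architecture is more involved, but the core idea can be pictured with a small sketch: hash the most recent N tokens into an index over a large embedding table that could live in ordinary host DRAM (or CXL-attached memory) rather than GPU HBM, and return the stored vector by lookup. Everything below (the class name, the rolling hash, the table sizes) is an illustrative assumption, not DeepSeek’s code.

# Illustrative sketch only -- not DeepSeek's implementation. It maps each
# n-gram of recent tokens to a row of a large embedding table, so a fact is
# fetched by lookup instead of being recomputed on the GPU.
import torch
import torch.nn as nn

class NgramMemory(nn.Module):
    def __init__(self, n: int = 3, table_size: int = 1_000_000, dim: int = 256):
        super().__init__()
        self.n = n
        self.table_size = table_size
        self.table = nn.Embedding(table_size, dim)  # the static "memory bank"

    def ngram_ids(self, token_ids: torch.Tensor) -> torch.Tensor:
        # token_ids: (batch, seq_len). Hash the trailing n-gram at every
        # position with a simple polynomial rolling hash (illustration only).
        b, t = token_ids.shape
        ids = torch.zeros(b, t, dtype=torch.long)
        for i in range(t):
            h = torch.zeros(b, dtype=torch.long)
            for tok in token_ids[:, max(0, i - self.n + 1): i + 1].unbind(dim=1):
                h = (h * 1000003 + tok) % self.table_size
            ids[:, i] = h
        return ids

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        # One memory vector per position, obtained purely by table lookup.
        return self.table(self.ngram_ids(token_ids))

if __name__ == "__main__":
    mem = NgramMemory()
    tokens = torch.randint(0, 32_000, (2, 16))
    print(mem(tokens).shape)  # torch.Size([2, 16, 256])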

As detailed in the paper, an Engram-based model scaled to nearly 27 billion parameters can outperform a standard Mixture of Experts (MoE) model in long-context training. Standard MoE models utilize “conditional computation,” forcing the model to reconstruct data pieces every time they are referenced. Engram eliminates this computational waste by asking, “Do I already have this data?” This avoids what the paper describes as “expensive runtime reconstruction of a static lookup table,” saving valuable sequential depth for higher-level reasoning.
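
As a loose analogy (not the paper’s mechanism), the contrast between reconstructing a value on every reference and asking “do I already have this data?” is the same contrast as between recomputation and ordinary memoization against a static table; the function and table names below are invented for illustration.

import math

# "Conditional computation": the result is rebuilt every time it is referenced.
def attention_scale_compute(x: float) -> float:
    return 1.0 / math.sqrt(x)

# "Conditional memory": consult a static table first, compute only on a miss.
STATIC_TABLE: dict[float, float] = {}

def attention_scale_lookup(x: float) -> float:
    if x in STATIC_TABLE:                # cheap lookup path
        return STATIC_TABLE[x]
    value = attention_scale_compute(x)   # expensive path, taken only once
    STATIC_TABLE[x] = value
    return value

print(attention_scale_lookup(64.0))  # computed, then stored
print(attention_scale_lookup(64.0))  # served straight from the table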

Engram is distinct from solutions like Nvidia’s KVCache, which offloads context data to NVMe storage. While KVCache acts as a short-term aid for remembering recent conversation history, akin to storing handwritten notes, Engram acts as a persistent record of a whole encyclopedia. Through tokenizer compression and “Multi-Head Hashing,” Engram reduces vocabulary size and allows rapid parsing of information, ensuring that distinct concepts (like “Universal” vs. “Universal Studios”) are retrieved without error via context-aware gating.
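
One way such multi-head hashing with context-aware gating might look in outline: the same n-gram key is hashed by several independent hash functions into separate tables, and the retrieved rows are mixed with weights derived from the current hidden state, so colliding phrases can still be told apart. The class, seeds, sizes and gating formula below are assumptions for illustration, not the repository’s API.

import torch
import torch.nn as nn

class MultiHeadHashMemory(nn.Module):
    def __init__(self, heads: int = 4, table_size: int = 250_000, dim: int = 256):
        super().__init__()
        self.table_size = table_size
        # One embedding table per hash head, each indexed by its own hash.
        self.tables = nn.ModuleList([nn.Embedding(table_size, dim) for _ in range(heads)])
        # One distinct multiplier per head (arbitrary constants, illustration only).
        self.seeds = [(1000003 ** (i + 1)) % (2**31 - 1) for i in range(heads)]
        self.gate = nn.Linear(dim, heads)  # context-dependent mixing weights

    def forward(self, ngram_key: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        # ngram_key: (batch,) integer key for the current n-gram
        # context:   (batch, dim) hidden state used to resolve hash collisions
        rows = [table((ngram_key * seed) % self.table_size)   # (batch, dim) per head
                for seed, table in zip(self.seeds, self.tables)]
        stacked = torch.stack(rows, dim=1)                    # (batch, heads, dim)
        weights = torch.softmax(self.gate(context), dim=-1)   # (batch, heads)
        return (weights.unsqueeze(-1) * stacked).sum(dim=1)   # gated mixture

if __name__ == "__main__":
    mem = MultiHeadHashMemory()
    key = torch.randint(0, 1_000_000_000, (2,))
    ctx = torch.randn(2, 256)
    print(mem(key, ctx).shape)  # torch.Size([2, 256])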

DeepSeek also explored the optimal balance between memory and compute, discovering a “U-curve” where allocating roughly 20–25% of the sparse parameter budget to Engram yields the best performance. In an experiment dubbed the “Infinite Memory Regime,” they found that performance scales linearly with memory size even when the compute budget is fixed. This implies that future AI improvements may not be solely bound by compute power, but could be achieved by expanding long-term “Engram” memory banks using standard DRAM within data centers.
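
To make the budget split concrete, a back-of-the-envelope calculation under the reported 20–25% sweet spot might look like the following; the total parameter count is a made-up figure, not a number from the paper.

# Hypothetical split of a fixed sparse-parameter budget between MoE experts
# and an Engram-style memory table, using the upper end of the reported
# 20-25% optimum. The 20B total is an invented example.
total_sparse_params = 20_000_000_000
memory_fraction = 0.25
memory_params = int(total_sparse_params * memory_fraction)   # 5.0B in the table
expert_params = total_sparse_params - memory_params          # 15.0B in experts
print(f"memory table: {memory_params / 1e9:.1f}B parameters")
print(f"MoE experts:  {expert_params / 1e9:.1f}B parameters")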

The performance results highlight the potential of this architecture. In parallel testing, an Engram-27B model surpassed a standard 27B MoE model by 3.4 to 4 points in knowledge-intensive tasks and saw a massive leap in “Needle in a Haystack” long-context accuracy, scoring 97% compared to the MoE’s 84.2%. With DeepSeek viewing conditional memory as an “indispensable modeling primitive,” industry observers suggest this technology could be central to the rumored DeepSeek V4, potentially shifting hardware demand from HBM to standard system DRAM.

source: https://github.com/deepseek-ai/Engram
