Ahead of AI (Sebastian Raschka)

21 latest items

Recent Developments in LLM Architectures: KV Sharing, mHC, and Compressed Attention

From Gemma 4 to DeepSeek V4, How New Open-Weight LLMs Are Reducing Long-Context Costs

My Workflow for Understanding LLM Architectures

A learning-oriented workflow for understanding new open-weight model releases

Components of A Coding Agent

How coding agents use tools, memory, and repo context to make LLMs work better in practice

A Visual Guide to Attention Variants in Modern LLMs

From MHA and GQA to MLA, sparse attention, and hybrid architectures

A Dream of Spring for Open-Weight LLMs: 10 Architectures from Jan-Feb 2026

A Round Up And Comparison of 10 Open-Weight LLM Releases in Spring 2026

Categories of Inference-Time Scaling for Improved LLM Reasoning

And an Overview of Recent Inference-Scaling Papers

The State Of LLMs 2025: Progress, Problems, and Predictions

A 2025 review of large language models, from DeepSeek R1 and RLVR to inference-time scaling, benchmarks, architectures, and predictions for 2026.

LLM Research Papers: The 2025 List (July to December)

In June, I shared a bonus article containing my curated and bookmarked research paper lists with the paid subscribers who make this Substack possible.

From DeepSeek V3 to V3.2: Architecture, Sparse Attention, and RL Updates

Understanding How DeepSeek's Flagship Open-Weight Models Evolved

Beyond Standard LLMs

Linear Attention Hybrids, Text Diffusion, Code World Models, and Small Recursive Transformers

Understanding the 4 Main Approaches to LLM Evaluation (From Scratch)

Multiple-Choice Benchmarks, Verifiers, Leaderboards, and LLM Judges with Code Examples

Understanding and Implementing Qwen3 From Scratch

A Detailed Look at One of the Leading Open-Source LLMs

From GPT-2 to gpt-oss: Analyzing the Architectural Advances

And How They Stack Up Against Qwen3

The Big LLM Architecture Comparison

From DeepSeek-V3 to Kimi K2: A Look At Modern LLM Architecture Design

LLM Research Papers: The 2025 List (January to June)

A topic-organized collection of 200+ LLM research papers from 2025

Understanding and Coding the KV Cache in LLMs from Scratch

KV caches are one of the most critical techniques for efficient inference in LLMs in production.

Coding LLMs from the Ground Up: A Complete Course

Why build LLMs from scratch? It's probably the best and most efficient way to learn how LLMs really work. Plus, many readers have told me they had a lot of fun doing it.

The State of Reinforcement Learning for LLM Reasoning

Understanding GRPO and New Insights from Reasoning Model Papers

First Look at Reasoning From Scratch: Chapter 1

Welcome to the next stage of large language models (LLMs): reasoning. LLMs have transformed how we process and generate text, but their success has been largely driven by statistical pattern recognition. However, new…

The State of LLM Reasoning Model Inference

Inference-Time Compute Scaling Methods to Improve Reasoning Models

Understanding Reasoning LLMs

Methods and Strategies for Building and Refining Reasoning Models
