§ local-llm · storyline

Adds support for IBM Granite Embedding multilingual R2 models

llama.cpp adds support for IBM Granite Embedding multilingual R2 models (97m and 311m parameters), including SwiGLU FFN handling and updated tokenizer configurations.

Jun 2 · 18:57:07 · primary fetch · 1 source · updated Jun 2 · 18:57:07

model : support granite multilingual embeddings R2 (ibm-granite/granite-embedding-{97,311}m-multilingual-r2) (#22716)

Add support for the ibm-granite/granite-embedding-{97m,311m}-multilingual-r2 embedding models:

- Added a version of the gpt4o tokenizer that has a fixed regex (better handling of marks), and different token merging setting for the 97m model
- Reused gemma4 tokenizer for the 311m model

granite-embedding--multilingual-r2 : add support SwiGLU FFN for Granite Embedding Multilingual R2

- added new GGUF key .hidden_activation (LLM_KV_HIDDEN_ACT) + writer
- added a forward declaration of llm_ffn_op_type to llama-hparams.h
- added llm_ffn_op in hparams
- added LLM_FFN_NONE = 0 sentinel to llm_ffn_op_type (value-initialization), modern-bert: explicitly assigns LLM_FFN_GEGLU before reading GGUF (unchanged).
- centralized hidden_act mapping in llama-model.cpp, added llm_ffn_op_type_from_string() helper, mirroring rope_scaling_type/llama_rope_scaling_type_from_string()
- modern-bert reads the GGUF key (when present) and uses the resulting op in its FFN graph
- Added granite-embedding-{97m,311m}-multilingual-r2 to the converter code
- Added the hashes for the granite embedding multilingual R2…
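The string-to-enum lookup described in the commit message can be pictured with a small standalone sketch. It is not the llama.cpp code: the `sketch_` names, the string spellings "geglu" and "swiglu", the enum order beyond the `NONE = 0` sentinel, and the fallback on an unrecognized string are all assumptions made for illustration.

```cpp
#include <map>
#include <string>

// Illustrative stand-in for llm_ffn_op_type. Only the NONE = 0 sentinel is
// stated in the commit message; the other members and their order are assumed.
enum sketch_ffn_op_type {
    SKETCH_FFN_NONE = 0, // sentinel: what a value-initialized field holds
    SKETCH_FFN_GEGLU,
    SKETCH_FFN_SWIGLU,
};

// Hypothetical spellings of the activation names stored under the GGUF key.
static const std::map<sketch_ffn_op_type, const char *> SKETCH_FFN_OP_NAMES = {
    { SKETCH_FFN_GEGLU,  "geglu"  },
    { SKETCH_FFN_SWIGLU, "swiglu" },
};

// Linear scan over the name table; returns the sentinel for unknown names.
static sketch_ffn_op_type sketch_ffn_op_type_from_string(const std::string & name) {
    for (const auto & kv : SKETCH_FFN_OP_NAMES) {
        if (kv.second == name) {
            return kv.first;
        }
    }
    return SKETCH_FFN_NONE;
}

// Mimics the described load path: start from GEGLU, then override only when
// the key is present. Keeping the default on an unrecognized string is an
// assumption of this sketch, not something the commit message states.
static sketch_ffn_op_type sketch_resolve_ffn_op(const std::string * hidden_act) {
    sketch_ffn_op_type op = SKETCH_FFN_GEGLU;
    if (hidden_act != nullptr) {
        const sketch_ffn_op_type parsed = sketch_ffn_op_type_from_string(*hidden_act);
        if (parsed != SKETCH_FFN_NONE) {
            op = parsed;
        }
    }
    return op;
}
```

Under these assumptions, a model file without the key resolves to GEGLU as before, while one carrying "swiglu" selects the SwiGLU path, which is the behavior the commit message attributes to modern-bert.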


§ sources · 1 publication · timeline below

github.com · llama.cpp b9481 · primary · 18:57:07