Transformers: Release v5.9.0
Release v5.9.0 New Model additions Cohere2Moe Command A+ is a Mixture-of-Experts (MoE) language model from Cohere that features a hybrid attention pattern combining sliding window and full attention layers. The model…
Patch release v5.8.1 This release mainly fixes the DeepSeek-V4 integration. [fix] Add fatal_error to ContinuousBatchingManager so the serving... by @qgallouedec, @remi-or Fix WeightConverter regex incorrectly…
Release v5.8.0 New Model additions DeepSeek-V4 DeepSeek-V4 is the next-generation MoE (Mixture of Experts) language model from DeepSeek that introduces several architectural innovations over DeepSeek-V3. The…
Release v5.7.0 New Model additions Laguna Laguna is Poolside's mixture-of-experts language model family that extends standard SwiGLU MoE transformers with two key innovations. It features per-layer head counts allowing…
Patch release v5.6.2 Qwen 3.5 and 3.6 MoE (text-only) were broken when used with FP8. They should now work again with this release. Fix configuration reading and error handling for kernels…
Patch release v5.6.1 Flash attention path was broken! Sorry everyone for this one 🤗 Fix AttributeError on s_aux=None in flash_attention_forward (https://github.com/huggingface/transformers/pull/45589) by @jamesbraza
Release v5.6.0 New Model additions OpenAI Privacy Filter OpenAI Privacy Filter is a bidirectional token-classification model for personally identifiable information (PII) detection and masking in text. It is intended…
Patch release v5.5.4 This release mostly contains fixes that are good to have as soon as possible, mainly for tokenizers: Fix Kimi-K2.5 tokenizer regression and _patch_mistral_regex Attribute… (#45305) by ArthurZucker For training: Fix…
Small patch release to fix `device_map` support for Gemma4! It contains the following commit: [gemma4] Fix device map auto (#45347) by @Cyrilvallez
Small patch dedicated to optimizing gemma4, fixing inference with `use_cache=False` due to k/v states sharing between layers, as well as conversion mappings for some models that would inconsistently serialize their…
Patch release v5.5.1 This patch is very small and focuses on vLLM and Gemma4! Fix export for gemma4 and add Integration tests (#45285) by @Cyrilvallez; Fix vllm cis (#45139) by @ArthurZucker
Release v5.5.0 New Model additions Gemma4 Gemma 4 is a multimodal model with pretrained and instruction-tuned variants, available in 1B, 13B, and 27B parameters. The architecture is mostly the same as the previous…
New Model additions VidEoMT Video Encoder-only Mask Transformer (VidEoMT) is a lightweight encoder-only model for online video segmentation built on a plain Vision Transformer (ViT). It eliminates the need for…
New Model additions EuroBERT EuroBERT is a multilingual encoder model based on a refreshed transformer architecture, akin to Llama but with bidirectional attention. It supports a mixture of European and widely spoken…
New Model additions VoxtralRealtime VoxtralRealtime is a streaming speech-to-text model from Mistral AI, designed for real-time automatic speech recognition (ASR). Unlike the offline Voxtral model which processes…
New Model additions EXAONE-MoE K-EXAONE is a large-scale multilingual language model developed by LG AI Research. Built using a Mixture-of-Experts architecture, K-EXAONE features 236 billion total parameters, with 23…
Transformers v5 release notes Highlights Significant API changes: dynamic weight loading, tokenization Backwards Incompatible Changes Bugfixes and improvements We have a migration guide that will be continuously…
Release candidate v5.0.0rc3 New models: [GLM-4.7] GLM-Lite Support by @zRzRzRzRzRzRzR in https://github.com/huggingface/transformers/pull/43031 [GLM-Image] AR Model Support for GLM-Image by @zRzRzRzRzRzRzR in…
What's Changed Another fix for qwen vl models that prevented correctly loading the associated model type - this works together with https://github.com/huggingface/transformers/pull/41808 of the previous patch release…
What's Changed Should not have said last patch :wink: These should be the last remaining fixes that got lost in between patches and the transition to v5. QwenVL: add skipped keys in setattr as well by @zucchini-nlp in…
What's Changed Last patch release for v4: We have a few small fixes for remote generation methods (e.g. group beam search), vLLM, and an offline tokenizer fix (if it's already been cached). Grouped beam search from…
What's Changed This release candidate is focused on fixing `AutoTokenizer`, expanding the dynamic weight loading support, and improving performance with MoEs! MoEs and performance: batched and grouped experts…