Shipfeed. AI News Channel

22 latest items

▶ ai·16:12

Transformers: Release v5.9.0

Release v5.9.0. New model additions: Cohere2Moe. Command A+ is a Mixture-of-Experts (MoE) language model from Cohere that features a hybrid attention pattern combining sliding window and full attention layers. The model…

transformers — Releases

▶ ai·05:21

Transformers: Patch release v5.8.1

Patch release v5.8.1. This release mainly fixes the DeepSeek-V4 integration. [fix] Add fatal_error to ContinuousBatchingManager so the serving... by @qgallouedec, @remi-or; Fix WeightConverter regex incorrectly…

transformers — Releases

▶ ai·18:52

transformers v5.8.0

Release v5.8.0. New model additions: DeepSeek-V4. DeepSeek-V4 is the next-generation MoE (Mixture of Experts) language model from DeepSeek that introduces several architectural innovations over DeepSeek-V3. The…

transformers — Releases

▶ ai·20:32

transformers v5.7.0

Release v5.7.0. New model additions: Laguna. Laguna is Poolside's mixture-of-experts language model family that extends standard SwiGLU MoE transformers with two key innovations. It features per-layer head counts allowing…

transformers — Releases

▶ ai·20:36

transformers v5.6.2

Patch release v5.6.2. Qwen 3.5 and 3.6 MoE (text-only) were broken when used with FP8. They should now work again with this release 🫡 Fix configuration reading and error handling for kernels…

transformers — Releases

▶ ai·10:20

transformers v5.6.1

Patch release v5.6.1. The flash attention path was broken! Sorry everyone for this one 🤗 It contains the following fix: Fix AttributeError on s_aux=None in flash_attention_forward (https://github.com/huggingface/transformers/pull/45589) by @jamesbraza

transformers — Releases

▶ ai·17:52

transformers v5.6.0

Release v5.6.0. New model additions: OpenAI Privacy Filter. OpenAI Privacy Filter is a bidirectional token-classification model for personally identifiable information (PII) detection and masking in text. It is intended…

transformers — Releases

▶ ai·18:58

transformers v5.5.4

Patch release v5.5.4. This release contains fixes that are good to have as soon as possible, mostly for tokenizers: Fix Kimi-K2.5 tokenizer regression and _patch_mistral_regex Attribute… (#45305) by @ArthurZucker. For training: Fix…

transformers — Releases

▶ ai·17:53

transformers v5.5.3 — Patch release: v5.5.3

Small patch release to fix `device_map` support for Gemma4! It contains the following commit: [gemma4] Fix device map auto (#45347) by @Cyrilvallez

transformers — Releases

▶ ai·16:05

transformers v5.5.2 — Patch release: v5.5.2

Small patch dedicated to optimizing Gemma4 and fixing inference with `use_cache=False`, which was broken because k/v states are shared between layers. It also fixes conversion mappings for some models that would inconsistently serialize their…

transformers — Releases

▶ ai·07:53

transformers v5.5.1

Patch release v5.5.1. This patch is very small and focuses on vLLM and Gemma4! It contains: Fix export for gemma4 and add Integration tests (#45285) by @Cyrilvallez; Fix vllm cis (#45139) by @ArthurZucker

transformers — Releases

▶ ai·18:15

transformers v5.5.0

Release v5.5.0. New model additions: Gemma4. Gemma 4 is a multimodal model with pretrained and instruction-tuned variants, available in 1B, 13B, and 27B parameter sizes. The architecture is mostly the same as the previous…

transformers — Releases

▶ mistral·01:33

transformers v5.4.0 — Release v5.4.0: PaddlePaddle models 🙌, Mistral 4, PI0, VidEoMT, UVDoc, SLANeXt, Jina Embeddings v3

New model additions: VidEoMT. Video Encoder-only Mask Transformer (VidEoMT) is a lightweight encoder-only model for online video segmentation built on a plain Vision Transformer (ViT). It eliminates the need for…

transformers — Releases

▶ ai·18:42

transformers v5.3.0 — v5.3.0: EuroBERT, VibeVoice ASR, TimesFM2.5, PP-DocLayoutV2, OlmoHybrid, ModernVBert, Higgs Audio V2

New model additions: EuroBERT. EuroBERT is a multilingual encoder model based on a refreshed transformer architecture, akin to Llama but with bidirectional attention. It supports a mixture of European and widely spoken…

transformers — Releases

▶ ai·19:55

transformers v5.2.0 — v5.2.0: GLM-5, Qwen3.5, Voxtral Realtime, VibeVoice Acoustic Tokenizer

New model additions: VoxtralRealtime. VoxtralRealtime is a streaming speech-to-text model from Mistral AI, designed for real-time automatic speech recognition (ASR). Unlike the offline Voxtral model, which processes…

transformers — Releases

▶ ai·16:44

transformers v5.1.0 — v5.1.0: EXAONE-MoE, PP-DocLayoutV3, Youtu-LLM, GLM-OCR

New model additions: EXAONE-MoE. K-EXAONE is a large-scale multilingual language model developed by LG AI Research. Built using a Mixture-of-Experts architecture, K-EXAONE features 236 billion total parameters, with 23…

transformers — Releases

▶ ai·11:17

transformers v5.0.0

Transformers v5 release notes. Contents: Highlights; Significant API changes (dynamic weight loading, tokenization); Backwards Incompatible Changes; Bugfixes and improvements. We have a migration guide that will be continuously…

transformers — Releases

▶ ai·11:02

transformers v5.0.0rc3

Release candidate v5.0.0rc3. New models: [GLM-4.7] GLM-Lite Support by @zRzRzRzRzRzRzR in https://github.com/huggingface/transformers/pull/43031; [GLM-Image] AR Model Support for GLM-Image by @zRzRzRzRzRzRzR in…

transformers — Releases

▶ ai·11:40

transformers v4.57.6

What's Changed: Another fix for Qwen VL models, addressing a bug that prevented the associated model type from loading correctly. This works together with https://github.com/huggingface/transformers/pull/41808 of the previous patch release…

transformers — Releases

▶ ai·14:29

transformers v4.57.5

What's Changed: Should not have said last patch 😉 These should be the last remaining fixes that got lost in between patches and the transition to v5. QwenVL: add skipped keys in setattr as well by @zucchini-nlp in…

transformers — Releases

▶ ai·12:07

transformers v4.57.4

What's Changed: Last patch release for v4. We have a few small fixes for remote generation methods (e.g. group beam search), vLLM, and an offline tokenizer fix (if it's already been cached). Grouped beam search from…

transformers — Releases

▶ ai·11:33

transformers v5.0.0rc2

What's Changed: This release candidate is focused on fixing `AutoTokenizer`, expanding the dynamic weight loading support, and improving performance with MoEs! MoEs and performance: batched and grouped experts…

transformers — Releases