shipfeed: AI news, curated daily

§ source

transformers — Releases

github.com · sdk · 22 items

Transformers: Release v5.9.0

Release v5.9.0. New Model additions: Cohere2Moe. Command A+ is a Mixture-of-Experts (MoE) language model from Cohere that features a hybrid attention pattern combining sliding window and full attention layers. The model…

transformers v5.8.0

Release v5.8.0. New Model additions: DeepSeek-V4. DeepSeek-V4 is the next-generation MoE (Mixture of Experts) language model from DeepSeek that introduces several architectural innovations over DeepSeek-V3. The…

transformers v5.7.0

Release v5.7.0. New Model additions: Laguna. Laguna is Poolside's mixture-of-experts language model family that extends standard SwiGLU MoE transformers with two key innovations. It features per-layer head counts allowing…

transformers v5.6.2

Patch release v5.6.2. Qwen 3.5 and 3.6 MoE (text-only) were broken when used with FP8. They should now work again with this patch 🫡. Fix configuration reading and error handling for kernels…

transformers v5.6.1

Patch release v5.6.1. Flash attention path was broken! Sorry everyone for this one 🤗. Fix AttributeError on s_aux=None in flash_attention_forward (https://github.com/huggingface/transformers/pull/45589) by @jamesbraza

transformers v5.6.0

Release v5.6.0. New Model additions: OpenAI Privacy Filter. OpenAI Privacy Filter is a bidirectional token-classification model for personally identifiable information (PII) detection and masking in text. It is intended…

transformers v5.5.4

Patch release v5.5.4. These are mostly fixes that are good to have ASAP, mainly for tokenizers: Fix Kimi-K2.5 tokenizer regression and _patch_mistral_regex Attribute… (#45305) by ArthurZucker. For training: Fix…

transformers v5.5.1

Patch release v5.5.1. This patch is very small and focuses on vLLM and Gemma4! Fix export for gemma4 and add Integration tests (#45285) by @Cyrilvallez; Fix vllm cis (#45139) by @ArthurZucker

transformers v5.5.0

Release v5.5.0. New Model additions: Gemma4. Gemma 4 is a multimodal model with pretrained and instruction-tuned variants, available in 1B, 13B, and 27B parameters. The architecture is mostly the same as the previous…

transformers v5.0.0

Transformers v5 release notes. Highlights; Significant API changes: dynamic weight loading, tokenization; Backwards Incompatible Changes; Bugfixes and improvements. We have a migration guide that will be continuously…

transformers v5.0.0rc3

Release candidate v5.0.0rc3. New models: [GLM-4.7] GLM-Lite Support by @zRzRzRzRzRzRzR in https://github.com/huggingface/transformers/pull/43031; [GLM-Image] AR Model Support for GLM-Image by @zRzRzRzRzRzRzR in…

transformers v4.57.6

What's Changed: Another fix for Qwen VL models, addressing a bug that prevented the associated model type from loading correctly. This works together with https://github.com/huggingface/transformers/pull/41808 of the previous patch release…

transformers v4.57.5

What's Changed: Should not have said last patch 😉. These should be the last remaining fixes that got lost in between patches and the transition to v5. QwenVL: add skipped keys in setattr as well by @zucchini-nlp in…

transformers v4.57.4

What's Changed Last patch release for v4: We have a few small fixes for remote generation methods (e.g. group beam search), vLLM, and an offline tokenizer fix (if it's already been cached). Grouped beam search from…

transformers v5.0.0rc2

What's Changed: This release candidate is focused on fixing `AutoTokenizer`, expanding the dynamic weight loading support, and improving performance with MoEs! MoEs and performance: batched and grouped experts…
