shipfeed · AI news, curated daily

Machine Learning Questions

May 12 · primary fetch · 1 source · updated May 12

r/ML [...] Machine Learning from Scratch - Python Tutorials by Patrick Loeber [...]

Help finding baseline results for small language models on WikiText-2?
7 hr. ago

Hi! I'm pretty new to ML and want to start tinkering with language models :3 I keep reading papers that mention WikiText-2 results, but I'm having trouble finding benchmark numbers for smaller models (like 3-10M params). Most papers seem to focus on the bigger configs! Does anyone know where I can find:

  - Mamba's WikiText-2 performance for small model sizes?
  - Standard transformer baselines at this scale?
  - Any other efficient architectures tested on WikiText-2?

I want to make sure I'm comparing things fairly when I start experimenting.

Thanks for any help! 🥺

[...]

How to split a dataset into 2 to check for generalization over memorization?

I wish to ensure that a neural network generalizes rather than memorizes. In terms of using 1 dataset that is a collection of social media chats, would it be sufficient to…
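The post is cut off before it says which split the author has in mind, so the following is only a minimal sketch of one common precaution, not the poster's method. It assumes each message carries a conversation identifier (the `chat_id` field and the `split_by_group` helper are hypothetical names). Splitting by conversation rather than by individual message keeps near-duplicate context from the same chat out of both halves, so held-out accuracy is less likely to reflect memorization.

```python
import random
from collections import defaultdict


def split_by_group(records, group_key, test_fraction=0.2, seed=0):
    """Split records into train/test so that no group appears in both.

    `group_key` is an assumed field (e.g. a conversation id); the real
    dataset may need a different grouping, such as by user or by thread.
    """
    # Bucket records by their group so whole groups move together.
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_key]].append(rec)

    # Shuffle the group ids (not the records) with a fixed seed.
    group_ids = sorted(groups)
    random.Random(seed).shuffle(group_ids)

    # Reserve a fraction of the groups, at least one, for testing.
    n_test = max(1, round(len(group_ids) * test_fraction))
    test_ids = set(group_ids[:n_test])

    train = [r for g in group_ids if g not in test_ids for r in groups[g]]
    test = [r for g in group_ids if g in test_ids for r in groups[g]]
    return train, test


# Toy data: 10 hypothetical chats with 3 messages each.
chats = [{"chat_id": i // 3, "text": f"message {i}"} for i in range(30)]
train, test = split_by_group(chats, "chat_id")

train_ids = {r["chat_id"] for r in train}
test_ids = {r["chat_id"] for r in test}
print(len(train), len(test), train_ids & test_ids)
```

A large gap between training and held-out performance on such a split is one sign of memorization. If scikit-learn is available, `sklearn.model_selection.GroupShuffleSplit` offers a comparable group-aware split.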

read full article on reddit.com
§ sources · 1 publication · timeline below
  1. reddit.com · Machine Learning Questions · primary