Machine Learning Questions
r/ML

Machine Learning from Scratch - Python Tutorials by Patrick Loeber [...]

Help finding baseline results for small language models on WikiText-2?
7 hr. ago

Hi! I'm pretty new to ML and want to start tinkering with language models :3 I keep reading papers that mention WikiText-2 results, but I'm having trouble finding benchmark numbers for smaller models (like 3-10M params). Most papers seem to focus on the bigger configs!

Does anyone know where I can find:
- Mamba's WikiText-2 performance for small model sizes?
- Standard transformer baselines at this scale?
- Any other efficient architectures tested on WikiText-2?

I want to make sure I'm comparing things fairly when I start experimenting.
Thanks for any help! 🥺

How to split a dataset into 2 to check for generalization over memorization?

I wish to ensure that a neural network generalizes rather than memorizes. Given a single dataset that is a collection of social media chats, would it be sufficient to…
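
The question above is cut off, so what the poster planned to do is unknown. As a rough sketch of one common approach, the split can be made per conversation rather than per message, so that no conversation ends up on both sides. This assumes each chat message carries some conversation identifier; the `thread_id` field, the `split_by_group` helper, and the toy data are all hypothetical names made up for illustration.

```python
import random
from collections import defaultdict


def split_by_group(records, group_key, held_out_fraction=0.5, seed=0):
    """Split records into two lists so that no group appears in both.

    Assumes every record is a dict with a sortable value under group_key.
    """
    # Collect all messages belonging to the same conversation.
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record)

    # Sort before shuffling so the result depends only on the seed.
    group_ids = sorted(groups)
    random.Random(seed).shuffle(group_ids)

    # Hold out whole conversations, at least one.
    n_held_out = max(1, round(len(group_ids) * held_out_fraction))
    held_out_ids = group_ids[:n_held_out]
    train_ids = group_ids[n_held_out:]

    train = [r for g in train_ids for r in groups[g]]
    held_out = [r for g in held_out_ids for r in groups[g]]
    return train, held_out


# Hypothetical toy data: four conversations, two messages each.
chats = [
    {"thread_id": t, "text": f"message {i} in thread {t}"}
    for t in ("a", "b", "c", "d")
    for i in range(2)
]

train, held_out = split_by_group(chats, "thread_id")
```

A split like this only guards against the same conversation leaking into both halves. Near-duplicate messages, or the same users appearing in both halves, can still inflate held-out scores, so a large gap between train and held-out loss suggests memorization, while a small gap is not proof of generalization.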