§ feed · storyline
DSGym: A holistic framework for evaluating and training data science agents
DSGym is a framework for evaluating and training LLM-based data science agents, featuring 90-plus bioinformatics tasks, 92 Kaggle competitions, and a 4B model that its authors report as state-of-the-art among open-source models.
Introducing DSGym, a holistic evaluation and training framework for LLM-based data science agents. Features 90+ bioinformatics tasks, 92 Kaggle competitions, and synthetic trajectory generation.
Our 4B model achieves state-of-the-art performance among open-source models through exe
§ sources · 1 publication · timeline below