§ feed · storyline
Improving language model behavior by training on a curated dataset
Anthropic publishes research showing that fine-tuning on a small, curated dataset can improve language model alignment with specific behavioral values.
Our latest research finds we can improve language model behavior with respect to specific behavioral values by fine-tuning on a small, curated dataset.
§ sources · 1 publication · timeline below