Researchers train AI model that hits near-full performance with just 12.5 percent of its experts
Allen Institute for AI and UC Berkeley release EMO, a mixture-of-experts model that retains near-full performance using only 12.5% of its experts by specializing them in content domains rather than word types.
Researchers at the Allen Institute for AI and UC Berkeley have built EMO, a mixture-of-experts (MoE) model whose experts specialize in content domains instead of word types. That lets you strip out three-quarters of the experts, keeping 25 percent, while losing only about one percentage point of performance, a step that could make MoE models practical for memory-constrained settings for the first time.
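To make the idea of stripping out experts concrete, here is a minimal Python sketch. It is not EMO's actual procedure: it assumes, purely for illustration, that you have logged which experts a router picked for tokens from one content domain, that you keep the most frequently used ones, and that you renormalize the router weights over the survivors. The names `select_experts`, `route`, and `routing_log`, and all numbers, are hypothetical.

```python
from collections import Counter


def select_experts(routing_log, keep_fraction, num_experts):
    """Return sorted ids of the experts used most often in routing_log.

    routing_log holds one list of expert ids per token (hypothetical data).
    """
    keep = max(1, int(num_experts * keep_fraction))
    counts = Counter(e for token_experts in routing_log for e in token_experts)
    # Rank by usage, breaking ties by expert id so the result is deterministic.
    ranked = sorted(range(num_experts), key=lambda e: (-counts[e], e))
    return sorted(ranked[:keep])


def route(scores, kept, top_k):
    """Pick the top_k highest-scoring kept experts and renormalize weights."""
    chosen = sorted(kept, key=lambda e: (-scores[e], e))[:top_k]
    total = sum(scores[e] for e in chosen)
    return {e: scores[e] / total for e in chosen}


NUM_EXPERTS = 8

# Toy log: experts chosen for six tokens from a single content domain.
routing_log = [[0, 3], [3, 5], [0, 3], [3, 0], [5, 3], [1, 0]]

kept_quarter = select_experts(routing_log, 0.25, NUM_EXPERTS)   # 2 of 8 experts
kept_eighth = select_experts(routing_log, 0.125, NUM_EXPERTS)   # 1 of 8 experts

# Toy router scores for one new token, indexed by expert id.
scores = [0.1, 0.05, 0.05, 0.4, 0.1, 0.2, 0.05, 0.05]
weights = route(scores, kept_quarter, top_k=2)

print(kept_quarter, kept_eighth, weights)
```

In this toy setup the memory saving comes from never loading the dropped experts' weights. The sketch only works well if a domain's tokens concentrate on a small set of experts, which is the property the article attributes to domain-specialized experts.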
The article Researchers train AI model that hits near-full performance with just 12.5 percent of its experts appeared first on The Decoder.