
CodeLlama 70B beats GPT-4 on HumanEval

Meta AI releases CodeLlama, an open-source code model available for local use on platforms including Ollama and MLX, with the 70B variant reported to surpass GPT-4 on the HumanEval benchmark.

Jan 30 · 22:10:01 · primary fetch · 1 source · updated Jan 30 · 22:10:01

Meta AI surprised the community with the release of CodeLlama, an open-source model now available for local use on platforms such as Ollama and MLX. The origins of the Miqu model sparked debate, with speculation that it is linked to Mistral Medium or is a fine-tuned Llama-2-70b; AI ethics and alignment risks were also discussed. The Aphrodite engine showed strong performance on A6000 GPUs with specific configurations. Among role-playing models, Mixtral and Flatdolphinmaid struggled with repetitiveness, while Noromaid and Rpcal performed better; ChatML and DPO were recommended for improving responses. Learning resources such as fast.ai's course were highlighted for ML/DL beginners, and fine-tuning with optimizers such as paged 8-bit Lion and Adafactor was discussed.

At Nous Research AI, the Activation Beacon project introduced a method for unlimited context length in LLMs using "global state" tokens, potentially transforming retrieval-augmented models. The Eagle-7B model, based on RWKV-v5, outperformed Mistral in benchmarks while also offering efficiency and multilingual capabilities. OpenHermes2.5 was recommended for consumer hardware because of its quantization methods. Multimodal and domain-specific…


§ sources · 1 publication · timeline below

news.smol.ai · CodeLLama 70B beats GPT4 on HumanEval · primary · 22:10:01