GLM-4.7-Flash: 42x Cheaper Than Claude, Actually Good at Coding!

By In The World of AI

Categories: AI

Summary

GLM-4.7-Flash is a 31B-parameter open-source AI model that outperforms GPT-3 and Chinchilla on coding and reasoning benchmarks, while costing about one-tenth as much to use as the full GLM-4.7 model.

Key Takeaways

GLM-4.7-Flash scores 59.2% on the SWE-bench Verified software engineering benchmark, nearly triple the scores of GPT-3 and Chinchilla.
The model is 1/10th the cost of the full GLM-4.7 model, with input priced at just $0.07 per million tokens.
GLM-4.7-Flash outperforms GPT-3 and Chinchilla on agentic capabilities like tool use and multi-step reasoning, scoring 79.5% on the τ²-Bench benchmark.
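
The pricing takeaway above can be turned into a quick estimate. This is a minimal sketch, assuming the $0.07 per million input tokens figure quoted above and, purely for comparison, a hypothetical full-model input price of ten times that (the "1/10th" claim may not apply to input pricing alone). The function name and the 50-million-token workload are illustrative, not taken from the video.

```python
def input_cost_usd(tokens: int, usd_per_million: float = 0.07) -> float:
    """Return the cost in US dollars of sending `tokens` input tokens.

    The default price is the Flash input price quoted in the takeaways.
    """
    return tokens / 1_000_000 * usd_per_million


# Hypothetical workload: 50 million input tokens.
workload_tokens = 50_000_000

flash_cost = input_cost_usd(workload_tokens)
# Assumption: the full model's input price is 10x the Flash price.
full_cost = input_cost_usd(workload_tokens, usd_per_million=0.70)

print(f"Flash: ${flash_cost:.2f}, full model (assumed): ${full_cost:.2f}")
```

Output-token pricing is not given in the summary, so it is left out of the estimate.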


Transcript Excerpt

There's a new open-source model called GLM-4.7-Flash that just dropped, and it's actually pretty interesting. It's doing really well on coding benchmarks. It's MIT-licensed and you can run it locally. Today I'm going through what it is, the benchmark numbers, and whether it's actually worth paying attention to. So, let's jump in. GLM-4.7-Flash is from Z.ai, which is a Chinese AI company founded by some researchers from Tsinghua University back in 2019. They've been putting out the GLM series of models, and this is their latest release. The model family has three versions: the full GLM-4.7, a quantized FP8 version, and then this Flash variant. Flash is basically optimized for the balance between performance and being able to actually run it without needing a data center. It's 31 billion par…
