The new 1-trillion-parameter Kimi K2 Thinking model runs well on two M3 Ultras in its native format with no loss in quality: the model was quantization-aware trained (QAT) at INT4. Here it generated ~3,500 tokens at 15 tokens/sec using pipeline parallelism in mlx-lm:
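For reference, a minimal single-process sketch of generating with the model through mlx-lm's Python API (the `load`/`generate` functions are mlx-lm's real API; the model repo name here is an assumption, and the actual two-machine run used mlx-lm's distributed pipeline-parallel tooling rather than this single-process path):

```python
# Minimal sketch: generate from a Kimi K2 checkpoint with mlx-lm.
# Assumptions: the Hugging Face repo name below is hypothetical, and this
# runs on a single machine; the post's two-M3-Ultra setup used mlx-lm's
# distributed launcher for pipeline parallelism instead.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Kimi-K2-Thinking")  # hypothetical repo

messages = [{"role": "user", "content": "Write a Space Invaders game in Python."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# verbose=True prints tokens as they stream, along with tokens/sec stats.
text = generate(model, tokenizer, prompt=prompt, max_tokens=4096, verbose=True)
```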
It generated a fully functional Space Invaders game with no trouble, using only a few hundred thinking tokens and about 3,500 tokens overall, which is quite nice.