The speed at which LLMs generate output is becoming a bottleneck in scaling up usage. It is not unusual to wait minutes, or even tens of minutes, for your AI code assistant to spit out some code.
Recently, I played around with Mercury Coder Mini, a Diffusion LLM (as