I built this late last year. All the commentary is generated on the fly, based on the events happening in the game.
The inference is still slow, but voice inference has advanced so much since then. If built today, with @retellai's sub-second latency, this can ship with the