This is a visualization of a finetuned LLaMA model solving an ARC-AGI problem. It uses the code from the ARC Prize 2024 winners, in which the LLM is combined with a depth-first search over the token tree to explore possibilities that ordinary sampling would miss. A lower score means the LLM is more certain; if adding the next possible token would push the score above ~2.2, that route is pruned and no longer explored. Notice how the tree branches at points where the LLM has to make a colour decision.
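
For reference, below is a minimal sketch of how such a score-bounded depth-first search over token continuations might look. It is not the winners' actual code: the `next_token_logprobs` stub, the `EOS_TOKEN` marker, and the reading of the score as the cumulative negative log-probability of the sequence are assumptions made for illustration.

```python
# Hypothetical stub: returns (token, log_prob) pairs for the next-token
# distribution given a prefix. The real system wraps a finetuned LLaMA
# model; this placeholder only illustrates the search structure.
def next_token_logprobs(prefix):
    raise NotImplementedError  # plug the actual model in here

MAX_SCORE = 2.2      # prune any branch whose score exceeds this threshold
EOS_TOKEN = "<eos>"  # assumed end-of-sequence marker

def dfs(prefix, score, results):
    """Depth-first search over token continuations.

    `score` is assumed to be the cumulative negative log-probability of
    `prefix`; lower means the model is more certain of the sequence so far.
    """
    if prefix and prefix[-1] == EOS_TOKEN:
        results.append((prefix, score))
        return
    for token, logprob in next_token_logprobs(prefix):
        new_score = score - logprob      # negative log-prob accumulates
        if new_score > MAX_SCORE:
            continue                     # route no longer valid: prune it
        dfs(prefix + [token], new_score, results)

def search():
    results = []
    dfs([], 0.0, results)
    # most certain (lowest-score) completions first
    return sorted(results, key=lambda r: r[1])
```

When the model is confident, one token dominates the distribution and only a single child stays under the threshold, so the tree stays a single path; at ambiguous steps such as colour choices, several tokens remain cheap enough and the tree branches.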