Top P, also known as nucleus sampling, is a parameter that controls the diversity of generated responses by limiting the set of tokens considered at each step. Specifically, Top P sampling selects from the smallest set of most probable tokens whose cumulative probability meets or exceeds a predefined threshold. It is distinct from top-k sampling, which keeps a fixed number of the most probable tokens regardless of their cumulative probability. Here’s an explanation of different Top P settings:
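
Before the individual settings, the selection rule described above can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not the implementation of any particular library: the function names nucleus_filter and sample_top_p are hypothetical, the token probabilities are passed in as a plain dictionary, and the cutoff is treated as "cumulative probability is at least top_p".

```python
import random


def nucleus_filter(probs, top_p):
    """Keep the smallest set of most probable tokens whose cumulative
    probability is at least top_p, then renormalize so the kept
    probabilities sum to 1.

    probs: dict mapping token -> probability (assumed to sum to about 1).
    """
    # Rank tokens from most to least probable.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)

    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        # Assumption: stop as soon as the threshold is reached.
        if cumulative >= top_p:
            break

    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


def sample_top_p(probs, top_p, rng=random):
    """Draw one token at random from the filtered (nucleus) distribution."""
    nucleus = nucleus_filter(probs, top_p)
    tokens = list(nucleus)
    weights = list(nucleus.values())
    return rng.choices(tokens, weights=weights, k=1)[0]
```

With made-up probabilities such as {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}, a threshold of 0.1 keeps only "a", so sampling always returns the most likely token, while a threshold of 0.9 keeps "a", "b", and "c" and leaves room for more varied output.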

Low Top P (e.g., 0.1 - 0.4)

Effect: Low Top P values