Required
Recommended
Optional
`token_limit` parameter to make this easy. Set it to the context length of your generation model, minus some buffer for the system prompt.
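As a rough illustration of the budgeting described above, the sketch below derives a value for `token_limit` from a model's context length. The helper name `compute_token_limit`, the `reserved_output_tokens` and `safety_margin` arguments, and the example numbers are all assumptions made for illustration; they are not part of any particular API.

```python
def compute_token_limit(
    context_length: int,
    system_prompt_tokens: int,
    reserved_output_tokens: int = 0,
    safety_margin: int = 64,
) -> int:
    """Return how many tokens remain for retrieved context.

    Hypothetical helper: subtracts the system prompt, an optional
    reservation for the generated answer, and a small safety margin
    from the generation model's context length.
    """
    budget = (
        context_length
        - system_prompt_tokens
        - reserved_output_tokens
        - safety_margin
    )
    if budget <= 0:
        raise ValueError("No room left for context; shrink the prompt or reservation.")
    return budget


# Example numbers are illustrative only.
token_limit = compute_token_limit(
    context_length=8192,
    system_prompt_tokens=400,
    reserved_output_tokens=1024,
)
print(token_limit)  # 6704
```

The resulting number is what you would pass as `token_limit`. Reserving room for the model's output is an extra precaution assumed here, on top of the system prompt buffer mentioned above.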
For case 2, set a relevance threshold based on the recall you want to hit: the lower the threshold, the more content is kept and the higher the recall.
Strategy | Recall | Threshold |
---|---|---|
Conservative | 99%+ | ≤ 0.02 |
Balanced | 95%+ | ≤ 0.08 |
Aggressive | 80-90% | 0.15-0.25 |
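To make the table concrete, here is a minimal sketch of threshold-based filtering. It assumes each chunk carries a relevance score between 0 and 1 and that chunks scoring at or above the threshold are kept. The `ScoredChunk` type, the `filter_by_strategy` helper, and the choice of 0.15 for the aggressive strategy (the low end of its range) are hypothetical, chosen only for illustration.

```python
from typing import List, NamedTuple

# Threshold per strategy, taken from the table above.
# "aggressive" uses the low end of the 0.15-0.25 range (an assumption).
THRESHOLDS = {
    "conservative": 0.02,
    "balanced": 0.08,
    "aggressive": 0.15,
}


class ScoredChunk(NamedTuple):
    text: str
    score: float  # assumed relevance score in [0, 1]


def filter_by_strategy(chunks: List[ScoredChunk], strategy: str) -> List[ScoredChunk]:
    """Keep chunks whose score meets the threshold for the chosen strategy."""
    threshold = THRESHOLDS[strategy]
    return [chunk for chunk in chunks if chunk.score >= threshold]


chunks = [
    ScoredChunk("highly relevant passage", 0.90),
    ScoredChunk("somewhat relevant passage", 0.10),
    ScoredChunk("marginal passage", 0.05),
    ScoredChunk("off-topic passage", 0.01),
]

for name in THRESHOLDS:
    kept = filter_by_strategy(chunks, name)
    print(name, len(kept))
```

A lower threshold discards less, which is why the conservative setting trades a larger context for higher recall.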