Flags and Configs
Sometimes you might want to customize the behavior of evaluate() and assert_test(), and this can be done using "configs" (short for configurations) and "flags".
For example, if you're using a custom LLM judge for evaluation, you may wish to set ignore_errors so that evaluations aren't interrupted whenever your model fails to produce valid JSON, or avoid rate limit errors entirely by lowering the max_concurrent value.
Configs for evaluate()
Async Configs
The AsyncConfig controls how concurrently metrics, observed_callback, and test_cases will be evaluated during evaluate().
from deepeval.evaluate import AsyncConfig
from deepeval import evaluate
evaluate(async_config=AsyncConfig(), ...)
There are THREE optional parameters when creating an AsyncConfig:
- [Optional] run_async: a boolean which when set to True, enables concurrent evaluation of test cases AND metrics. Defaulted to True.
- [Optional] throttle_value: an integer that determines how long (in seconds) to throttle the evaluation of each test case. You can increase this value if your evaluation model is running into rate limit errors. Defaulted to 0.
- [Optional] max_concurrent: an integer that determines the maximum number of test cases that can be run in parallel at any point in time. You can decrease this value if your evaluation model is running into rate limit errors. Defaulted to 20.
The throttle_value and max_concurrent parameters are only used when run_async is set to True. A combination of throttle_value and max_concurrent is the best way to handle rate limiting errors, whether in your LLM judge or your LLM application, when running evaluations.
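For instance, a conservative configuration for a rate-limited evaluation model might look like the sketch below (the specific throttle_value and max_concurrent values are illustrative, not recommendations):

```python
from deepeval.evaluate import AsyncConfig
from deepeval import evaluate

# Sketch: fewer parallel test cases plus a small delay between them
# reduces the chance of hitting provider rate limits.
rate_limit_friendly = AsyncConfig(
    run_async=True,    # throttle_value and max_concurrent only apply when True
    throttle_value=2,  # wait 2 seconds before evaluating each test case
    max_concurrent=5,  # run at most 5 test cases in parallel
)

evaluate(async_config=rate_limit_friendly, ...)
```

If you still hit rate limits, lower max_concurrent further before increasing throttle_value, since throttling slows every test case while the concurrency cap only limits bursts.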
Display Configs
The DisplayConfig controls how results and intermediate execution steps are displayed during evaluate().
from deepeval.evaluate import DisplayConfig
from deepeval import evaluate
evaluate(display_config=DisplayConfig(), ...)
There are FIVE optional parameters when creating a DisplayConfig:
- [Optional] verbose_mode: an optional boolean which, when not None, overrides each metric's verbose_mode value. Defaulted to None.
- [Optional] display: a str of either "all", "failing" or "passing", which allows you to selectively decide which type of test cases to display as the final result. Defaulted to "all".
- [Optional] show_indicator: a boolean which when set to True, shows the evaluation progress indicator for each individual metric. Defaulted to True.
- [Optional] print_results: a boolean which when set to True, prints the result of each evaluation. Defaulted to True.
- [Optional] file_output_dir: a string which when set, will write the results of the evaluation to the specified directory. Defaulted to None.
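Putting these together, a sketch of a quieter setup that surfaces only failing test cases and persists results to disk might look like this (the directory path is an illustrative placeholder):

```python
from deepeval.evaluate import DisplayConfig
from deepeval import evaluate

# Sketch: show only failing test cases, suppress per-metric progress
# indicators, and also write results to a directory.
quiet_config = DisplayConfig(
    display="failing",                 # only display failing test cases
    show_indicator=False,              # hide per-metric progress indicators
    print_results=True,                # still print the displayed results
    file_output_dir="./eval_results",  # hypothetical output directory
)

evaluate(display_config=quiet_config, ...)
```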