BenchLLM: a Python-based open-source library for testing Large Language Models (LLMs) and AI-powered applications
Evaluate your LLMs on the fly. Build test suites for your models and generate quality reports. Choose between automated, interactive, or custom evaluation strategies.
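
To make the idea of a test suite, an evaluation strategy, and a quality report concrete, here is a minimal sketch in plain Python. It does not use BenchLLM's own API: every name in it (`Case`, `exact_match`, `contains_match`, `run_suite`, `toy_model`) is hypothetical and chosen only for illustration, and the "model" is a stand-in lookup table rather than a real LLM.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# NOTE: hypothetical names throughout. This is an illustrative sketch,
# not BenchLLM's actual API.


@dataclass
class Case:
    prompt: str
    expected: List[str]  # any one of these answers counts as correct


# An evaluation strategy is a function: (model output, expected answers) -> pass/fail.
Evaluator = Callable[[str, List[str]], bool]


def exact_match(output: str, expected: List[str]) -> bool:
    """Automated strategy: case-insensitive exact string comparison."""
    normalized = output.strip().lower()
    return any(normalized == e.strip().lower() for e in expected)


def contains_match(output: str, expected: List[str]) -> bool:
    """Custom strategy: pass if any expected answer appears in the output."""
    return any(e.lower() in output.lower() for e in expected)


def run_suite(
    model: Callable[[str], str], suite: List[Case], evaluator: Evaluator
) -> Dict[str, float]:
    """Run every case through the model and summarize the results."""
    results = [evaluator(model(case.prompt), case.expected) for case in suite]
    passed = sum(results)
    total = len(results)
    return {
        "total": total,
        "passed": passed,
        "failed": total - passed,
        "pass_rate": passed / total if total else 0.0,
    }


def toy_model(prompt: str) -> str:
    """Stand-in for a real LLM call."""
    answers = {
        "What is 1+1?": "2",
        "Capital of France?": "The capital is Paris.",
    }
    return answers.get(prompt, "")


suite = [
    Case("What is 1+1?", ["2", "two"]),
    Case("Capital of France?", ["Paris"]),
]

strict_report = run_suite(toy_model, suite, exact_match)
lenient_report = run_suite(toy_model, suite, contains_match)
print(strict_report)
print(lenient_report)
```

Because the strategy is just a function argument, the same suite can be scored in different ways. In this sketch, an interactive strategy would be an evaluator that shows the output to a person and returns their verdict.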