Lightweight AI Evaluation
December 2024
For quick and easy evaluation or comparison of AI responses in .NET applications, particularly in tests, we can leverage autoevals' excellent 'LLM-as-a-Judge' prompts with the help of Semantic Kernel.
Sample code
Note that you need to set up Semantic Kernel with chat completion first. It is also recommended to set 'Temperature' to 0 so that the judge's scores are as repeatable as possible.
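A minimal sketch of that setup, assuming the OpenAI connector for Semantic Kernel. The model id and the environment variable name are placeholders and not part of the original sample; any other chat completion connector should work the same way.

```csharp
using System;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

// Assumption: OpenAI is the chat completion provider; swap in another connector if needed.
var kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion(
        modelId: "gpt-4o-mini",                                        // placeholder model id
        apiKey: Environment.GetEnvironmentVariable("OPENAI_API_KEY")!) // placeholder variable name
    .Build();

// Temperature 0 keeps the judge's output as deterministic as the model allows.
var executionSettings = new OpenAIPromptExecutionSettings { Temperature = 0 };
```

The resulting kernel and executionSettings are the two values the evaluation call takes.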
// Input: one entry per evaluator; "output" holds the text to be judged.
var json =
    """
    {
        "humor" : {
            "output" : "this maybe funny"
        }
    }
    """;

// Results are streamed back one evaluator at a time.
await foreach (var result in
    kernel.Run(json, executionSettings: executionSettings))
{
    Console.WriteLine($"[{result.Key}]: result: {result.Value?.Item1}, score: {result.Value?.Item2}");
}
Microsoft.Extensions.AI.Evaluation is in the making, but it currently involves a little too much 'ceremony' for simple use cases.