Skip to main content
Hi there, I am Johnny

Lightweight AI Evaluation

December 2024

For quick and easy evaluation or comparison of AI responses in .NET applications, particularly tests. We can leverage autoevals excellent 'LLM-as-a-Judge' prompts with the help of Semantic Kernel.

Sample code

Note that you need to setup semantic kernel with chat completion first. It is also recommended to set 'Temperature' to 0.

var json = 
    """
    {
        "humor" : {
            "output" : "this maybe funny"
        }
    }
    """;
await foreach (var result in 
        kernel.Run(json, executionSettings: executionSettings))
{
    Console.WriteLine($"[{result.Key}]: result: {result.Value?.Item1}, score: {result.Value?.Item2}");
}

Source

While Microsoft.Extensions.AI.Evaluation is in the making, it currently involves a little too much 'ceremonies' for simple use cases.