Kernel Memory default tokenizer
December 2024
Choosing the right tokenizer in AI applications is crucial because it directly impacts accuracy and efficiency. Kernel Memory exposes an optional 'tokenizer' parameter when you configure both text generation and embedding.
If no 'tokenizer' is specified, [Kernel Memory](https://github.com/microsoft/kernel-memory) attempts to pick a default one. It does this based on the AI model name (the deployment name in Azure OpenAI), as its TokenizerFactory shows.
My takeaways:

- Always prefix your deployment name with the actual model name. For example, for the 'gpt-4o' model, name the deployment something like 'gpt-4o-bot'.
- Leave the 'tokenizer' parameter at its default and let Kernel Memory pick one automatically.
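To make the prefix rule concrete, here is a minimal Python sketch of how a name-based tokenizer factory can work. This is not Kernel Memory's actual code or API; the function name, mapping table, and encoding names are illustrative assumptions (the encodings shown follow OpenAI's tiktoken naming).

```python
# Hypothetical sketch of a TokenizerFactory that maps a deployment name
# to a tokenizer encoding by matching a model-name prefix.
# Names and mappings here are illustrative, not Kernel Memory's actual API.

MODEL_PREFIX_TO_ENCODING = {
    "gpt-4o": "o200k_base",
    "gpt-4": "cl100k_base",
    "gpt-3.5-turbo": "cl100k_base",
    "text-embedding-3": "cl100k_base",
}

DEFAULT_ENCODING = "cl100k_base"  # assumed fallback when nothing matches


def pick_encoding(deployment_name: str) -> str:
    """Pick a tokenizer encoding from the deployment name's model prefix."""
    name = deployment_name.lower()
    # Check longer prefixes first so 'gpt-4o-bot' matches 'gpt-4o', not 'gpt-4'.
    for prefix in sorted(MODEL_PREFIX_TO_ENCODING, key=len, reverse=True):
        if name.startswith(prefix):
            return MODEL_PREFIX_TO_ENCODING[prefix]
    return DEFAULT_ENCODING


print(pick_encoding("gpt-4o-bot"))      # deployment prefixed with the model name
print(pick_encoding("my-custom-name"))  # no prefix match: falls back to default
```

This is why the deployment-name prefix matters: a deployment called 'gpt-4o-bot' resolves to the right tokenizer, while an arbitrary name like 'my-custom-name' silently falls back to a default that may not match the model.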