1. Introduction

Semantic Kernel (SK) gives .NET developers an orchestration layer for building intelligent agents—prompt templates, plugins, planning, memory and more. Microsoft.Extensions.AI (ME.AI) sits one level lower, supplying lightweight contracts and a middleware pipeline for talking to any large-language-model back-end. By standing on ME.AI, SK inherits the same design principles that turned ASP.NET Core, Entity Framework Core and the wider Microsoft.Extensions stack into reliable foundations: dependency injection as a first-class citizen, composable middleware and testability out of the box.

2. Layered mental model

If you imagine building a web API, SK is equivalent to the MVC layer (controllers, views, filters) whereas ME.AI resembles the underlying HTTP abstractions such as HttpClientFactory. The following ASCII sketch places each piece in context.


        ┌───────────────────────────────────────────┐
        │          Your Chatbot / Copilot           │
        └──────────────▲────────────────────────────┘
                       │ Plans, plugins, vector-memory
        ┌──────────────┴────────────────────────────┐
        │            Semantic Kernel (SK)           │
        │  (prompts, planning, agents, memories)    │
        └──────────────▲────────────────────────────┘
                       │ IChatClient, IEmbeddingGenerator
        ┌──────────────┴────────────────────────────┐
        │         Microsoft.Extensions.AI           │
        │ (abstractions, DI, middleware, logging)   │
        └──────────────▲────────────────────────────┘
                       │ HTTP / gRPC / WebSockets
        ┌──────────────┴────────────────────────────┐
        │        OpenAI, Azure AI, Ollama…          │
        └───────────────────────────────────────────┘
        

3. Dependency injection fundamentals

All ME.AI clients register with the same IServiceCollection you already use in ASP.NET Core. This means every SK feature that depends on a model—chat, embeddings or function calling—can resolve its collaborators through standard DI without any bespoke static factories.


        // Program.cs or Startup
        var kernel = Kernel.CreateBuilder()
            .AddOpenAIChatClient("gpt-4o", Environment.GetEnvironmentVariable("OPENAI_KEY"))
            .AddAzureOpenAIEmbeddingGenerator("text-embed", "https://my-ai.azure.com/", azureKey)
            .Build();

        // Anywhere later in the app
        var chat = kernel.Services.GetRequiredService<IChatClient>();
        Console.WriteLine(await chat.GetResponseAsync([new(ChatRole.User, "Hello, world!")]));
        

Notice there is no SK-specific chat interface—SK leans directly on ME.AI.

4. Middleware and cross-cutting concerns

ME.AI borrows ASP.NET Core’s .UseXxx() convention, so you can slot in functionality such as caching, rate-limiting or telemetry with a single line and without touching business logic.


        var client = new ChatClientBuilder(new OpenAIChatClient("gpt-4o", key))
            .UseDistributedCache(redisCache)              // transparent response caching
            .Use(inner => new RetryingChatClient(inner))  // resilience via a custom decorator you supply
            .UseOpenTelemetry(sourceName: "ai-demo")      // traces, metrics, logs
            .Build();
        

Because SK consumes that IChatClient, every plugin and planner automatically benefits from the middleware you configure.
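
For instance, you can hand the decorated client straight to SK by registering it with the kernel's service collection. The following is a sketch that mirrors the resolution pattern from section 3:


        // A sketch: register the decorated client so SK resolves it everywhere.
        var builder = Kernel.CreateBuilder();
        builder.Services.AddSingleton<IChatClient>(client); // the pipeline built above
        var kernel = builder.Build();

        // Plugins, planners and your own code now share the caching,
        // resilience and telemetry configured on that one pipeline.
        var chat = kernel.Services.GetRequiredService<IChatClient>();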

5. Observability: traces, metrics and logs

OpenTelemetry support in ME.AI emits structured spans for each model call. You get duration, tokens in/out, temperature and any tool invocations—all correlated with the rest of your application traces. Hook it up to Application Insights, Jaeger or Zipkin and you can follow a request from an HTTP endpoint, through SK planning, down to the individual LLM round trips.
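
To see those spans you still need an OpenTelemetry listener in the host process. A minimal sketch, assuming the OpenTelemetry SDK and console-exporter NuGet packages plus the "ai-demo" source name configured in section 4:


        // A minimal sketch: subscribe to the "ai-demo" activity source.
        using OpenTelemetry;
        using OpenTelemetry.Trace;

        using var tracerProvider = Sdk.CreateTracerProviderBuilder()
            .AddSource("ai-demo")     // must match UseOpenTelemetry(sourceName: "ai-demo")
            .AddConsoleExporter()     // swap for OTLP, Jaeger or Application Insights exporters
            .Build();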

6. Caching and performance

LLM calls are often the slowest and most expensive hop. ME.AI’s built-in distributed-cache middleware lets you memoise model responses in Redis, SQL Server or an in-memory cache, keyed by the prompt. When SK asks the same question twice, the second answer can be served in microseconds instead of seconds.
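
As a sketch, wiring Redis as the backing store could look like the following; the package is Microsoft.Extensions.Caching.StackExchangeRedis, and the connection string is illustrative:


        // A sketch: Redis-backed response caching for every chat call.
        services.AddStackExchangeRedisCache(o => o.Configuration = "localhost:6379");
        services.AddChatClient(new OpenAIChatClient("gpt-4o", key))
            .UseDistributedCache(); // no argument: resolves IDistributedCache from DI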

7. Multiple providers—a practical example

It is common to combine a hosted model such as OpenAI for production with a local model like Ollama for offline development. Service IDs make that trivial.


        services.AddKernel(); // makes Kernel itself resolvable from the container
        services.AddOpenAIChatClient("gpt-4o", openAiKey, serviceId: "Prod");
        services.AddOllamaChatClient("llama3", new Uri("http://localhost:11434"), serviceId: "Dev");

        var kernel = services.BuildServiceProvider().GetRequiredService<Kernel>();

        // Use OpenAI in production paths
        Console.WriteLine(await kernel.InvokePromptAsync(
            "Say hi", new(new PromptExecutionSettings { ServiceId = "Prod" })));

        // Switch to the local model for a quick dev inner loop
        Console.WriteLine(await kernel.InvokePromptAsync(
            "Say hi", new(new PromptExecutionSettings { ServiceId = "Dev" })));
        

8. Testing strategies

Because clients are plain interfaces, you can inject a stub or mock when running unit tests. This keeps test runs fast, deterministic and offline.


        // xUnit test
        services.AddKernel();
        services.AddSingleton<IChatClient>(new FakeChatClient("🚀")); // returns a canned response
        var kernel = services.BuildServiceProvider().GetRequiredService<Kernel>();

        var result = await kernel.InvokePromptAsync("ignored");
        Assert.Equal("🚀", result.ToString());
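
FakeChatClient is not a library type; you write it yourself against ME.AI's IChatClient. A minimal sketch follows (member signatures have shifted across preview releases, so match your installed version):


        // A hand-rolled stub that always answers with one canned reply.
        sealed class FakeChatClient(string reply) : IChatClient
        {
            public Task<ChatResponse> GetResponseAsync(
                IEnumerable<ChatMessage> messages,
                ChatOptions? options = null,
                CancellationToken cancellationToken = default) =>
                Task.FromResult(new ChatResponse(new ChatMessage(ChatRole.Assistant, reply)));

            public IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
                IEnumerable<ChatMessage> messages,
                ChatOptions? options = null,
                CancellationToken cancellationToken = default) =>
                throw new NotSupportedException("Streaming is not needed in these tests.");

            public object? GetService(Type serviceType, object? serviceKey = null) => null;

            public void Dispose() { }
        }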
        

9. Security and governance

Enterprises often require content filtering, data-loss-prevention checks and policy enforcement before anything leaves the network. Implement that logic once in an IChatClient decorator or a middleware component. All downstream SK features inherit the same guard rails.
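
As one sketch of that pattern, the decorator below builds on ME.AI's DelegatingChatClient base class; the blocked-term check is a stand-in for a real DLP or content-safety call:


        // A sketch: veto outbound prompts before they reach any provider.
        sealed class GuardRailChatClient(IChatClient inner) : DelegatingChatClient(inner)
        {
            public override Task<ChatResponse> GetResponseAsync(
                IEnumerable<ChatMessage> messages,
                ChatOptions? options = null,
                CancellationToken cancellationToken = default)
            {
                // Stand-in policy check; call your DLP or content-safety service here.
                if (messages.Any(m => m.Text.Contains("confidential", StringComparison.OrdinalIgnoreCase)))
                    throw new InvalidOperationException("Prompt blocked by outbound content policy.");

                return base.GetResponseAsync(messages, options, cancellationToken);
            }
        }

        // Slot it in like any other middleware:
        // new ChatClientBuilder(providerClient).Use(inner => new GuardRailChatClient(inner)).Build();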

10. Migration at a glance


  1. Swap any custom HttpClient or model SDK calls for ME.AI client registrations (a sketch follows this list).
  2. Remove obsolete SK-specific interfaces. Let SK resolve IChatClient and IEmbeddingGenerator through DI.
  3. Add optional middleware: caching, OpenTelemetry and retry policies.
  4. Write unit tests against the new abstractions using stubs or mocks.
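
For step 1, the swap might look like this; the before lines show a typical hand-rolled call, and the after line reuses the OpenAIChatClient type from section 4:


        // Before: each call site news up its own HttpClient or SDK wrapper.
        // var http = new HttpClient();
        // var raw = await http.PostAsync("https://api.openai.com/v1/chat/completions", payload);

        // After: one registration; call sites simply ask DI for IChatClient.
        services.AddChatClient(new OpenAIChatClient("gpt-4o", openAiKey));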

11. Final thoughts

Microsoft.Extensions.AI acts as the standard plumbing layer for generative AI in .NET. Semantic Kernel builds orchestration, planning and agent logic on top of that foundation. By embracing both, you gain clean dependency injection, composable middleware, first-class observability and lighter tests—while keeping all the high-level features that made SK popular in the first place.
