Introduction: LLM API calls are expensive and slow. Caching is your first line of defense against runaway costs and latency. But caching LLM responses isn’t straightforward—the same question phrased differently should return the same cached answer. This guide covers caching strategies for LLM applications: exact match caching for deterministic queries, semantic caching using embeddings for […]
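As a rough illustration of how the two strategies the teaser names can combine, here is a minimal sketch; the `LLMCache` class, the pluggable `embed_fn` hook, and the 0.92 similarity threshold are all illustrative assumptions, not code from the article:

```python
import hashlib
import numpy as np

class LLMCache:
    """Exact-match cache keyed on a hash of the prompt, with a semantic
    fallback that compares prompt embeddings by cosine similarity.
    `embed_fn` is any callable that returns an embedding vector."""

    def __init__(self, embed_fn, threshold=0.92):
        self.embed_fn = embed_fn
        self.threshold = threshold   # similarity cutoff for a semantic "hit"
        self.exact = {}              # prompt hash -> cached response
        self.entries = []            # (unit-norm embedding, response) pairs

    def _key(self, prompt):
        return hashlib.sha256(prompt.encode()).hexdigest()

    def get(self, prompt):
        # 1. Cheap exact-match lookup first.
        hit = self.exact.get(self._key(prompt))
        if hit is not None:
            return hit
        # 2. Fall back to a linear semantic scan over cached embeddings.
        if not self.entries:
            return None
        q = np.asarray(self.embed_fn(prompt), dtype=float)
        q = q / np.linalg.norm(q)
        for emb, response in self.entries:
            if float(np.dot(q, emb)) >= self.threshold:
                return response
        return None

    def put(self, prompt, response):
        self.exact[self._key(prompt)] = response
        emb = np.asarray(self.embed_fn(prompt), dtype=float)
        self.entries.append((emb / np.linalg.norm(emb), response))
```

A production version would swap the linear scan for a vector index, but the two-tier lookup (hash first, embeddings second) is the core idea.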
Category: Technology Engineering
LLM Memory and Context Management: Building Conversational AI That Remembers
Introduction: LLMs have no inherent memory—each API call is stateless. The model doesn’t remember your previous conversation, your user’s preferences, or the context you established five messages ago. Memory is something you build on top. This guide covers implementing different memory strategies for LLM applications: buffer memory for recent context, summary memory for long conversations, […]
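A minimal sketch of the two memory strategies the teaser mentions, assuming chat-style message dicts and a caller-supplied `summarize_fn` (e.g. an LLM call that condenses text); both class names are illustrative:

```python
from collections import deque

class BufferMemory:
    """Buffer memory: keep the last `max_turns` messages verbatim;
    anything older simply falls off the end of the deque."""
    def __init__(self, max_turns=10):
        self.turns = deque(maxlen=max_turns)

    def add(self, role, content):
        self.turns.append({"role": role, "content": content})

    def as_messages(self):
        return list(self.turns)

class SummaryMemory:
    """Summary memory: before a message is evicted from the buffer,
    fold it into a running summary so long conversations keep
    their gist without keeping every token."""
    def __init__(self, summarize_fn, max_turns=10):
        self.summarize_fn = summarize_fn
        self.buffer = deque(maxlen=max_turns)
        self.summary = ""

    def add(self, role, content):
        if len(self.buffer) == self.buffer.maxlen:
            evicted = self.buffer[0]  # about to be dropped by append()
            self.summary = self.summarize_fn(
                f"{self.summary}\n{evicted['role']}: {evicted['content']}")
        self.buffer.append({"role": role, "content": content})

    def as_messages(self):
        msgs = []
        if self.summary:
            msgs.append({"role": "system",
                         "content": f"Conversation so far: {self.summary}"})
        return msgs + list(self.buffer)
```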
.NET 8 and C# 12: A Deep Dive into Native AOT, Primary Constructors, and Blazor United
Introduction: .NET 8 represents a landmark release in Microsoft’s development platform evolution, bringing Native AOT to mainstream scenarios, unifying Blazor’s rendering models, and introducing C# 12’s powerful new features. Released in November 2023, this Long-Term Support version delivers significant performance improvements, reduced memory footprint, and enhanced developer productivity. After migrating several enterprise applications to .NET […]
LLM Application Logging and Tracing: Building Observable AI Systems
Introduction: Production LLM applications require comprehensive logging and tracing to debug issues, monitor performance, and understand user interactions. Unlike traditional applications, LLM systems have unique logging needs: capturing prompts and responses, tracking token usage, measuring latency across chains, and correlating requests through multi-step workflows. This guide covers practical logging patterns: structured request/response logging, distributed tracing […]
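A hedged sketch of structured request/response logging as one JSON line per call; the `logged_completion` wrapper, the `(response, usage)` return shape, and the field names are assumptions for illustration, not the article's API:

```python
import json
import logging
import time
import uuid

logger = logging.getLogger("llm")
logging.basicConfig(level=logging.INFO, format="%(message)s")

def logged_completion(call_fn, prompt, trace_id=None, **params):
    """Wrap any LLM call (`call_fn`) with structured logging: emit one
    JSON record per call carrying a trace id, latency, and token usage
    so requests can be correlated across multi-step chains."""
    trace_id = trace_id or str(uuid.uuid4())  # propagate across chain steps
    start = time.perf_counter()
    record = {"trace_id": trace_id, "prompt_chars": len(prompt), **params}
    try:
        # Assumed contract: call_fn returns (text, usage) where usage
        # is a dict of token counts, e.g. {"prompt_tokens": ..., ...}.
        response, usage = call_fn(prompt, **params)
        record.update(status="ok", **usage)
        return response
    except Exception as exc:
        record.update(status="error", error=str(exc))
        raise
    finally:
        record["latency_ms"] = round((time.perf_counter() - start) * 1000, 1)
        logger.info(json.dumps(record))
```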
AWS re:Invent 2023: Amazon Bedrock and Q Transform Enterprise AI with Foundation Models and Intelligent Assistants
Introduction: AWS re:Invent 2023 delivered transformative announcements for enterprise AI adoption, with Amazon Bedrock reaching general availability and Amazon Q emerging as AWS’s answer to AI-powered enterprise assistance. These services represent AWS’s strategic vision for making generative AI accessible, secure, and enterprise-ready. After integrating Bedrock into production workloads, I’ve found its model-agnostic approach and native […]
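For reference, a minimal Bedrock invocation via boto3's `bedrock-runtime` client; the model ID, region, and Claude text-completion body schema below are examples, not code from the article:

```python
import json
import boto3

# Bedrock exposes foundation models behind one runtime API; the request
# body schema is model-specific (this follows Anthropic's Claude
# text-completion format that shipped at Bedrock GA).
client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.invoke_model(
    modelId="anthropic.claude-v2",          # example model ID
    contentType="application/json",
    accept="application/json",
    body=json.dumps({
        "prompt": "\n\nHuman: Summarize re:Invent 2023 in one line.\n\nAssistant:",
        "max_tokens_to_sample": 200,
    }),
)
print(json.loads(response["body"].read())["completion"])
```

Swapping `modelId` is all it takes to target a different provider, which is the model-agnostic approach the teaser refers to.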
Guardrails and Safety for LLMs: Building Secure AI Applications with Input Validation and Output Filtering
Introduction: Production LLM applications need guardrails to ensure safe, appropriate outputs. Without proper safeguards, models can generate harmful content, leak sensitive information, or produce responses that violate business policies. Guardrails provide defense-in-depth: input validation catches problematic requests before they reach the model, output filtering ensures responses meet safety standards, and content moderation prevents harmful generations. […]
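A toy sketch of the defense-in-depth pipeline the teaser describes; the pattern list and the `validate_input` / `filter_output` / `guarded_completion` names are illustrative assumptions only:

```python
import re

BLOCKED_PATTERNS = [  # illustrative rules, nowhere near exhaustive
    re.compile(r"ignore (all|previous) instructions", re.I),  # prompt injection
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),                     # SSN-shaped PII
]

def validate_input(prompt: str) -> str:
    """Input guardrail: reject requests matching known-bad patterns
    before they ever reach the model."""
    for pattern in BLOCKED_PATTERNS:
        if pattern.search(prompt):
            raise ValueError("request rejected by input guardrail")
    return prompt

def filter_output(text: str) -> str:
    """Output guardrail: redact anything PII-shaped in the response
    rather than returning it verbatim."""
    return BLOCKED_PATTERNS[1].sub("[REDACTED]", text)

def guarded_completion(llm_fn, prompt):
    # Defense in depth: validate on the way in, filter on the way out.
    return filter_output(llm_fn(validate_input(prompt)))
```

Real deployments layer a moderation model on top of regex rules, but the in/out sandwich around the model call is the essential structure.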