
Kishore K Sharma


/tag

#caching



Views are my own and do not represent any current or past employer. All work shown was completed under appropriate confidentiality and IP terms.

© 2026 Kishore · Built with restraint.

3 pieces

Jun 8, 2026 · 9 min read
Context Engineering: Managing the Window Like a Cache, Not a Prompt
Prompt engineering was about wording one message. Context engineering is about managing the entire context window as a scarce budget — what goes in, in what order, and what gets evicted. For a backend engineer, it's working-set management applied to an LLM.
- #context-engineering
- #prompt-engineering
- #llm
- #rag
- #agents
- #caching
- #ai-engineering
- #typescript
Jun 4, 2026 · 10 min read
Semantic Caching for LLMs: Cache on Meaning, Not on Strings
A normal cache keyed on the exact request string is almost useless for LLM calls, because every paraphrase is a miss. Semantic caching keys on meaning instead — embed the query, search for a near-identical past question, and return its answer with no model call. Here's the architecture, the threshold problem that makes or breaks it, and real pgvector code.
- #llm
- #caching
- #pgvector
- #redis
- #embeddings
- #cost-optimization
- #typescript
- #backend
May 11, 2026 · 3 min read
From N+1 to O(1): Optimizing Complex Billing Schedules
How eliminating redundant API calls and adding caching turned a sluggish billing-generation process into a fast one.
- #backend
- #performance
- #database
- #caching