Prompt engineering was about wording one message. Context engineering is about managing the entire context window as a scarce budget — what goes in, in what order, and what gets evicted. For a backend engineer, it's working-set management applied to an LLM.
Plain vector RAG can't answer multi-hop or 'across everything' questions, because the answer is spread across several chunks and no single chunk contains it. GraphRAG extracts a knowledge graph instead. Here's how it works, the honest cost, and how to start in Postgres without a graph database.
Retrieval-Augmented Generation explained in plain English. The librarian analogy, the five steps, a working 50-line Python example you can run, and the mistakes every beginner makes the first time.
Retrieval-augmented generation has been wrapped in enough mystique to obscure that it's mostly an ETL problem. What the pipeline actually looks like, where the real engineering happens, and the failure modes that have nothing to do with the model.