#Langchain — blogs.social

Sahil Kapoor's Playbook @sahil.sahilkapoor.com.ap.brid.gy

May 17

Tokenization is the first step in any LLM pipeline: converting raw text into a sequence of integer IDs that the model actually processes. Understanding tokenization helps you reason about context window limits, API costs, and why LLMs sometimes struggle with tasks that seem simple.

How Tokens Work

Tokens are typically subword units, not quite characters, not quite words. Common English words…

Read more →

Sahil Kapoor's Playbook @sahil.sahilkapoor.com.ap.brid.gy

May 17

System Prompt

In the chat completions API format used by OpenAI, Anthropic, Google, and others, messages come in three roles: system, user, and assistant. The system message is the system prompt, it's processed first, with higher conceptual priority than user turns, and persists across the entire conversation.

What Goes in a System Prompt

Persona, "You are a senior Go developer with expertise in…

Read more →

Sahil Kapoor's Playbook @sahil.sahilkapoor.com.ap.brid.gy

May 17

Prompt engineering is the discipline of communicating effectively with large language models. Because LLMs are trained to predict plausible continuations of text, how you frame a request has an enormous impact on what you get back, the same underlying model can behave like an expert assistant or produce generic noise depending on prompt quality.

Foundational Techniques

Zero-Shot…

Read more →

Sahil Kapoor's Playbook @sahil.sahilkapoor.com.ap.brid.gy

May 17

OpenRouter

A unified API gateway for large language models that lets you call 100+ LLMs from different providers through a single OpenAI-compatible endpoint with automatic fallback and cost routing.

Read more →

Sahil Kapoor's Playbook @sahil.sahilkapoor.com.ap.brid.gy

May 17

OpenHands (formerly OpenDevin) is an open-source platform for AI software engineering agents. Unlike Cursor or Windsurf which are IDEs with AI assistance, OpenHands is a platform where AI agents operate autonomously, writing code, executing shell commands, browsing the web, and iterating until a task is complete.

How OpenHands Works

OpenHands runs agents inside isolated Docker containers. The…

Read more →

Sahil Kapoor's Playbook @sahil.sahilkapoor.com.ap.brid.gy

May 17

Ollama makes running open-source LLMs as straightforward as running a Docker container. You pull a model, and it starts serving a local REST API that your code can call, no cloud, no API key, no per-token billing.

How It Works

Ollama bundles model weights, a Go-based runtime, and a simple model definition format (Modelfiles) into a single binary. When you run ollama run llama3.2, it downloads…

Read more →

Sahil Kapoor's Playbook @sahil.sahilkapoor.com.ap.brid.gy

May 17

MCP (Model Context Protocol)

Model Context Protocol (MCP) is an open standard introduced by Anthropic in late 2024 that defines a universal way for large language models to communicate with external systems, files, databases, APIs, and developer tools, without requiring custom integrations for every combination.

The Problem MCP Solves

Before MCP, every AI application had to build its own integrations: Cursor connected to…

Read more →