LangGraph tutorial: your first multi-step agent for SaaS support automation

May 16, 2026 · 7 min · AI, LangGraph, Agents, Tutorial, Python

Short answer (60 seconds): LangGraph lets you build agents with persistent state and conditional flows in Python. You model the agent as a graph: nodes are functions (typically calling an LLM) and edges are transitions, which can be conditional. In this tutorial you build a support agent that classifies an incoming ticket, decides whether to auto-respond or escalate, drafts the response with Claude Sonnet, and supports human-in-the-loop approval for sensitive tickets. Expect about 3 hours to implement and USD 30-60/month in API costs at 5K tickets/month (hosting is extra).

LangGraph was one of the fastest-growing frameworks in 2025-2026 for building AI agents in Python. The reason is practical: it combines the good parts of LangChain (integrations) with a more predictable execution model (explicit graphs instead of implicit chains).

This tutorial builds something real: a SaaS support agent that handles first contact with customers. The structure translates to almost any multi-step use case (document processing, report generation, automated onboarding).

What you'll have at the end

A Python process that:

Receives a support ticket via webhook.
Classifies it with Claude Haiku (category + urgency).
Decides whether to auto-respond or escalate to a human.
If auto-responding, drafts with Claude Sonnet using internal docs as context.
If escalating, opens a task with summary in Notion/Linear.
Persists each step so the flow survives crashes.

Stack: Python 3.11+, LangGraph, Anthropic SDK, Postgres for checkpointing.

Setup

~
# create venv and install
python -m venv .venv
source .venv/bin/activate
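# langgraph-checkpoint-postgres provides PostgresSaver; fastapi, uvicorn and structlog are used in later steps
pip install langgraph langchain-anthropic langgraph-checkpoint-postgres "psycopg[binary]" python-dotenv fastapi uvicorn structlog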

# environment variables
cat > .env <<'EOF'
ANTHROPIC_API_KEY=sk-ant-...
DATABASE_URL=postgresql://localhost/agent_dev
EOF

Step 1 · Define the State

The State is what the graph accumulates as it runs. In LangGraph it's defined as a TypedDict:

~
# agent/state.py
from typing import TypedDict, Annotated, Sequence
from langchain_core.messages import BaseMessage
import operator

class TicketCategory(TypedDict):
  category: str  # "billing" | "technical" | "general" | "urgent"
  confidence: float
  reasoning: str

class SupportAgentState(TypedDict):
  # Input
  ticket_id: str
  ticket_text: str
  customer_email: str

  # Computed by the agent
  classification: TicketCategory | None
  action: str | None  # "auto_respond" | "escalate_human" | "ask_clarification"
  response_draft: str | None
  response_sent: bool

  # History
  messages: Annotated[Sequence[BaseMessage], operator.add]

  # Safety
  iterations: int

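Why TypedDict and not Pydantic: nodes read the state as a dict and return plain dict updates, so a TypedDict is the lightest fit. LangGraph can also take a Pydantic model as the state schema, but the extra validation adds overhead this flow doesn't need. If you want stronger validation, use Pydantic in the handlers that receive the initial ticket, not in the internal state.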
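
Because the internal state is a plain dict, it helps if every field a later node reads exists from the start. Here is a minimal sketch of a boundary helper. It assumes a hypothetical `initial_state` function (not part of LangGraph) and uses a trimmed copy of the state with `messages` typed as a plain list, so the snippet runs without extra packages:

```python
from typing import Optional, TypedDict


class SupportAgentState(TypedDict):
    # Trimmed copy of the state above; messages is a plain list here.
    ticket_id: str
    ticket_text: str
    customer_email: str
    classification: Optional[dict]
    action: Optional[str]
    response_draft: Optional[str]
    response_sent: bool
    messages: list
    iterations: int


def initial_state(ticket_id: str, ticket_text: str, customer_email: str) -> SupportAgentState:
    """Check the raw webhook fields and fill every computed field with a default."""
    if not ticket_id.strip():
        raise ValueError("ticket_id is required")
    if not ticket_text.strip():
        raise ValueError("ticket_text is empty")
    if "@" not in customer_email:
        raise ValueError(f"invalid customer_email: {customer_email!r}")
    return {
        "ticket_id": ticket_id.strip(),
        "ticket_text": ticket_text.strip(),
        "customer_email": customer_email.strip().lower(),
        "classification": None,
        "action": None,
        "response_draft": None,
        "response_sent": False,
        "messages": [],
        "iterations": 0,
    }


state = initial_state("T-1", "  I can't log in  ", "Ana@Example.com")
print(state["ticket_text"], state["customer_email"], state["iterations"])
```

The checks here are deliberately basic; a Pydantic model in the webhook handler could replace them without touching the graph.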

Step 2 · Classification node

Each node is a function that takes the current state and returns the fields it updated.

~
# agent/nodes/classify.py
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage, SystemMessage
import json

from agent.state import SupportAgentState, TicketCategory

CLASSIFY_LLM = ChatAnthropic(
  model="claude-3-5-haiku-20241022",
  temperature=0,
  max_tokens=300,
)

SYSTEM = """You are a support ticket classifier. Given the ticket text, return JSON with:
- category: one of "billing", "technical", "general", "urgent"
- confidence: 0.0 to 1.0
- reasoning: short explanation (max 30 words)

JSON only, no other text."""

def classify_ticket(state: SupportAgentState) -> dict:
  msg = CLASSIFY_LLM.invoke([
      SystemMessage(content=SYSTEM),
      HumanMessage(content=state["ticket_text"]),
  ])
  parsed: TicketCategory = json.loads(msg.content)
  return {
      "classification": parsed,
      "iterations": state.get("iterations", 0) + 1,
  }

Key detail: the function returns only the fields it changes, not the full state. LangGraph merges automatically.
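
To make the merge rule concrete: a field without a reducer is overwritten by the node's return value, while a field annotated with a reducer (like `messages` with `operator.add`) is combined with the existing value. The plain-Python sketch below imitates that behavior for illustration only; `apply_update` and `REDUCERS` are hypothetical names, not LangGraph's internals.

```python
import operator

# Fields that declare a reducer; everything else is overwritten.
REDUCERS = {"messages": operator.add}


def apply_update(state: dict, update: dict) -> dict:
    """Return a new state with a node's partial update merged in."""
    merged = dict(state)
    for key, value in update.items():
        reducer = REDUCERS.get(key)
        if reducer is not None and key in merged:
            merged[key] = reducer(merged[key], value)  # combine, e.g. list + list
        else:
            merged[key] = value  # plain overwrite
    return merged


state = {"ticket_id": "T-1", "iterations": 0, "messages": ["ticket received"]}
update = {"iterations": 1, "messages": ["classified as technical"]}
new_state = apply_update(state, update)
print(new_state)
```

Note that `ticket_id` survives untouched because the update never mentions it, which is why nodes can return only the fields they change.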

Step 3 · Conditional routing

This is where LangGraph differs most from a linear chain. Define a function that inspects the state and returns the key of the next transition:

~
# agent/routing.py
from agent.state import SupportAgentState

def route_after_classification(state: SupportAgentState) -> str:
  classification = state["classification"]
  cat = classification["category"]
  confidence = classification["confidence"]

  # Urgent tickets always go to human
  if cat == "urgent":
      return "escalate"

  # Low classification confidence → escalate
  if confidence < 0.7:
      return "escalate"

  # Billing always goes to a human (compliance)
  if cat == "billing":
      return "escalate"

  # Rest: auto-respond
  return "respond"
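
The router above indexes `classification["category"]` and `classification["confidence"]` directly, so a classifier reply that is not clean JSON would raise before routing ever runs. A defensive sketch follows, assuming a hypothetical `parse_classification` helper that the classify node could call instead of a bare `json.loads`. On any problem it returns a zero-confidence result, which the router then escalates:

```python
import json

ALLOWED = {"billing", "technical", "general", "urgent"}
FALLBACK = {"category": "general", "confidence": 0.0, "reasoning": "unparseable classifier output"}
FENCE = "`" * 3  # Markdown code fence, built this way to keep the snippet readable


def parse_classification(raw: str) -> dict:
    """Parse the classifier reply; fall back to zero confidence on any problem."""
    text = raw.strip()
    # Models sometimes wrap JSON in a Markdown fence despite instructions.
    if text.startswith(FENCE):
        text = text.strip("`").strip()
        if text.lower().startswith("json"):
            text = text[4:]
    try:
        data = json.loads(text)
        category = str(data["category"]).lower()
        confidence = float(data["confidence"])
    except (ValueError, KeyError, TypeError):
        return dict(FALLBACK)
    if category not in ALLOWED:
        return dict(FALLBACK)
    return {
        "category": category,
        "confidence": min(max(confidence, 0.0), 1.0),  # keep within 0.0 to 1.0
        "reasoning": str(data.get("reasoning", "")),
    }


print(parse_classification('{"category": "technical", "confidence": 0.92, "reasoning": "login error"}'))
print(parse_classification("Sorry, I cannot classify this."))
```

Falling back to confidence 0.0 rather than raising means a bad classifier reply ends up with a human instead of crashing the run.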

Step 4 · Response generation node

~
# agent/nodes/respond.py
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage, SystemMessage

from agent.state import SupportAgentState
from agent.knowledge_base import retrieve_relevant_docs

RESPONSE_LLM = ChatAnthropic(
  model="claude-3-5-sonnet-20241022",
  temperature=0.3,
  max_tokens=600,
)

SYSTEM_TEMPLATE = """You are a {company} support agent. Tone: friendly, clear, concise.
Use ONLY the information in the context below to answer. If you can't answer with what you have, say you'll escalate to a human and DO NOT make up information.

Context:
{context}"""

def generate_response(state: SupportAgentState) -> dict:
  docs = retrieve_relevant_docs(state["ticket_text"], top_k=5)
  context = "\n\n".join([d.content for d in docs])

  system = SYSTEM_TEMPLATE.format(company="YourSaaS", context=context)

  msg = RESPONSE_LLM.invoke([
      SystemMessage(content=system),
      HumanMessage(content=state["ticket_text"]),
  ])

  return {
      "response_draft": msg.content,
      "action": "auto_respond",
  }
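
`retrieve_relevant_docs` comes from `agent.knowledge_base`, which this tutorial does not show. As a placeholder while you wire up real retrieval, here is a deliberately naive keyword-overlap stand-in. The `Doc` class, the `KNOWLEDGE_BASE` entries, and the scoring are all assumptions for illustration, not a substitute for embedding-based search. It keeps the call shape the node above relies on (`top_k`, objects with `.content`).

```python
import re
from dataclasses import dataclass


@dataclass
class Doc:
    title: str
    content: str


# Hypothetical in-memory docs; a real system would query a vector store.
KNOWLEDGE_BASE = [
    Doc("Reset password", "To reset your password, open Settings and choose Reset password."),
    Doc("Export data", "You can export your data as CSV from the Reports page."),
    Doc("API limits", "The API allows a fixed number of requests per minute per key."),
]


def _tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve_relevant_docs(query: str, top_k: int = 5) -> list[Doc]:
    """Rank docs by how many distinct query words they share; drop zero-overlap docs."""
    query_tokens = _tokens(query)
    scored = [(len(query_tokens & _tokens(d.title + " " + d.content)), d) for d in KNOWLEDGE_BASE]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


docs = retrieve_relevant_docs("How do I reset my password?", top_k=2)
print([d.title for d in docs])  # ['Reset password']
```

Returning an empty list when nothing matches matters here: the system prompt tells the model to escalate when the context does not cover the question.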

Step 5 · Assemble the graph

Now you wire all the nodes into a StateGraph:

~
# agent/graph.py
from langgraph.graph import StateGraph, END
from langgraph.checkpoint.postgres import PostgresSaver

from agent.state import SupportAgentState
from agent.nodes.classify import classify_ticket
from agent.nodes.respond import generate_response
from agent.nodes.escalate import escalate_to_human
from agent.nodes.send import send_response
from agent.routing import route_after_classification

def build_graph(checkpointer=None):
  builder = StateGraph(SupportAgentState)

  # Register nodes
  builder.add_node("classify", classify_ticket)
  builder.add_node("respond", generate_response)
  builder.add_node("escalate", escalate_to_human)
  builder.add_node("send", send_response)

  # Entry point
  builder.set_entry_point("classify")

  # Edges
  builder.add_conditional_edges(
      "classify",
      route_after_classification,
      {"respond": "respond", "escalate": "escalate"},
  )
  builder.add_edge("respond", "send")
  builder.add_edge("escalate", END)  # human will take over
  builder.add_edge("send", END)

  return builder.compile(checkpointer=checkpointer)
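
The graph imports `escalate_to_human` and `send_response`, which the tutorial does not list. Here is a minimal sketch of both, assuming hypothetical `create_task` and `send_email` callables that stand in for your Notion/Linear and email integrations. They only record calls, so the snippet runs on its own:

```python
# Stand-ins for real integrations; they only record what they were asked to do.
TASKS: list[dict] = []
OUTBOX: list[dict] = []


def create_task(title: str, body: str) -> None:
    TASKS.append({"title": title, "body": body})


def send_email(to: str, body: str) -> None:
    OUTBOX.append({"to": to, "body": body})


def escalate_to_human(state: dict) -> dict:
    """Open a task with a short summary and mark the ticket as escalated."""
    classification = state.get("classification") or {}
    summary = (
        f"Category: {classification.get('category', 'unknown')}\n"
        f"Reason: {classification.get('reasoning', 'n/a')}\n\n"
        f"{state['ticket_text'][:500]}"
    )
    create_task(title=f"Ticket {state['ticket_id']}", body=summary)
    return {"action": "escalate_human"}


def send_response(state: dict) -> dict:
    """Send the approved draft once; a re-run after a crash does not send twice."""
    if not state.get("response_sent"):
        send_email(to=state["customer_email"], body=state["response_draft"])
    return {"response_sent": True}


state = {
    "ticket_id": "T-1",
    "ticket_text": "I was charged twice this month.",
    "customer_email": "ana@example.com",
    "classification": {"category": "billing", "confidence": 0.9, "reasoning": "double charge"},
    "response_draft": "Hi Ana, thanks for reaching out.",
    "response_sent": False,
}
print(escalate_to_human(state), send_response(state))
```

Both follow the same contract as the other nodes: read what they need from the state and return only the fields they change.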

Step 6 · Human-in-the-loop with interrupt

For sensitive tickets, we want a human to approve the response before sending. LangGraph supports this natively with interrupt_before:

~
# agent/graph.py (modified)
def build_graph(checkpointer=None):
  # ... same code as above ...

  return builder.compile(
      checkpointer=checkpointer,
      interrupt_before=["send"],  # pause before sending
  )

Now every run that reaches the send node pauses just before it and waits in the checkpointer. Your API exposes two endpoints, one to start processing and one to approve:

~
# api/main.py
import os
from contextlib import asynccontextmanager

from fastapi import FastAPI
from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver

from agent.graph import build_graph

@asynccontextmanager
async def lifespan(app: FastAPI):
  # ainvoke needs the async saver; from_conn_string is a context manager,
  # so keep it open for the lifetime of the app
  async with AsyncPostgresSaver.from_conn_string(os.environ["DATABASE_URL"]) as checkpointer:
      await checkpointer.setup()  # creates the checkpoint tables if missing
      app.state.graph = build_graph(checkpointer=checkpointer)
      yield

app = FastAPI(lifespan=lifespan)

@app.post("/tickets/{ticket_id}/process")
async def process_ticket(ticket_id: str, body: dict):
  graph = app.state.graph
  config = {"configurable": {"thread_id": ticket_id}}

  # Run up to the interrupt
  result = await graph.ainvoke(
      {
          "ticket_id": ticket_id,
          "ticket_text": body["text"],
          # ... remaining input fields
      },
      config=config,
  )

  if result.get("response_draft"):
      return {"status": "pending_approval", "draft": result["response_draft"]}
  return {"status": "escalated"}

@app.post("/tickets/{ticket_id}/approve")
async def approve_response(ticket_id: str, body: dict):
  graph = app.state.graph
  config = {"configurable": {"thread_id": ticket_id}}

  # If the human edits the response, update the state
  if body.get("edited_draft"):
      await graph.aupdate_state(config, {"response_draft": body["edited_draft"]})

  # Passing None resumes from the interrupt instead of starting over
  await graph.ainvoke(None, config=config)
  return {"status": "sent"}

Step 7 · Observability

At minimum, log every node transition with the ticket ID, the node name, and the routing decision:

~
# agent/observability.py
import structlog

logger = structlog.get_logger()

def log_transition(state, node_name, decision=None):
  logger.info(
      "agent_transition",
      ticket_id=state["ticket_id"],
      node=node_name,
      decision=decision,
      iterations=state.get("iterations", 0),
      classification=state.get("classification"),
  )
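
One way to get a log line per node without editing each node is a small wrapper applied when the node is defined. This is a sketch, assuming a hypothetical `logged` decorator and the standard library's `logging` in place of `structlog` so it runs without extra dependencies:

```python
import functools
import json
import logging
import time

logger = logging.getLogger("agent")


def logged(node_name: str):
    """Wrap a node so every run emits one JSON log line with its latency."""
    def decorator(node):
        @functools.wraps(node)
        def wrapper(state: dict) -> dict:
            start = time.perf_counter()
            update = node(state)
            logger.info(json.dumps({
                "event": "agent_transition",
                "ticket_id": state.get("ticket_id"),
                "node": node_name,
                "updated_fields": sorted(update),
                "latency_ms": round((time.perf_counter() - start) * 1000, 1),
            }))
            return update
        return wrapper
    return decorator


@logged("classify")
def classify_ticket(state: dict) -> dict:
    # Placeholder body; the real node calls the LLM.
    return {"classification": {"category": "general"}, "iterations": 1}


logging.basicConfig(level=logging.INFO)
result = classify_ticket({"ticket_id": "T-1"})
```

Because the wrapper returns the node's update unchanged, the wrapped function can be registered with `add_node` exactly like the original.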

And configure LangSmith for full tracing (free up to 5K traces/month):

~
export LANGCHAIN_TRACING_V2=true
export LANGCHAIN_API_KEY=ls__...
export LANGCHAIN_PROJECT=support-agent

Each execution leaves you with a visual timeline of every node, its input, output, latency, and tokens consumed. Essential for debugging.

Production costs

For 5,000 tickets/month with this pipeline:

Component	Estimated monthly cost
Claude Haiku (classification, 5K calls)	USD 5-10
Claude Sonnet (generation, ~3K calls — 60% auto-respond)	USD 25-50
Python hosting (Modal/Railway)	USD 10-30
Postgres (Supabase free tier usually fits)	USD 0-25
Total	USD 40-115/month

At 50K tickets/month, scale linearly to ~USD 400-1,000/month. If you hit that volume, it's worth starting to cache classifications and capping Sonnet output with max_tokens.
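
The table's total and the linear projection can be checked with a few lines of arithmetic. The ranges below are copied from the table; `scale` is a hypothetical helper that assumes every component, including hosting and Postgres, grows in proportion to ticket volume, which is a simplification.

```python
# (low, high) monthly cost in USD at 5,000 tickets, copied from the table above.
COSTS_5K = {
    "haiku_classification": (5, 10),
    "sonnet_generation": (25, 50),
    "hosting": (10, 30),
    "postgres": (0, 25),
}


def total(costs: dict) -> tuple:
    """Sum the low and high ends of every component."""
    low = sum(lo for lo, _ in costs.values())
    high = sum(hi for _, hi in costs.values())
    return low, high


def scale(costs: dict, factor: int) -> dict:
    """Scale every component linearly with ticket volume."""
    return {name: (lo * factor, hi * factor) for name, (lo, hi) in costs.items()}


print(total(COSTS_5K))             # (40, 115)
print(total(scale(COSTS_5K, 10)))  # (400, 1150)
```

Strict linear scaling gives USD 400-1,150 at 50K tickets, in the same range as the rounded figure above.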

Pitfalls you'll hit

The agent decides "respond" when it should "escalate" — tune the confidence threshold in routing. Start conservative (0.8) and lower only if you see many unnecessary escalations.
Infinite loops when you add retrying nodes — always use recursion_limit and a counter in state.
The checkpointer grows unbounded — add a job that cleans up completed threads after N days.
LangSmith in production without sampling — at high volume, sampling at 10% keeps visibility without blowing the free tier.
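
For the sampling pitfall, one simple approach is to decide per ticket, deterministically, whether to trace it, so that every step of a sampled ticket is traced together. This is a sketch, assuming a hypothetical `should_trace` helper. How that decision feeds into your tracing setup depends on your LangSmith configuration and is not shown here.

```python
import hashlib


def should_trace(ticket_id: str, rate: float = 0.10) -> bool:
    """Return True for roughly `rate` of ticket IDs, always the same ones."""
    digest = hashlib.sha256(ticket_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < rate


sampled = sum(should_trace(f"T-{i}") for i in range(10_000))
print(f"traced {sampled} of 10000 tickets")
```

Hashing the ticket ID instead of calling a random generator means the process and approve requests for the same ticket get the same answer.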

Let's talk about your case

If you're considering building an AI agent for your SaaS and want to review the architecture before committing 3-4 weeks of developer time, book a free 30-minute call. That is usually enough to clarify whether LangGraph is the right tool or whether your case is better solved with a linear script or an n8n workflow.

Frequently asked questions

LangGraph vs CrewAI vs building my own orchestrator?

LangGraph for agents with defined flows (classify → decide → act). CrewAI for multiple agents collaborating with roles. Custom code when the flow is simple (3-5 linear steps) or very specific. For SaaS support automation, LangGraph is the sweet spot — state management and human-in-the-loop come built-in.

Why Python and not TypeScript with LangChain.js?

LangGraph has functional parity between Python and JS but the community and plugin ecosystem are more mature in Python. If your SaaS is Next.js, expose the agent as a separate Python service (FastAPI/Modal) and call it from your API. The separation also helps you scale the agent independently.

How much does this agent cost to run in production?

For 5K tickets/month with classification + response: USD 30-60/month in API costs (Claude Haiku classification + Sonnet generation). For 50K tickets: USD 200-500. Hosting the Python process: USD 5-30/month on Modal/Railway/Fly.io. Postgres can be the one you already have.

How do I manage state between agent invocations?

Use LangGraph's checkpointer (Postgres or SQLite). After each step, the graph persists its state under a thread_id. If the process crashes, you can resume from the last checkpoint. This is essential for human-in-the-loop flows, where the agent may wait hours or days for human input.

When is it NOT a good idea to use agents?

When a single LLM call solves the problem (simple classification, extraction). When the flow steps are always the same (a linear script is simpler and more debuggable). When you need synchronous response under 500ms (multi-step agents take 2-10 seconds). Agents shine when the flow branches conditionally.

How do I prevent the agent from infinite-looping?

Three safeguards: (1) recursion_limit in graph config (default 25 is usually enough); (2) an iteration counter in state that the agent checks before continuing; (3) total thread timeout (15-30 min max). Log when each fires to detect prompts that break.
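
The three safeguards can sit side by side. This is a sketch, assuming hypothetical names `MAX_ITERATIONS`, `THREAD_TIMEOUT_SECONDS`, and `should_continue`, plus a `started_at` timestamp that the tutorial's state does not include. `recursion_limit` is the only LangGraph setting here and is passed in the run config next to `thread_id`.

```python
import time
from typing import Optional

MAX_ITERATIONS = 5                 # assumption: tune per flow
THREAD_TIMEOUT_SECONDS = 30 * 60   # upper end of the 15-30 min range


def run_config(ticket_id: str) -> dict:
    """Run config with safeguard (1): LangGraph's own step limit."""
    return {"recursion_limit": 25, "configurable": {"thread_id": ticket_id}}


def should_continue(state: dict, now: Optional[float] = None) -> str:
    """Safeguards (2) and (3): route to 'escalate' when a limit is hit."""
    now = time.time() if now is None else now
    if state.get("iterations", 0) >= MAX_ITERATIONS:
        return "escalate"
    if now - state.get("started_at", now) > THREAD_TIMEOUT_SECONDS:
        return "escalate"
    return "continue"


print(should_continue({"iterations": 2, "started_at": time.time()}))
```

Used as a conditional-edge function on any retrying node, `should_continue` sends a stuck ticket to a human instead of letting it loop.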