LangChain has become the de facto standard for building LLM-powered applications — from simple chains to complex multi-step agents with tool use and retrieval. It also presents one of the more interesting observability challenges: the execution model is highly dynamic, the data flow is non-linear, and the things that go wrong are often several steps removed from the model call that triggered them. Today we're releasing native LangChain support in the Starseer SDK: no wrappers, no monkey-patching, just deep integration with LangChain's callback system.
If you've tried to add observability to a LangChain application before, you've probably encountered the same friction we did. LangChain's execution model is not a simple request-response cycle. A single user interaction might trigger a retrieval step, several sequential model calls, tool invocations, error handling branches, and output parsers — all nested inside each other in ways that a traditional request trace doesn't capture well.
Previous approaches to LangChain observability have generally worked by wrapping LangChain components — replacing the LLM object with an instrumented version, for example. This works for the happy path but breaks down in several common scenarios: when LangChain updates introduce new execution paths, when chains are deeply nested, and when the application uses LangChain Expression Language (LCEL) in ways that bypass the standard component interfaces.
The right approach is LangChain's built-in callback system, which fires events at every meaningful execution step. This is how LangSmith does it, and it's how we've implemented the Starseer integration. The advantage is that the callback system is stable, comprehensive, and explicitly maintained by the LangChain team as a first-class integration point.
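To make that event model concrete, here is a toy handler (not the Starseer one) built on LangChain's public BaseCallbackHandler. It simply prints a line for each of the event types the integration subscribes to:

```python
from langchain_core.callbacks import BaseCallbackHandler

class PrintingHandler(BaseCallbackHandler):
    """Toy handler: prints one line per execution event."""

    def on_chain_start(self, serialized, inputs, **kwargs):
        print("chain started:", inputs)

    def on_llm_start(self, serialized, prompts, **kwargs):
        print("LLM call, prompts:", prompts)

    def on_tool_start(self, serialized, input_str, **kwargs):
        print("tool invoked:", input_str)

    def on_retriever_start(self, serialized, query, **kwargs):
        print("retrieval query:", query)

    def on_chain_end(self, outputs, **kwargs):
        print("chain finished:", outputs)
```

Because these events fire from inside LangChain's own execution machinery, they keep working across nesting depths, LCEL compositions, and framework updates in a way that component wrapping does not.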
The Starseer LangChain integration instruments the full execution graph, not just the LLM calls. Every chain invocation, every agent action, every tool call, every retrieval step, and every output parser operation generates a span in the Starseer trace. These spans are correlated into a single execution trace for each user interaction, giving you a complete picture of what happened during a single chain run.
For LLM calls specifically, we capture the full prompt (after template substitution), the raw model response, token counts, latency at the model API level (separate from the chain-level latency), and any model-reported metadata. For retrieval steps, we capture the query, the number of documents retrieved, and the retrieval latency. For tool calls, we capture the tool name, input arguments, output, and whether the call succeeded or produced an error.
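As an illustration of the kind of record this produces, an LLM-call span might look like the following. The field names are illustrative, not the SDK's actual schema:

```python
# Illustrative only: these fields sketch what an LLM-call span captures;
# they are not Starseer's actual wire format.
llm_span = {
    "trace_id": "tr_8f2c91",        # shared by every span in the same chain run
    "span_type": "llm_call",
    "prompt": "Answer using the context below:\n[retrieved documents]",  # after template substitution
    "response": "The refund window is 30 days.",
    "tokens": {"prompt": 412, "completion": 18},
    "latency_ms": 640,              # model API latency, separate from chain-level latency
    "model_metadata": {"model": "gpt-4o", "finish_reason": "stop"},
}
```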
All of this is attached to the same trace identifier, which means you can answer questions like: "For the interactions where the agent made more than three tool calls, what was the retrieval quality like at step 1?" These are the kinds of questions that matter for debugging agentic systems, and they're only answerable with correlated, full-execution traces.
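As a rough sketch of what that looks like in practice, suppose you export spans as flat records (the export format and column names below are assumptions, mirroring the fields described above). The question then reduces to a group-by on the trace identifier:

```python
import pandas as pd

# Assumed export: one row per span with trace_id, span_type, step_index, etc.
spans = pd.read_json("starseer_spans.jsonl", lines=True)

# Traces where the agent made more than three tool calls...
tool_counts = spans[spans.span_type == "tool_call"].groupby("trace_id").size()
busy = tool_counts[tool_counts > 3].index

# ...and the retrieval step at position 1 within each of those traces.
first_retrieval = spans[
    (spans.span_type == "retrieval")
    & spans.trace_id.isin(busy)
    & (spans.step_index == 1)
]
print(first_retrieval[["trace_id", "query", "num_documents", "latency_ms"]])
```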
Instrumentation is the foundation, but the Starseer integration goes further. LangChain agents are particularly interesting from a policy enforcement perspective because they can take actions in external systems — writing to databases, calling APIs, sending messages — based on model decisions. The stakes for behavioral guardrails are higher than for a simple question-answering chain.
The Starseer SDK exposes a callback hook that fires before any agent action is executed. Policy checks run synchronously at this point, before the tool call executes. If the proposed action violates a defined policy — for example, attempting to write data that exceeds a configured scope, or calling a tool with arguments that match a prohibited pattern — the action is blocked, logged, and the chain is notified to handle the failure gracefully.
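A minimal hand-rolled version of the same mechanism, using only LangChain's public callback API, looks like this. PolicyViolation and the prohibited-tools check are illustrative stand-ins for Starseer's policy engine, not its actual API:

```python
from langchain_core.callbacks import BaseCallbackHandler

class PolicyViolation(Exception):
    """Raised when a planned agent action fails a policy check."""

class BlockingPolicyHandler(BaseCallbackHandler):
    # With raise_error=True, an exception raised in a callback propagates
    # and aborts the run before the tool executes; by default LangChain
    # swallows handler errors.
    raise_error = True

    def __init__(self, prohibited_tools):
        self.prohibited_tools = set(prohibited_tools)

    def on_agent_action(self, action, **kwargs):
        # Fires after the model has planned the action, before the tool runs.
        if action.tool in self.prohibited_tools:
            raise PolicyViolation(
                f"blocked: {action.tool} called with {action.tool_input!r}"
            )
```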
This is the right place to enforce guardrails for agents: not at the model call level (too early — the model has not yet committed to a concrete action you can evaluate), and not after the parsed action is handed off for execution (too late — the side effects have already begun), but at the action invocation boundary, where the full context of the planned action is available and blocking is still possible.
The integration requires two additional lines beyond standard Starseer SDK initialization. After initializing the Starseer client, call starseer.langchain.get_callback() and pass the returned handler to your LangChain chain or agent's callbacks parameter. That's it — traces begin flowing immediately, and the Starseer dashboard shows the full execution graph within seconds of the first chain invocation.
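In code, assuming starseer.init as the initialization call (check the SDK docs for the exact signature), the whole setup looks like this:

```python
import starseer
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

starseer.init(api_key="YOUR_API_KEY")        # standard SDK initialization (signature assumed)
handler = starseer.langchain.get_callback()  # line 1: get the callback handler

prompt = ChatPromptTemplate.from_template("Summarize in one sentence: {text}")
llm = ChatOpenAI(model="gpt-4o-mini")
chain = prompt | llm | StrOutputParser()

# line 2: pass the handler at invocation time
result = chain.invoke(
    {"text": "LangChain callbacks fire at every execution step."},
    config={"callbacks": [handler]},
)
```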
For teams using LCEL, the callback handler can be passed once at the chain definition level using .with_config(callbacks=[handler]). For teams using legacy chain classes, the handler attaches the same way LangChain callbacks always have: via the callbacks argument at construction or invocation time.
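For example, reusing prompt, llm, and handler from the snippet above:

```python
# Attach once at definition time; every subsequent invocation is traced
# automatically, with no per-call config needed.
traced_chain = (prompt | llm | StrOutputParser()).with_config(callbacks=[handler])
traced_chain.invoke({"text": "This call is traced without extra arguments."})
```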
LangChain applications have always deserved first-class observability tooling. The dynamic execution model, the multi-step agent behavior, and the high stakes of tool-using agents all make observability more important here than in most other AI deployment contexts. We're excited to offer this as a fully supported integration and to continue investing in deeper LangChain support as the framework evolves.
The LangChain integration is available now in Starseer SDK v0.4.0 and later. Get in touch to start your free trial, or explore the platform to see all supported integrations.