Tools: Tool Calling in LLMs: How Models Talk to the Real World

Source: Dev.to

When I first started building with LLMs, I treated them like magic boxes. But I quickly realized: they're great at text, but terrible at doing actual work. They don't know your internal docs. They can't query your database. They definitely can't send emails or trigger workflows on their own. Yet, in practice, this is exactly what we need them to do. Tool calling is the architectural bridge that connects the model's reasoning to your application's capabilities. So far, this sounds simple. It isn't.

## Thinking vs. Doing

At its heart, tool calling separates Thinking from Doing. Asking "What's the weather?" is thinking. Calling a weather API is doing. This separation sounds obvious, but many early implementations mix the two and pay for it later. It's easy to blur this line and end up debugging prompt behavior instead of system behavior.

The model never executes code. It only requests that something be done. Keeping this boundary clear simplifies both reasoning and debugging. Tool calling is not the model doing anything. It's the model asking your application to do something:

- The Model decides what to do.
- Your App actually does it.
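To make that boundary concrete, here is a minimal sketch of what your application sees on each model turn. The names (`ModelTurn`, `runTool`) are purely illustrative and not part of any particular SDK; the point is that the model's output is either text or a request, and only your code ever executes anything.

```typescript
// Illustrative only: a model turn is either plain text ("thinking")
// or a request for an action ("doing"). Nothing runs until *your* code runs it.
type ModelTurn =
  | { kind: "text"; content: string }
  | { kind: "tool_call"; name: string; args: Record<string, unknown> };

async function handleTurn(turn: ModelTurn): Promise<string> {
  if (turn.kind === "text") {
    return turn.content; // nothing happened in the real world
  }
  // Only here, inside the application, does anything actually execute.
  return runTool(turn.name, turn.args);
}

// Placeholder for your own registry of tool handlers.
declare function runTool(name: string, args: Record<string, unknown>): Promise<string>;
```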
## What "Tool Calling" Actually Means

When the LLM determines it needs external information, it doesn't return a human-readable sentence. Instead, it stops generating text and returns a Tool Call object, or even multiple tool calls at once if it determines it can perform several actions in parallel. Think of it as the model answering: "To answer this, I need to call the document_search tool with query = refund policy." Under the hood, that response looks like a structured JSON object:

```json
{
  "tool_calls": [
    {
      "id": "call_abc123",
      "type": "function",
      "function": {
        "name": "document_search",
        "arguments": "{\"query\": \"refund policy\"}"
      }
    }
  ]
}
```

This is the "intent" phase. The model isn't "doing" the search; it is asking you to do it. Your application parses this JSON, runs the actual search, and provides the results back in the next turn. Crucially, models can request multiple tool calls in a single response (Parallel Tool Calling) if they identify multiple independent actions that need to be taken to fulfill the request.

## The Tool Calling Loop

In practice, tool calling looks like a loop:

- User asks a question.
- Application sends: conversation history + tool definitions.
- Model responds: either with text or with one or more tool calls.
- Application executes the tool(s).
- Tool result(s) are sent back to the model.
- Model continues reasoning: based on the results, it may provide a final answer, or it may trigger another tool call if the results revealed that further action is needed.

This loop is iterative. A single user request can trigger a "chain" of tool calls, where each step depends on the result of the previous one. If this loop feels familiar, it's because it looks a lot like a request–response cycle you already understand. Tool calling works because it replaces guesswork with a structured protocol: the model must specify the exact tool and the exact parameters required. If you've ever wondered why your model keeps calling the same tool over and over, this is usually why: the results it gets back don't satisfy its next reasoning step.
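As a rough sketch of that loop in TypeScript: the message shape mirrors the OpenAI-style JSON shown above, `callModel` and `executeTool` are placeholders for your provider call and your own tool registry, and the `maxSteps` guard is the kind of recursion limit discussed under failure modes below. This is not any library's actual API, just the control flow.

```typescript
// A simplified sketch of the tool-calling loop (illustrative, not a real SDK).
type Message =
  | { role: "system" | "user" | "assistant"; content: string }
  | { role: "assistant"; content: null; tool_calls: ToolCall[] }
  | { role: "tool"; tool_call_id: string; content: string };

interface ToolCall {
  id: string;
  function: { name: string; arguments: string }; // arguments arrive as a JSON string
}

async function runConversation(messages: Message[], maxSteps = 5): Promise<string> {
  for (let step = 0; step < maxSteps; step++) {
    // Send the conversation history plus tool definitions to the provider.
    const reply = await callModel(messages);

    // No tool calls: the model answered in plain text and the loop ends.
    if (!reply.tool_calls?.length) {
      return reply.content ?? "";
    }

    messages.push(reply); // keep the model's request in the history

    // Execute every requested tool in *your* runtime (these could also run in parallel).
    for (const call of reply.tool_calls) {
      let result: string;
      try {
        result = await executeTool(call.function.name, JSON.parse(call.function.arguments));
      } catch (err) {
        // Feed failures back as results so the model can recover or rephrase.
        result = `Tool error: ${(err as Error).message}`;
      }
      messages.push({ role: "tool", tool_call_id: call.id, content: result });
    }
    // Loop again: the model sees the results and either answers or asks for more tools.
  }
  throw new Error("Exceeded maxSteps without a final answer");
}

// Placeholders for your provider client and tool registry.
declare function callModel(messages: Message[]): Promise<any>;
declare function executeTool(name: string, args: unknown): Promise<string>;
```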
## Implementation: The NodeLLM Way

In NodeLLM, tools are defined as classes with a structured schema. This gives the model clear instructions and gives you full type safety in your application.

```typescript
import { Tool, z, NodeLLM } from "@node-llm/core";

// 1. Define the Tools
class DocumentSearch extends Tool {
  name = "document_search";
  description = "Searches knowledge base for relevant information";
  schema = z.object({ query: z.string() });

  async handler({ query }) {
    const docs = await db.documents.search(query).limit(3);
    return docs.map(doc => `${doc.title}: ${doc.content}`).join("\n\n");
  }
}

class SlackNotification extends Tool {
  name = "send_slack";
  description = "Sends a message to a Slack channel";
  schema = z.object({ message: z.string(), channel: z.string() });

  async handler({ message, channel }) {
    await slack.send(channel, message);
    return "Notification sent successfully";
  }
}

// 2. Usage
const chat = NodeLLM.chat("gpt-4o")
  .withTools([DocumentSearch, SlackNotification])
  .withInstructions("Search for context. If you find a security issue, notify Slack.");

const response = await chat.ask("What is our security policy?");
```

NodeLLM handles the heavy lifting, like parallel tool calls and error handling, under the hood. You can find the full API reference in the official documentation.

## How the LLM Decides Which Tool to Call

This part is subtle, and small mistakes here lead to confusing behavior later. It's not just about the code; it's about how the model "knows" which tool to use. The model doesn't have access to your source code; it only has access to the metadata you provide. When you define a tool in NodeLLM or any other LLM framework, you are essentially writing a small instruction manual for the model:

- The Name: signals the high-level intent (e.g., document_search).
- The Description: provides the "why" and "when".
- The Schema: defines the "what" (the specific parameters required).

Think of these as semantic prompts. Before every response, the LLM maps the user's request against your tool descriptions. If the request matches a description, the model "invokes" that tool by generating a structured JSON object matching your schema. This is where you can easily trip up: descriptions are as important as implementation. If your description is vague, the model will hallucinate calls or ignore the tool entirely. Precise descriptions guide the model's reasoning.

## Where Things Get Complicated

Most examples online show a single tool call. Real systems are rarely that simple. In production, you often need to handle:

- Multiple tool calls in a single turn
- Tool failures and retries
- Timeouts and state management
- Streaming partial results

None of these problems show up in simple examples. They appear once you start composing features. This orchestration logic lives outside the model, in your runtime.

## Tool Calling vs "Agents"

Tool calling is often confused with agents. They are related, but different. Tool calling is the mechanism; agents are a pattern built on top of it. An agent is simply a loop where the model reasons, calls tools, receives results, and decides what to do next. You can use tool calling without building agents, but you cannot build agents without tool calling. To see this in action, check out Building Your First AI Agent in Node.js.

## Common Failure Modes

Tool calling does not magically remove problems. Some common issues teams run into:

- The model calls tools too often (looping): this happens when the tool's output doesn't give the model what it needs to stop, so it tries again. This can quickly spiral if you don't enforce a maxSteps or recursion limit in your implementation.
- Tool descriptions are too vague: as discussed above, this is the #1 cause of "dumb" model behavior.
- Business logic leaks into prompts: keep your handlers for logic and your descriptions for intent.
- The model assumes state that no longer exists: models are stateless; they only know what's in the current context window.

This is where things usually go wrong, and it's almost always a design problem, not a model problem.

## Closing Thoughts

Tool calling isn't new or mysterious. What matters is treating it like architecture instead of a prompt trick. By separating reasoning from execution, you gain better control, easier debugging, and the freedom to evolve your architecture over time. As LLMs become embedded deeper into real software, this boundary will matter more than any specific model or provider.