How to Build an AI Agent: A Practical Guide for 2026
TL;DR: Building an AI agent means combining a large language model with tools, memory, and decision logic so it can complete tasks autonomously. This guide covers architecture, tool integration, prompt engineering, and deployment with working code examples.
What Is an AI Agent?
An AI agent is software that uses a large language model to make decisions and take actions. Unlike a chatbot that just generates text responses, an agent has access to tools (APIs, databases, file systems) and can execute multi-step workflows without human intervention.
The core loop of every agent is simple:
- Receive a task or message
- Decide what action to take
- Execute the action using a tool
- Observe the result
- Decide if the task is complete or if another action is needed
- Repeat until done
This loop is what separates agents from simple LLM wrappers. The LLM is the brain. The tools are the hands. The loop is the autonomy.
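The loop above can be sketched in a few lines of TypeScript. This is illustrative only: `decideNextAction` stands in for the LLM call and `executeTool` for your tool layer, both of which are built out in the steps below.

```typescript
// Illustrative skeleton of the agent loop. decideNextAction stands in for
// the LLM call; executeTool stands in for your tool layer.
type Action =
  | { kind: "tool"; name: string; input: unknown }
  | { kind: "finish"; answer: string };

async function agentLoop(
  task: string,
  decideNextAction: (task: string, observations: string[]) => Promise<Action>,
  executeTool: (name: string, input: unknown) => Promise<string>
): Promise<string> {
  const observations: string[] = [];
  while (true) {
    const action = await decideNextAction(task, observations);
    if (action.kind === "finish") return action.answer; // task complete
    const result = await executeTool(action.name, action.input);
    observations.push(result); // feed the result into the next decision
  }
}
```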
Step 1: Define the Agent's Purpose
Before writing code, define exactly what your agent should do. Agents that try to do everything do nothing well.
Write a one-sentence purpose statement: This agent [action] for [user] by [method].
Examples:
- This agent triages support tickets for customer service teams by reading the ticket, classifying urgency, and routing to the correct department.
- This agent generates weekly reports for sales managers by querying the CRM, calculating metrics, and formatting the output as a PDF.
- This agent monitors competitor pricing for ecommerce teams by scraping product pages daily and alerting when prices change.
A focused agent with 3 to 5 tools outperforms a general-purpose agent with 50 tools almost every time. Narrow scope means better prompts, fewer errors, and easier testing.
For industry-specific examples, see our AI agent development service or browse agents built for customer support, sales, and data analysis.
Step 2: Choose Your LLM and SDK
The two leading options for agent development in 2026 are:
| Provider | Best Model | Strengths |
|---|---|---|
| Anthropic | Claude Sonnet 4.6 | Tool use reliability, long context, structured output |
| OpenAI | GPT-4.1 | Broad ecosystem, function calling, vision |
For most agents, Claude Sonnet 4.6 offers the best balance of capability and cost. It follows tool schemas precisely and handles complex multi step reasoning well.
SDK Setup (TypeScript)
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic({
apiKey: process.env.ANTHROPIC_API_KEY,
});
SDK Setup (Python)
import os
import anthropic
client = anthropic.Anthropic(
api_key=os.environ["ANTHROPIC_API_KEY"]
)
Step 3: Define Your Tools
Tools are the actions your agent can take. Each tool is a function with a schema that tells the LLM what it does, what parameters it accepts, and what it returns.
const tools = [
{
name: "lookup_customer",
description: "Look up a customer by email address. Returns name, plan, and account status.",
input_schema: {
type: "object" as const,
properties: {
email: {
type: "string",
description: "The customer email address to look up",
},
},
required: ["email"],
},
},
{
name: "send_email",
description: "Send an email to a customer. Use for follow-ups and confirmations only.",
input_schema: {
type: "object" as const,
properties: {
to: { type: "string", description: "Recipient email" },
subject: { type: "string", description: "Email subject line" },
body: { type: "string", description: "Email body in plain text" },
},
required: ["to", "subject", "body"],
},
},
];
Good tool descriptions are critical. The LLM uses the description to decide when to call each tool. Vague descriptions lead to wrong tool selection. Be specific about what the tool does and when to use it. For a deeper treatment of how tool use works across providers, see our AI agent development service.
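The agent loop in the next step calls an `executeTool` function to run whichever tool the model picks. A minimal dispatcher might look like this; the handlers are stubs you would replace with real database and email calls:

```typescript
// Minimal tool dispatcher. Handlers here are stubs -- swap in real
// database and email integrations.
type ToolHandler = (input: any) => Promise<unknown>;

const toolHandlers: Record<string, ToolHandler> = {
  lookup_customer: async ({ email }) => {
    // Stub: query your customer database here
    return { email, name: "Jane Doe", plan: "pro", status: "active" };
  },
  send_email: async ({ to, subject }) => {
    // Stub: call your email provider here
    return { sent: true, to, subject };
  },
};

async function executeTool(name: string, input: unknown): Promise<unknown> {
  const handler = toolHandlers[name];
  if (!handler) {
    // Return the error as data so the LLM can recover instead of crashing
    return { error: `Unknown tool: ${name}` };
  }
  try {
    return await handler(input);
  } catch (err) {
    return { error: String(err) };
  }
}
```

Returning errors as data rather than throwing lets the model read the failure and try a different approach.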
Step 4: Build the Agent Loop
The agent loop is the runtime that orchestrates the LLM and tools. Here is a complete working example:
async function runAgent(userMessage: string, memories?: unknown) {
const messages: Anthropic.MessageParam[] = [
{ role: "user", content: userMessage },
];
const systemPrompt = `You are a customer support agent for an online SaaS product.
You help customers with account questions, billing issues, and product guidance.
Always look up the customer record before answering account-specific questions.
Never make up information. If you cannot find the answer, say so.${
memories ? `\nKnown user context: ${JSON.stringify(memories)}` : ""
}`;
while (true) {
const response = await client.messages.create({
model: "claude-sonnet-4-6-20250514",
max_tokens: 1024,
system: systemPrompt,
tools,
messages,
});
// Check if the agent wants to use a tool
if (response.stop_reason === "tool_use") {
const toolUseBlock = response.content.find(
(block) => block.type === "tool_use"
);
if (!toolUseBlock || toolUseBlock.type !== "tool_use") {
// Guard against an infinite loop if no tool_use block is present
throw new Error("stop_reason was tool_use but no tool_use block was found");
}
// Execute the tool
const toolResult = await executeTool(
toolUseBlock.name,
toolUseBlock.input
);
// Feed the result back to the agent
messages.push({ role: "assistant", content: response.content });
messages.push({
role: "user",
content: [
{
type: "tool_result",
tool_use_id: toolUseBlock.id,
content: JSON.stringify(toolResult),
},
],
});
} else {
// Agent is done — return the final response
const textBlock = response.content.find(
(block) => block.type === "text"
);
return textBlock?.type === "text" ? textBlock.text : "";
}
}
}
This loop continues until the LLM returns a text response instead of a tool call. The LLM decides when it has enough information to answer.
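One caveat: a single model response can contain more than one tool_use block, and the loop above executes only the first it finds. A sketch of a helper that runs every requested tool and builds the matching tool_result entries:

```typescript
// A single assistant turn may request several tools at once. This helper
// executes every tool_use block and builds the matching tool_result entries.
type ContentBlock = { type: string; id?: string; name?: string; input?: unknown };

async function runAllToolCalls(
  blocks: ContentBlock[],
  execute: (name: string, input: unknown) => Promise<unknown>
) {
  const results: Array<{
    type: "tool_result";
    tool_use_id: string;
    content: string;
  }> = [];
  for (const block of blocks) {
    if (block.type !== "tool_use") continue; // skip text blocks
    const result = await execute(block.name!, block.input);
    results.push({
      type: "tool_result",
      tool_use_id: block.id!,
      content: JSON.stringify(result),
    });
  }
  return results; // send these back to the model as one user message
}
```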
Step 5: Add Memory
Without memory, your agent starts every conversation from scratch. There are two types of memory:
Short term memory is the conversation history. The messages array in the loop above handles this automatically. Each tool call and result stays in context so the agent can reference earlier steps.
Long term memory persists across conversations. This is where you store user preferences, past interactions, and learned patterns.
// Simple long-term memory using PostgreSQL with Drizzle ORM.
// Assumes db, the agentMemory table, and the and/eq helpers are imported elsewhere.
async function saveMemory(userId: string, key: string, value: string) {
await db.insert(agentMemory).values({
userId,
key,
value,
createdAt: new Date(),
});
}
async function getMemory(userId: string, key: string) {
return db
.select()
.from(agentMemory)
.where(
and(eq(agentMemory.userId, userId), eq(agentMemory.key, key))
);
}
Inject relevant memories into the system prompt before each conversation. The agent will use them to personalize its responses. For agents that need to retrieve from large knowledge stores, pair memory with a vector database for semantic search over past interactions.
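One way to do that injection, assuming memory rows shaped like the agentMemory records above:

```typescript
// Build a system prompt that includes the user's stored memories.
// Assumes rows shaped like the agentMemory table: { key, value }.
function buildSystemPrompt(
  basePrompt: string,
  memories: Array<{ key: string; value: string }>
): string {
  if (memories.length === 0) return basePrompt;
  const lines = memories.map((m) => `- ${m.key}: ${m.value}`).join("\n");
  return `${basePrompt}\n\nKnown facts about this user:\n${lines}`;
}
```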
Step 6: Add Guardrails
Production agents need safety rails. Without them, you will eventually get an agent that sends the wrong email to the wrong customer.
Input validation
// Naive placeholder check -- swap in a real prompt-injection detector
function containsInjectionAttempt(message: string): boolean {
return /ignore (all )?previous instructions/i.test(message);
}
function validateInput(message: string): boolean {
if (message.length > 10000) return false;
if (containsInjectionAttempt(message)) return false;
return true;
}
Output validation
Check that the agent's final response meets your requirements before returning it to the user. Flag responses that contain competitor names, pricing promises, or legal claims for human review.
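A simple version of this check scans the final response against a deny list before it goes out. The patterns below are placeholders; tune them to your product:

```typescript
// Flag responses that need human review before they reach the user.
// The patterns here are placeholders -- tailor them to your product.
const REVIEW_PATTERNS: RegExp[] = [
  /\bguarantee\b/i, // outcome or pricing promises
  /\brefund\b/i,    // money commitments
  /\blegal(ly)?\b/i, // legal claims
];

function needsHumanReview(response: string): boolean {
  return REVIEW_PATTERNS.some((pattern) => pattern.test(response));
}
```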
Human-in-the-loop gates
For high-stakes actions (sending emails, modifying accounts, issuing refunds), require human approval:
async function executeTool(name: string, input: unknown) {
if (HIGH_STAKES_TOOLS.includes(name)) {
const approved = await requestHumanApproval(name, input);
if (!approved) return { error: "Action requires approval" };
}
return toolHandlers[name](input);
}
Rate limiting
Prevent runaway agent loops by capping the number of tool calls per conversation:
const MAX_TOOL_CALLS = 10;
let toolCallCount = 0;
// Inside the agent loop:
if (toolCallCount >= MAX_TOOL_CALLS) {
return "I have reached the maximum number of actions for this conversation.";
}
toolCallCount++;
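Wiring the cap into a loop looks roughly like this. The `step` function is a stand-in for one LLM turn that either requests a tool or returns a final answer:

```typescript
// Bounded agent loop: stop after maxToolCalls tool executions.
// `step` is a stand-in for one LLM turn; `execute` runs a tool.
type Step =
  | { type: "tool"; name: string; input: unknown }
  | { type: "final"; text: string };

async function runWithCap(
  step: (observations: string[]) => Promise<Step>,
  execute: (name: string, input: unknown) => Promise<string>,
  maxToolCalls = 10
): Promise<string> {
  const observations: string[] = [];
  let toolCallCount = 0;
  while (true) {
    const next = await step(observations);
    if (next.type === "final") return next.text;
    if (toolCallCount >= maxToolCalls) {
      // Bail out instead of letting the loop run forever
      return "I have reached the maximum number of actions for this conversation.";
    }
    toolCallCount++;
    observations.push(await execute(next.name, next.input));
  }
}
```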
Step 7: Test Thoroughly
Agent testing is different from traditional software testing. You need to test both the deterministic parts (tool execution, input validation) and the non-deterministic parts (LLM decision making).
Unit test your tools. Every tool function should have standard unit tests with known inputs and expected outputs.
Create evaluation datasets. Build a set of 50 to 100 test conversations with expected outcomes. Run the agent against them and measure accuracy.
Test edge cases. What happens when a tool returns an error? When the user asks something outside the agent's scope? When the LLM hallucinates a tool that does not exist?
Monitor in production. Log every agent conversation, tool call, and result. Review a random sample weekly to catch issues that automated tests miss. See our production-ready AI agent checklist for a complete list of what to verify before launch.
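A minimal evaluation harness for the dataset approach might look like this. The string-matching judge is a placeholder; in practice you would often use a stricter comparison or an LLM grader:

```typescript
// Tiny evaluation harness: run the agent over labeled cases and report accuracy.
// `agent` is whatever entry point you deploy; `judge` decides whether a
// response counts as correct for a case (default: naive substring match).
type EvalCase = { input: string; expected: string };

async function evaluate(
  agent: (input: string) => Promise<string>,
  cases: EvalCase[],
  judge: (response: string, expected: string) => boolean = (r, e) =>
    r.toLowerCase().includes(e.toLowerCase())
): Promise<number> {
  let passed = 0;
  for (const c of cases) {
    const response = await agent(c.input);
    if (judge(response, c.expected)) passed++;
  }
  return passed / cases.length; // accuracy between 0 and 1
}
```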
Step 8: Deploy to Production
Deploy your agent as an API endpoint that your application calls:
import { Hono } from "hono";
const app = new Hono();
app.post("/api/agent", async (c) => {
const { message, userId } = await c.req.json();
if (!validateInput(message)) {
return c.json({ error: "Invalid input" }, 400);
}
const memories = await getMemory(userId, "preferences");
const response = await runAgent(message, memories);
return c.json({ response });
});
export default app;
Host on Railway for $5 per month. Set up environment variables for your API keys. Enable auto-deploy from GitHub so every push goes live.
For a production-ready agent that handles real workloads, you will also want request queuing (BullMQ), structured logging, and alerting for failed tool calls.
DIY vs Hire an Agency
Build it yourself when:
- You are comfortable with TypeScript or Python
- Your agent has a narrow scope (3 to 5 tools)
- You want to iterate quickly on prompt engineering
- The agent is internal-facing, where failures are lower stakes
Hire an agency when:
- The agent is customer-facing and needs to be reliable from day one
- You need complex integrations (CRM, ERP, databases)
- HIPAA, SOC 2, or other compliance requirements apply
- You want production-grade guardrails, monitoring, and human-in-the-loop workflows
At HouseofMVPs, we build custom AI agents starting at $3,000 with a 14-day delivery. Each agent includes tool integration, guardrails, monitoring, and deployment.
What Comes After Your First Agent
Once your first agent is working, you will see opportunities to build more. Use the AI Agent ROI Calculator to estimate the time savings before investing in additional agents. Common next steps:
- Add more tools to expand what the agent can do
- Build a second agent for a different workflow
- Connect agents to each other (agent orchestration) — see our multi agent systems guide
- Add a user-facing chat interface
- Build an AI integration that connects the agent to your existing tools
The best agents are the ones that handle the tasks your team does every day but nobody enjoys. Start there.
Build With an AI-Native Agency
Free: 14-Day AI MVP Checklist
The exact checklist we use to ship production-ready MVPs in 2 weeks. Enter your email to download.
AI Agent Architecture Blueprint
A reference diagram covering agent loops, tool schemas, memory layers, and deployment patterns.
Frequently Asked Questions
