Building Autonomous Agents in Azure: A Tool-First Approach
Mon Feb 09 2026
The biggest shift in Gen AI right now is moving from chatbots (passively answering) to agents (actively doing). An agent uses an LLM as a reasoning engine to decide which tools to call.
But deploying agents to production is tricky. Loops can get stuck. State is lost. Tools fail.
Here is our battle-tested architecture for deploying AI Agents on Azure.
1. The Core: Function Calling with OpenAI
Azure OpenAI models (GPT-4o, GPT-3.5-Turbo) support Function Calling. You define a JSON schema for your tools, and the model returns structured JSON arguments instead of text.
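Concretely, a tool definition is just a JSON schema passed alongside the conversation. A minimal sketch of the shape the Chat Completions tools parameter expects (the tool name and fields here mirror the example used later in this post):

```python
# A tool definition as sent to the Azure OpenAI Chat Completions API.
# The model never executes this; it only returns the tool's name
# plus JSON arguments that match the schema.
check_order_tool = {
    "type": "function",
    "function": {
        "name": "check_order_status",
        "description": "Check the shipping status of an order.",
        "parameters": {
            "type": "object",
            "properties": {
                "order_id": {
                    "type": "string",
                    "description": "The unique order ID, e.g. '12345'",
                }
            },
            "required": ["order_id"],
        },
    },
}
```

Because the arguments come back as structured JSON, you can validate them before touching any business logic.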
The Agent Loop
- User: “Check the status of order #12345.”
- LLM: Thought: I need the check_order_status tool. -> Action: {"order_id": "12345"}
- System: Executes check_order_status("12345") -> Returns “Shipped”
- LLM: Observation: Order is shipped. -> Response: “Your order #12345 has shipped!”
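Stripped of the LLM call itself, the loop above is just dispatch-on-tool-call. A minimal sketch with a stubbed model (the fake_llm stub and the TOOLS registry are illustrative; a real implementation calls Azure OpenAI where the stub sits):

```python
import json

def check_order_status(order_id: str) -> str:
    return f"Order {order_id} is SHIPPED via UPS."

TOOLS = {"check_order_status": check_order_status}

def fake_llm(history):
    """Stub: a real implementation calls Azure OpenAI here."""
    if history[-1]["role"] == "user":
        return {"tool_calls": [{"name": "check_order_status",
                                "args": json.dumps({"order_id": "12345"})}]}
    return {"content": "Your order #12345 has shipped!"}

def agent_loop(user_message: str) -> str:
    history = [{"role": "user", "content": user_message}]
    while True:
        response = fake_llm(history)
        if "tool_calls" not in response:
            return response["content"]          # final answer, loop ends
        for call in response["tool_calls"]:
            # Execute the requested tool and feed the observation back
            result = TOOLS[call["name"]](**json.loads(call["args"]))
            history.append({"role": "tool", "content": result})
```

The whole "agent" is that while loop: call the model, run whatever it asked for, append the observation, repeat until it answers in plain text.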
2. Defining Tools in Python
We encapsulate business logic as plain Python functions, with Pydantic models validating the arguments.
from langchain.tools import tool
from pydantic import BaseModel, Field

class CheckOrderInput(BaseModel):
    order_id: str = Field(description="The unique order ID (e.g., #12345)")

@tool("check_order_status", args_schema=CheckOrderInput)
def check_order_status(order_id: str) -> str:
    """Check the shipping status of an order."""
    # Simulate the downstream API call
    return f"Order {order_id} is SHIPPED via UPS."
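What args_schema buys you is validation before execution. Without LangChain, the same guard can be sketched with the standard library: parse the model's JSON arguments and fail fast on bad input (the run_tool helper is illustrative):

```python
import json

def check_order_status(order_id: str) -> str:
    """Check the shipping status of an order."""
    return f"Order {order_id} is SHIPPED via UPS."

def run_tool(raw_args: str) -> str:
    """Validate the model's JSON arguments before touching business logic."""
    args = json.loads(raw_args)
    order_id = args.get("order_id")
    if not isinstance(order_id, str) or not order_id:
        raise ValueError("order_id must be a non-empty string")
    return check_order_status(order_id)
```

Rejecting malformed arguments at this boundary keeps hallucinated parameters out of your downstream APIs.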
3. Hosting Agents on Azure Durable Functions
An agent conversation can last minutes or hours. HTTP timeouts kill standard Azure Functions.
Use Durable Functions (Orchestrator Pattern):
- Stateful: Remembers conversation history automatically.
- Resilient: If the function crashes, it replays from last checkpoint.
- Async: Can wait for long-running tool execution (e.g., “Generate Report”).
Architecture Pattern
import azure.durable_functions as df

def orchestrator_function(context: df.DurableOrchestrationContext):
    history = context.get_input()

    # Call the LLM via an Activity Function
    agent_response = yield context.call_activity("CallLLM", history)

    if agent_response.get("tool_calls"):
        # Parallel tool execution (fan-out/fan-in)
        tasks = []
        for tool_call in agent_response["tool_calls"]:
            tasks.append(context.call_activity(tool_call["name"], tool_call["args"]))
        tool_outputs = yield context.task_all(tasks)

        # Append observations to history and restart the orchestration (the agent loop)
        history.append({"role": "tool", "content": tool_outputs})
        context.continue_as_new(history)
        return

    # No tool calls: the agent produced its final answer
    return agent_response["content"]

main = df.Orchestrator.create(orchestrator_function)
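The orchestrator is a generator: the Durable runtime drives it by sending each activity's result back into the yield, which is what makes checkpointing and replay possible. The behavior can be simulated with a plain generator driver. Everything below is a stdlib illustration of that mechanic, not the Durable Functions API:

```python
def orchestrator_function(context):
    history = context.get_input()
    agent_response = yield context.call_activity("CallLLM", history)
    if agent_response.get("tool_calls"):
        tasks = [context.call_activity(c["name"], c["args"])
                 for c in agent_response["tool_calls"]]
        tool_outputs = yield context.task_all(tasks)
        history.append({"role": "tool", "content": tool_outputs})
        context.continue_as_new(history)
        return
    return agent_response["content"]

class FakeContext:
    """Stand-in for DurableOrchestrationContext (illustration only)."""
    def __init__(self, history):
        self._history = history
        self.restarted_with = None
    def get_input(self):
        return self._history
    def call_activity(self, name, payload):
        return (name, payload)          # a token the driver resolves
    def task_all(self, tasks):
        return list(tasks)
    def continue_as_new(self, new_input):
        self.restarted_with = new_input

def drive(context, results):
    """Feed each yielded task's result back into the generator."""
    gen = orchestrator_function(context)
    try:
        gen.send(None)                  # run to the first yield
        for result in results:
            gen.send(result)
    except StopIteration as done:
        return done.value
```

The real runtime does the same thing, except it persists every result, so a crash simply replays the generator and re-feeds the recorded results up to the last checkpoint.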
4. Human-in-the-Loop (Approval)
Crucial for enterprise agents. Before taking a high-risk action (e.g., “Refund User”), the agent must pause.
Durable Functions Implementation:
if tool_name == "refund_user":
    # Pause until a client raises the "ManagerApproval" event for this instance
    yield context.wait_for_external_event("ManagerApproval")
The orchestration pauses, for hours or days if necessary, until a manager clicks “Approve” in a dashboard.
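Outside Durable Functions, the pattern is simply "block until an external signal arrives". A stdlib analogy using threading.Event (illustrative only; in Durable Functions the wait is wait_for_external_event and a client raises the event):

```python
import threading

approval = threading.Event()  # stands in for the "ManagerApproval" external event

def refund_user(order_id: str) -> str:
    # In Durable Functions this wait is
    # `yield context.wait_for_external_event("ManagerApproval")`.
    approval.wait()
    return f"Refunded order {order_id}."

def manager_clicks_approve():
    # In Durable Functions, the approval dashboard raises the event
    # against the paused orchestration instance.
    approval.set()
```

The key difference in production: a thread blocked on an Event holds resources, while a paused Durable orchestration is checkpointed and consumes nothing while it waits.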
5. Security Guardrails
Agents are unpredictable.
- Never allow a "drop table" tool. Read-only tools by default.
- Limit loops: set max_iterations=5 to prevent infinite loops from burning tokens.
- System prompt: always instruct the agent what it cannot do.
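The iteration cap is easy to enforce in the agent loop itself. A minimal sketch (the llm callable and the fallback message are illustrative):

```python
def run_agent(llm, history, max_iterations=5):
    """Cap the think/act loop so a stuck agent cannot burn tokens forever."""
    for _ in range(max_iterations):
        response = llm(history)
        if "tool_calls" not in response:
            return response["content"]  # final answer
        # execute the requested tools here, then record the observation
        history.append({"role": "tool", "content": "observation"})
    return "Stopped: iteration limit reached."
```

Returning an explicit "gave up" message is deliberate: it surfaces runaway loops in your logs instead of silently retrying.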
Conclusion
Building a production agent combines DevOps (state management, retries) with prompt engineering. By using Azure Durable Functions, we solve the hardest parts of agent development: reliability and long-running state.