
OpenAI Assistants API: Build a Custom AI Assistant That Knows Your Business



The raw OpenAI API is powerful but stateless — every call is independent, you manage conversation history manually, and there is no built-in way to give the model persistent access to your documents or tools. The Assistants API solves this at the platform level.

An Assistant is a persistent AI configuration with its own instructions, model selection, tools, and knowledge base. Conversations happen in Threads that automatically manage context. The assistant can search uploaded documents, run code, and call external functions — without you rebuilding these capabilities from scratch for every application.

🔗 This is Post #10 in the ChatGPT Unlocked series. Requires familiarity with the OpenAI API (Post #9). For no-code versions of assistants, see Custom GPTs (Post #11).


The Core Concepts

Assistants

A persistent AI configuration with: instructions (system prompt), model, tools, and an optional knowledge base (file search index). Created once, reused across many conversations.

Threads

A conversation container. Each user gets their own thread. The thread stores the full message history and handles the context window for you: when a conversation outgrows the model's context, older messages are truncated automatically so the request still fits.

Messages

Individual user or assistant messages within a thread. You add user messages to a thread and retrieve assistant responses from it.

Runs

A Run is the execution of an assistant on a thread — this is when the model actually processes the conversation and generates a response. Runs can involve multiple steps: searching files, executing code, calling functions.


Creating Your First Assistant

from openai import OpenAI

client = OpenAI()

# Create an assistant — do this once, save the ID
assistant = client.beta.assistants.create(
    name="Customer Support Assistant",
    instructions="""You are a helpful customer support assistant for Acme Software.

Your job:
- Answer questions about our products using the knowledge base
- Help troubleshoot common issues
- Escalate complex technical problems by saying "I'll connect you with our technical team"
- Stay focused on support topics — politely redirect off-topic questions

Tone: Friendly, efficient, and clear. No jargon unless the user uses it first.""",
    model="gpt-5.4",
    tools=[{"type": "file_search"}]  # Enable knowledge base search
)

print(f"Assistant ID: {assistant.id}")
# Save this ID — you will reuse it, not recreate the assistant

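# Create a vector store (the knowledge base)
# Note: recent openai-python releases expose vector stores as client.vector_stores
# (without .beta); use whichever namespace your installed SDK version provides.
vector_store = client.beta.vector_stores.create(
    name="Support Documentation"
)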

# Upload your documentation files
from contextlib import ExitStack

doc_files = [
    "docs/getting_started.pdf",
    "docs/faq.pdf",
    "docs/troubleshooting.pdf",
    "docs/api_reference.pdf"
]

# Open every file, upload and index the batch, then close the handles
with ExitStack() as stack:
    file_streams = [stack.enter_context(open(path, "rb")) for path in doc_files]
    file_batch = client.beta.vector_stores.file_batches.upload_and_poll(
        vector_store_id=vector_store.id,
        files=file_streams
    )

print(f"Files indexed: {file_batch.file_counts.completed}")

# Attach the vector store to your assistant
client.beta.assistants.update(
    assistant.id,
    tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}}
)

Now when users ask questions, the assistant automatically searches the uploaded documents and grounds its answers in the actual content — not just its training data.
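
File search answers usually carry citation annotations that point back at the source files. Here is a minimal sketch of separating the answer text from its citations. It assumes the message's text block exposes `value` and an `annotations` list whose `file_citation` entries carry an inline `text` marker and a `file_citation.file_id`, which is worth confirming against your SDK version. The helper name `split_citations` and the demo values are hypothetical.

```python
from types import SimpleNamespace


def split_citations(text_block):
    """Return (clean_text, cited_file_ids) for one text content block.

    Assumes `text_block` has `.value` (str) and `.annotations` (list), where
    file search citations have `.type == "file_citation"`, a `.text` marker
    that appears inline in the answer, and `.file_citation.file_id`.
    """
    clean_text = text_block.value
    cited_file_ids = []
    for annotation in text_block.annotations:
        if annotation.type != "file_citation":
            continue  # skip other annotation kinds
        # Remove the inline marker so the user sees plain prose
        clean_text = clean_text.replace(annotation.text, "")
        file_id = annotation.file_citation.file_id
        if file_id not in cited_file_ids:
            cited_file_ids.append(file_id)
    return clean_text.strip(), cited_file_ids


# Stand-in object shaped like a message text block, for a local dry run
demo_block = SimpleNamespace(
    value="Open Settings, then choose Export. [1]",
    annotations=[
        SimpleNamespace(
            type="file_citation",
            text="[1]",
            file_citation=SimpleNamespace(file_id="file-demo123"),
        )
    ],
)
demo_text, demo_sources = split_citations(demo_block)
```

With a real response you would pass the text block of an assistant message, for example `messages.data[0].content[0].text`, and show the cited file IDs (or their names) next to the answer.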


Managing Conversations with Threads

def chat_with_assistant(assistant_id: str, user_message: str, thread_id: str | None = None):
    """
    Send a message to an assistant. Creates a new thread if none provided.
    Returns the response and thread_id (for continuing the conversation).
    """
    
    # Create or reuse a thread
    if thread_id is None:
        thread = client.beta.threads.create()
        thread_id = thread.id
    
    # Add the user's message
    client.beta.threads.messages.create(
        thread_id=thread_id,
        role="user",
        content=user_message
    )
    
    # Run the assistant and wait for completion
    run = client.beta.threads.runs.create_and_poll(
        thread_id=thread_id,
        assistant_id=assistant_id
    )
    
    if run.status == "completed":
        # Get the latest assistant message
        messages = client.beta.threads.messages.list(thread_id=thread_id)
        response = messages.data[0].content[0].text.value
        return response, thread_id
    else:
        return f"Run failed: {run.status}", thread_id

# Start a conversation
response, thread_id = chat_with_assistant(
    assistant.id,
    "How do I export my data from the dashboard?"
)
print(f"Assistant: {response}")

# Continue the same conversation (thread_id is reused)
response, thread_id = chat_with_assistant(
    assistant.id,
    "What format does the export come in?",
    thread_id=thread_id
)
print(f"Assistant: {response}")
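
Threads manage context for you, but long conversations still increase the tokens processed on each run. The sketch below builds run arguments that cap how much history is sent. It assumes the run endpoint accepts `truncation_strategy` and `max_prompt_tokens` as documented for Assistants runs, so check the current reference before relying on it. The helper name `capped_run_kwargs` and the demo IDs are hypothetical.

```python
def capped_run_kwargs(thread_id, assistant_id, last_messages=10, max_prompt_tokens=None):
    """Build keyword arguments for a run that limits how much history is sent."""
    kwargs = {
        "thread_id": thread_id,
        "assistant_id": assistant_id,
        # Only the most recent N messages of the thread are used for this run
        "truncation_strategy": {"type": "last_messages", "last_messages": last_messages},
    }
    if max_prompt_tokens is not None:
        # Optional ceiling on prompt tokens consumed over the whole run
        kwargs["max_prompt_tokens"] = max_prompt_tokens
    return kwargs


demo_kwargs = capped_run_kwargs("thread_demo", "asst_demo", last_messages=6)

# Usage with the client defined earlier (not executed here):
# run = client.beta.threads.runs.create_and_poll(**demo_kwargs)
```

Keeping the arguments in a plain dict makes the cap easy to log and adjust per user or per plan.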

Adding Code Interpreter

Code Interpreter lets the assistant run Python to analyze data, generate charts, and perform calculations:

# Update assistant to include Code Interpreter
client.beta.assistants.update(
    assistant.id,
    tools=[
        {"type": "file_search"},
        {"type": "code_interpreter"}
    ]
)

# Now you can upload data files and ask analytical questions
with open("sales_data.csv", "rb") as f:
    uploaded_file = client.files.create(file=f, purpose="assistants")

# Include the file in your thread message
client.beta.threads.messages.create(
    thread_id=thread_id,
    role="user",
    content="Analyze this sales data and create a chart of monthly revenue trend.",
    attachments=[{
        "file_id": uploaded_file.id,
        "tools": [{"type": "code_interpreter"}]
    }]
)
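
After a run on this thread completes, a Code Interpreter reply can mix text with generated images, so reading only `content[0].text` may miss the chart. This is a minimal sketch of walking the content blocks. It assumes blocks have a `type` of `"text"` or `"image_file"`, with the image's ID under `image_file.file_id`. The helper name `collect_outputs` and the demo values are hypothetical.

```python
from types import SimpleNamespace


def collect_outputs(message):
    """Split a message's content blocks into text parts and image file IDs."""
    texts, image_file_ids = [], []
    for block in message.content:
        if block.type == "text":
            texts.append(block.text.value)
        elif block.type == "image_file":
            image_file_ids.append(block.image_file.file_id)
    return texts, image_file_ids


# Stand-in object shaped like an assistant message, for a local dry run
demo_message = SimpleNamespace(
    content=[
        SimpleNamespace(type="image_file", image_file=SimpleNamespace(file_id="file-chart1")),
        SimpleNamespace(type="text", text=SimpleNamespace(value="Revenue rose each month.")),
    ]
)
demo_texts, demo_images = collect_outputs(demo_message)

# Downloading a generated chart with the client defined earlier (not executed here):
# image_bytes = client.files.content(demo_images[0]).read()
```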

Function Calling in Assistants

Give the assistant the ability to call your own APIs or perform real actions:

# Define functions the assistant can call
tools_with_functions = [
    {"type": "file_search"},
    {
        "type": "function",
        "function": {
            "name": "create_support_ticket",
            "description": "Create a support ticket in the help desk system when a user has an issue that cannot be resolved immediately.",
            "parameters": {
                "type": "object",
                "properties": {
                    "user_email": {"type": "string", "description": "User's email address"},
                    "issue_summary": {"type": "string", "description": "Brief description of the issue"},
                    "priority": {"type": "string", "enum": ["low", "medium", "high", "urgent"]}
                },
                "required": ["user_email", "issue_summary", "priority"]
            }
        }
    }
]

client.beta.assistants.update(assistant.id, tools=tools_with_functions)

When the assistant calls this function, you receive the call in the run response and execute the actual ticket creation in your system.
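
A run that wants to call your function pauses with the status `requires_action`. The sketch below shows one way to turn the pending tool calls into the outputs you send back. It assumes each tool call exposes `id`, `function.name`, and `function.arguments` (a JSON string). The ticket function is a hypothetical stand-in for your real help desk integration, and `build_tool_outputs` is an illustrative helper name.

```python
import json
from types import SimpleNamespace


def create_support_ticket(user_email, issue_summary, priority):
    """Hypothetical stand-in: replace with a call to your help desk system."""
    return {"ticket_id": "TICKET-DEMO", "priority": priority}


# Map function names declared on the assistant to local Python callables
HANDLERS = {"create_support_ticket": create_support_ticket}


def build_tool_outputs(tool_calls, handlers=HANDLERS):
    """Execute each requested function and format the results for submission."""
    outputs = []
    for call in tool_calls:
        handler = handlers.get(call.function.name)
        if handler is None:
            result = {"error": f"unknown function {call.function.name}"}
        else:
            arguments = json.loads(call.function.arguments)
            result = handler(**arguments)
        # Each output is matched to its request through tool_call_id
        outputs.append({"tool_call_id": call.id, "output": json.dumps(result)})
    return outputs


# Stand-in tool call, shaped like an entry in the run's pending tool calls
demo_call = SimpleNamespace(
    id="call_demo1",
    function=SimpleNamespace(
        name="create_support_ticket",
        arguments=json.dumps(
            {"user_email": "a@example.com", "issue_summary": "Export fails", "priority": "high"}
        ),
    ),
)
demo_outputs = build_tool_outputs([demo_call])

# Usage with the client defined earlier (not executed here):
# if run.status == "requires_action":
#     calls = run.required_action.submit_tool_outputs.tool_calls
#     run = client.beta.threads.runs.submit_tool_outputs_and_poll(
#         thread_id=thread_id, run_id=run.id, tool_outputs=build_tool_outputs(calls)
#     )
```

Returning an error payload for unknown function names, rather than raising, lets the assistant explain the problem to the user instead of leaving the run stuck.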


Production Considerations

Thread lifecycle: Threads persist in OpenAI’s system. For production, map your user IDs to thread IDs in your database. Delete threads when users request data deletion.
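
One way to keep the user-to-thread mapping is a small table in your own database. The sketch below uses SQLite from the standard library. The table name, class name, and demo IDs are all hypothetical, and a production system would use whatever database it already has.

```python
import sqlite3


class ThreadRegistry:
    """Stores which thread belongs to which user."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS user_threads ("
            "user_id TEXT PRIMARY KEY, thread_id TEXT NOT NULL)"
        )

    def get(self, user_id):
        """Return the user's thread ID, or None if they have no thread yet."""
        row = self.conn.execute(
            "SELECT thread_id FROM user_threads WHERE user_id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None

    def save(self, user_id, thread_id):
        with self.conn:  # commits on success, rolls back on error
            self.conn.execute(
                "INSERT OR REPLACE INTO user_threads (user_id, thread_id) VALUES (?, ?)",
                (user_id, thread_id),
            )

    def forget(self, user_id):
        """Remove the mapping and return the thread ID so it can be deleted remotely."""
        thread_id = self.get(user_id)
        with self.conn:
            self.conn.execute("DELETE FROM user_threads WHERE user_id = ?", (user_id,))
        return thread_id


registry = ThreadRegistry()
registry.save("user-42", "thread_demo")
demo_lookup = registry.get("user-42")
demo_removed = registry.forget("user-42")

# On a deletion request, also remove the thread on OpenAI's side (not executed here):
# client.beta.threads.delete(demo_removed)
```

Pass `registry.get(user_id)` as the `thread_id` argument of `chat_with_assistant`, and call `registry.save` with the returned thread ID after the first message.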

Cost management: Each run costs tokens for the messages processed. Long threads with file search can be expensive — monitor usage in the Platform dashboard.
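
A completed run reports its token usage, which you can log per conversation. The sketch below turns that usage into a rough cost estimate. It assumes the run's `usage` object exposes `prompt_tokens` and `completion_tokens`. The prices are placeholder numbers for illustration only, not real rates, and the estimate ignores vector store storage fees.

```python
from types import SimpleNamespace

# HYPOTHETICAL prices in USD per 1M tokens; replace with the current published rates
PRICE_PER_MILLION = {"input": 2.00, "output": 8.00}


def estimate_run_cost(usage, prices=PRICE_PER_MILLION):
    """Estimate the token cost of one run in USD."""
    if usage is None:  # usage may be missing while a run is still in progress
        return 0.0
    input_cost = usage.prompt_tokens / 1_000_000 * prices["input"]
    output_cost = usage.completion_tokens / 1_000_000 * prices["output"]
    return round(input_cost + output_cost, 6)


# Stand-in for run.usage, for a local dry run
demo_usage = SimpleNamespace(prompt_tokens=12_000, completion_tokens=500, total_tokens=12_500)
demo_cost = estimate_run_cost(demo_usage)
```

Logging this figure alongside the thread ID makes it easier to spot the long threads that drive most of the spend.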

Rate limits: The Assistants API has rate limits separate from the completions API. Check platform.openai.com/docs for current limits.

Error handling: Runs can fail or require action (for function calls). Always handle non-completed run statuses.
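
A minimal sketch of deciding what to do with a run once polling returns, assuming the run exposes `status` and, for failures, a `last_error` with a `code`. The action names, the retry policy, and the set of retryable codes are illustrative choices, not part of the API.

```python
from types import SimpleNamespace

# Assumed error codes that are worth retrying; adjust to what you observe
RETRYABLE_ERROR_CODES = {"server_error", "rate_limit_exceeded"}


def next_action(run):
    """Map a run to one of: read_messages, submit_tool_outputs, retry, give_up."""
    if run.status == "completed":
        return "read_messages"
    if run.status == "requires_action":
        return "submit_tool_outputs"  # the assistant is waiting on a function result
    if run.status == "failed":
        error_code = run.last_error.code if run.last_error else None
        return "retry" if error_code in RETRYABLE_ERROR_CODES else "give_up"
    if run.status == "expired":
        return "retry"  # the run timed out before finishing
    return "give_up"  # cancelled, incomplete, or any status not handled above


demo_failed = SimpleNamespace(
    status="failed", last_error=SimpleNamespace(code="rate_limit_exceeded")
)
demo_action = next_action(demo_failed)
```

Returning a small set of named actions keeps the calling code to a single branch per outcome, and any status you have not planned for falls through to `give_up` rather than being treated as success.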


When to Use Assistants API vs. Raw Completions

| Use Assistants API | Use Raw Completions |
| --- | --- |
| Need a persistent knowledge base | Simple one-off completions |
| Multi-turn conversations with the same user | Stateless transformations |
| Need a built-in code interpreter | Custom context management |
| Building a user-facing chat product | Batch processing |
| Want managed context windows | Fine-grained conversation control |

Conclusion

The Assistants API handles the scaffolding that makes AI applications real: persistent knowledge bases, conversation management, tool integration. It is the right foundation for user-facing applications where you need more than a stateless completion.

Your next step: Create an assistant using the code above with your instructions. Upload one document. Run a test conversation. The setup from zero to working assistant takes under an hour.



Last updated: May 2026. Verify current Assistants API features at platform.openai.com/docs.

Frequently Asked Questions (FAQ)

How much does the Assistants API cost?
You pay standard model rates for the tokens each run processes, plus storage fees for vector stores (file search). Check platform.openai.com/pricing for current rates.

Can I use GPT-5.5 with the Assistants API?
Yes. Specify `model="gpt-5.5"` when creating or updating your assistant.

Is the Assistants API production-ready?
It can be used in production, but note that the Python SDK calls in this post still go through the `beta` namespace (`client.beta.assistants`). Check the documentation for the API's current status, rate limits, and SLA details before building production applications.
