Signpost AI Agent Architecture, Infrastructure and Workflow
This document provides a guide to the infrastructure and architecture of the Signpost AI agent technology. (Note: the terms agent, chatbot, and bot are used interchangeably in this document.)
This agent technology leverages RAG (Retrieval-Augmented Generation) to provide client-centered, safe, and PFA (Psychological First Aid)-informed information to clients in need. It is a robust and flexible tool which can aid humanitarian efforts by retrieving relevant information, assessing it for relevance, safety and ethical considerations, and presenting it to a user in a context-appropriate manner.
This documentation will detail three main parts:
Core Components
Architecture
Workflow
You can access the source code as well as relevant code documentation here.
Core Components
The Signpost AI agent ecosystem is built around a RAG pipeline that leverages the power of Signpost's vetted, user-needs-based information. It is made up of the following interconnected components, which together create a cohesive system that (a) enables the agents to function effectively, (b) uses Signpost's articles and users' self-expressed information needs to answer their requests, and (c) provides a foundational blueprint for further scaling and improvement.
These components are:
MySQL Database: MySQL is a reliable, open-source relational database management system (RDBMS) which uses linked tables to organize and structure data and to create relationships between data points. In the Signpost AI context, the MySQL database stores the configurations for the various context-specific agents
Node.js Server: Node.js is a JavaScript runtime environment which, in Signpost AI's case, manages the infrastructure by providing HTTP support for REST API requests. It runs inside a container hosted on Azure App Service
Azure Kubernetes Service (AKS): Kubernetes offers simplified deployment, management, and scaling of containerized applications in the Azure cloud. For Signpost AI, it hosts two containers:
Weaviate Vector Database: Weaviate is an open-source, AI-native vector database which Signpost AI uses to facilitate data ingestion, the creation and storage of vector embeddings, and semantic search
Ollama: Ollama is an open-source, on-device AI platform which Signpost AI uses to run large language models (LLMs) locally. It is mostly used for testing and proofs of concept
Directus (Content Management System): Directus is an open-source headless CMS which Signpost AI uses to provide a user-friendly interface for creating and maintaining agents. It is also containerized on Azure App Service. Directus allows the creation of agents with multiple configurable characteristics. Examples of items which can be modified include:
Title: Identifying agent name
Type: Type of knowledge base the agent uses
LLM: Which LLM the agent is connected to
Temperature: Parameter which controls how random/creative the LLM response should be (between 0 and 1, with 1 being fully random or “creative”)
System Prompts: Also known as model prompts or instruction prompts; a list of items which will be added to the prompt
Prompt: A local, context-specific list of items appended to the prompt
Vectorless: A toggle; when enabled, the vector database is not queried
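For illustration, an agent configuration could be modeled with a TypeScript interface along these lines. This is a minimal sketch; the field names are illustrative assumptions, not the actual Signpost/Directus schema:

```typescript
// Hypothetical shape of an agent configuration; field names are illustrative
// assumptions, not the actual Signpost/Directus schema.
interface AgentConfig {
  title: string;                        // identifying agent name
  knowledgeBaseType: string;            // type of knowledge base the agent uses
  llm: "openai" | "gemini" | "claude";  // which LLM the agent is connected to
  temperature: number;                  // 0-1; 1 is fully random or "creative"
  systemPrompts: string[];              // model/instruction prompts added to the prompt
  prompts: string[];                    // local, context-specific items appended to the prompt
  vectorless: boolean;                  // when true, the vector database is not queried
}
```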
All of these components are highly interconnected and share two main features:
Data Flow: Data follows a logical, set path. Directus manages agent configurations, which are stored in the MySQL database. These configurations flow from the MySQL database to the Node.js server, which retrieves and applies them. The server then interacts with the AKS-hosted containers to perform vector searches and generate responses
Deployment: All components except the MySQL database are containerized. This ensures flexibility and scalability, as the majority of components can be managed separately and scaled if needed
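In code terms, the path a single request takes through these components could be sketched as follows. All function names here are hypothetical placeholders, not the actual Signpost API; the sketch assumes the AgentConfig shape above:

```typescript
// Illustrative request path through the components; all function names are
// hypothetical stand-ins, not the actual Signpost API.
declare function loadConfigFromMySQL(botId: number): Promise<AgentConfig>;
declare function searchWeaviate(message: string, kbType: string): Promise<string[]>;
declare function generateResponse(config: AgentConfig, message: string, context: string[]): Promise<string>;

async function handleRequest(botId: number, userMessage: string): Promise<string> {
  // 1. The Node.js server loads the agent configuration that Directus stored in MySQL
  const config = await loadConfigFromMySQL(botId);

  // 2. Unless the agent is "vectorless", it queries the Weaviate container on AKS
  const context = config.vectorless
    ? []
    : await searchWeaviate(userMessage, config.knowledgeBaseType);

  // 3. It then generates a response with the configured LLM
  return generateResponse(config, userMessage, context);
}
```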
These interconnections can be roughly diagrammed as follows:
Architecture
The server, using Express, acts as the intermediary: it manages HTTP requests and provides endpoints for AI model interactions and agent/bot management.
It is worth looking selectively and briefly at the roles of three TypeScript files, ai.ts, bot.ts and index.ts, which together provide the core functionality of the chatbot. The full repository, including these files and other supporting files, can be found here.
index.ts: this file sets up the web server for routing, managing and handling requests, and provides endpoints for bot management and AI interactions. When a request is made, it invokes the bot class to process it
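A minimal sketch of this setup, assuming Express and a Bot class with a respond method (the route path, class name, and method are illustrative assumptions, not the actual endpoints):

```typescript
import express from "express";
import { Bot } from "./bot"; // the chatbot class described below; names are illustrative

const app = express();
app.use(express.json());

// Illustrative endpoint; the real route paths and payloads may differ.
app.post("/bots/:id/message", async (req, res) => {
  const bot = new Bot(Number(req.params.id));         // instantiate the agent by id
  const answer = await bot.respond(req.body.message); // let the bot process the request
  res.json({ answer });
});

app.listen(3000, () => console.log("Signpost AI server listening on port 3000"));
```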
ai.ts: schematizes an object that allows the agent to interact with LLMs from different providers, including OpenAI, Gemini and Claude. It also provides functions to send prompts and receive responses.
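Conceptually, the object might expose an interface along these lines. This is a sketch only; the actual shape of ai.ts may differ:

```typescript
// Hypothetical provider-agnostic interface in the spirit of ai.ts; the actual
// object's shape may differ.
type Provider = "openai" | "gemini" | "claude";

interface AIClient {
  // Sends a prompt to the given provider and resolves with the text response.
  send(provider: Provider, prompt: string, temperature: number): Promise<string>;
}
```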
bot.ts: This is the core agent file. It defines a class “bot”, representing the chatbot, which processes and responds to user requests by using the aforementioned ai object to interact with different AI models. The bot's workflow can be broken down into the following steps (a rough sketch follows the list):
Invoking the router to identify characteristics of the request (e.g. whether it is a request for contact information or a general question)
Knowledge Base Search
Invoking LLMs by using ai.ts
Constitutional AI checks on LLM response
Response formatting
Logging information
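A hedged sketch of this flow, reusing the hypothetical AgentConfig and AIClient shapes from earlier (the class and method names are illustrative, not the actual bot.ts implementation):

```typescript
// Rough sketch of the flow in bot.ts; class and method names are illustrative.
abstract class ChatBot {
  constructor(protected config: AgentConfig, protected ai: AIClient) {}

  async respond(message: string): Promise<string> {
    const route = await this.invokeRouter(message);            // 1. classify the request
    if (route.wantsContactInfo) return this.contactChannels(); //    contact-info shortcut
    const context = this.config.vectorless
      ? []
      : await this.searchKnowledgeBase(route.searchTerms);     // 2. knowledge base search
    const prompt = [...this.config.systemPrompts, ...context, message].join("\n\n");
    const draft = await this.ai.send(this.config.llm, prompt, this.config.temperature); // 3. invoke the LLM via ai.ts
    const checked = await this.applyConstitution(draft);       // 4. constitutional AI checks
    const formatted = this.format(checked);                    // 5. response formatting
    this.log(message, formatted);                              // 6. logging
    return formatted;
  }

  protected abstract invokeRouter(m: string): Promise<{ wantsContactInfo: boolean; searchTerms: string[] }>;
  protected abstract contactChannels(): string;
  protected abstract searchKnowledgeBase(terms: string[]): Promise<string[]>;
  protected abstract applyConstitution(draft: string): Promise<string>;
  protected abstract format(answer: string): string;
  protected abstract log(question: string, answer: string): void;
}
```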
Signpost AI Workflow
Based on the Signpost AI agent infrastructure and architecture described above, the agent execution process and workflow is as follows (you can read about this topic at a slightly higher level in our Signpost AI Chatbot Non-Technical Explainer):
Request Made: User sends a message
Retrieval of Bot Configuration: The agent fetches its configuration settings from the database
Prompt Construction: A final prompt is constructed from:
The Default Prompt, which is universally applied
Bot-specific System Prompts
Country-specific Local Prompts
Constitution Rules: At the same time, a finalized list of Constitutional Rules is compiled from the Default Constitutional Rules and any bot-specific Constitutional Rules (both steps are sketched below)
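For illustration, the prompt assembly and rule compilation might look like this (the ordering and function names are assumptions, not the actual implementation):

```typescript
// Illustrative assembly of the final prompt and the constitutional rule list.
function buildPrompt(defaultPrompt: string, systemPrompts: string[], localPrompts: string[]): string {
  // Default prompt first, then bot-specific system prompts, then country-specific local prompts.
  return [defaultPrompt, ...systemPrompts, ...localPrompts].join("\n\n");
}

function compileRules(defaultRules: string[], botRules: string[]): string[] {
  // Default constitutional rules plus any bot-specific additions.
  return [...defaultRules, ...botRules];
}
```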
Invoking the Router: Each bot has an internal AI tool called the router; this tool detects and extracts key characteristics from the user's message:
Whether the user is requesting contact information
Key search terms which will be used to query the vector database
The language the user is writing in
The geographic location of the user
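The router's output could be modeled roughly like this (the field names are assumptions):

```typescript
// Hypothetical shape of the router's output for a user message.
interface RouterResult {
  wantsContactInfo: boolean; // is the user asking for contact information?
  searchTerms: string[];     // key terms used to query the vector database
  language: string;          // language the user is writing in (e.g. "es")
  location?: string;         // user's geographic location, if detectable
}
```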
Contact Information Request: If the router detects a request for contact information, the agent outputs the list of communication channels specified in its configuration. This ensures prompt and direct assistance to the user
Vector Database Query: If the agent is not set to “vectorless” in its configuration, the vector database is queried using the extracted search terms. This search is refined by filtering on the domains, services, and other knowledge bases specified in the agent configuration
Search Results Integration: Results from the vector database query are added to the user message. Depending on the agent configuration, chat history may also be included here (see the sketch below)
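For illustration, a filtered semantic query against Weaviate via its TypeScript client might look like the following. The class name, fields, and filter path are assumptions, not Signpost's actual schema:

```typescript
import weaviate from "weaviate-ts-client";

// Hypothetical class and field names; the actual Signpost schema may differ.
const client = weaviate.client({ scheme: "http", host: "localhost:8080" });

async function searchArticles(searchTerms: string[], domain: string) {
  const result = await client.graphql
    .get()
    .withClassName("Article")                 // knowledge base class
    .withNearText({ concepts: searchTerms })  // semantic search on the extracted terms
    .withWhere({ path: ["domain"], operator: "Equal", valueText: domain }) // filter by domain
    .withLimit(5)
    .withFields("title body")
    .do();
  return result.data.Get.Article;             // matches to append to the user message
}
```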
Sent to LLMs: The prompts, search result context, and chat history (if applicable) are concatenated and sent to the LLM provider specified in the configuration, using the specified model
Constitutional Checks: After the LLM response is generated, the agent applies the constitutional rules one by one, iteratively evaluating and modifying the answer according to each Constitution rule (sketched below)
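A minimal sketch of this loop, reusing the hypothetical AIClient and AgentConfig from earlier (the critique prompt wording is an assumption):

```typescript
// Illustrative iterative constitutional check; the critique prompt wording is hypothetical.
async function applyConstitutionalRules(
  ai: AIClient,
  config: AgentConfig,
  rules: string[],
  draft: string
): Promise<string> {
  let answer = draft;
  for (const rule of rules) {
    // Evaluate the current answer against one rule and revise it if needed.
    const critique =
      `Rule: ${rule}\n\nAnswer: ${answer}\n\n` +
      `If the answer violates the rule, rewrite it to comply; otherwise return it unchanged.`;
    answer = await ai.send(config.llm, critique, 0);
  }
  return answer;
}
```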
Final Answer: This constitutionally checked result is then output to the user