Building scalable chatbots that can handle real-time data is critical for many businesses today. RASA, an open-source conversational AI framework, is a powerful tool for creating such chatbots.

This guide will walk you through the process of setting up RASA on an Ubuntu 24.04 GPU server, integrating it with real-time data, and deploying it for scalable usage.

Prerequisites

Before diving into the setup, ensure you have the following:

  • An Atlantic.Net Cloud GPU server running Ubuntu 24.04.
  • CUDA Toolkit and cuDNN installed.
  • Root or sudo privileges.

Step 1: Install Python and Additional Dependencies

The default Python version in Ubuntu 24.04 is Python 3.12, which is not yet supported by some of the libraries RASA and HuggingFace depend on. Therefore, we need to install Python 3.10 alongside it.

1. Add the Python Repository
First, add the deadsnakes PPA to your system to access Python 3.10:

add-apt-repository ppa:deadsnakes/ppa

2. Update the Package Index
Update the package index to ensure you have the latest package listings:

apt update -y

3. Install Python 3.10 and Essential Libraries
Install Python 3.10 along with essential libraries.

apt install python3.10 python3.10-venv python3.10-dev -y

4. Create a Virtual Environment
Create a virtual environment to isolate your RASA installation.

python3.10 -m venv rasa-env
source rasa-env/bin/activate

Step 2: Install RASA and Additional Packages

With the virtual environment activated, install RASA and other necessary packages.

pip install rasa
pip install fastapi uvicorn websockets

Step 3: Initialize and Train the RASA Model

1. Initialize a new RASA project.

rasa init --no-prompt

This command creates a new RASA project with default configurations and sample data.
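As a rough sketch (exact contents can vary slightly between RASA versions), the generated project looks like this:

```
actions/         # sample custom action code
config.yml       # NLU pipeline and dialogue policy configuration
credentials.yml  # messaging channel credentials
data/            # sample nlu.yml, rules.yml and stories.yml training data
domain.yml       # intents, responses and actions the bot knows about
endpoints.yml    # action server and tracker store endpoints
tests/           # sample test stories
```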

2. Train the RASA model using the sample data.

rasa train

This command trains the model and saves it in the models directory.

Step 4: Create a FastAPI Server for Real-time Communication

To enable real-time communication with the chatbot, we’ll use FastAPI to create a WebSocket server.

Create a new file named rasa_server.py:

nano rasa_server.py

Add the following code to the file:

from fastapi import FastAPI, WebSocket, WebSocketDisconnect
from rasa.core.agent import Agent
from rasa.utils.endpoints import EndpointConfig

# Load the latest trained model and point custom actions at the
# action server started in Step 6
agent = Agent.load("models", action_endpoint=EndpointConfig(url="http://localhost:5055/webhook"))

app = FastAPI()

# Root endpoint
@app.get("/")
async def root():
    return {"message": "Welcome to the Chatbot Server!"}

@app.websocket("/chat")
async def chat(websocket: WebSocket):
    await websocket.accept()
    try:
        while True:
            data = await websocket.receive_text()
            responses = await agent.handle_text(data)
            # One input can yield several messages, and not every
            # message carries a "text" field (e.g. images or buttons)
            for message in responses:
                if "text" in message:
                    await websocket.send_text(message["text"])
    except WebSocketDisconnect:
        pass  # Client closed the connection

This script sets up a FastAPI server that loads the RASA model and handles WebSocket connections for real-time chat.
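Note that agent.handle_text returns a list of message dictionaries rather than a single string, and not every entry is plain text. A minimal illustration of that shape (the sample payload below is made up for demonstration):

```python
# Illustrative payload in the shape Rasa's handle_text returns:
# a list of dicts that may carry "text", "image", "buttons", etc.
responses = [
    {"recipient_id": "user", "text": "Hey! How are you?"},
    {"recipient_id": "user", "image": "https://example.com/bot.png"},
]

# Keep only the plain-text replies
texts = [m["text"] for m in responses if "text" in m]
print(texts)  # ['Hey! How are you?']
```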

Step 5: Create Custom Actions

Custom actions allow your chatbot to perform specific tasks. Open the actions file that rasa init generated inside the actions directory (rasa run actions looks for the actions package by default):

nano actions/actions.py

Add the following code to define a custom greeting action:

from rasa_sdk import Action, Tracker
from rasa_sdk.executor import CollectingDispatcher

class ActionGreet(Action):
    def name(self) -> str:
        return "action_greet"

    def run(self, dispatcher: CollectingDispatcher, tracker: Tracker, domain: dict) -> list:
        dispatcher.utter_message(text="Hello! How can I assist you today?")
        return []
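Note that RASA only executes actions it knows about: the name returned by name() must be declared in the domain and reached from a rule or story. A minimal sketch of the additions (assuming the default greet intent created by rasa init):

```yaml
# domain.yml — declare the custom action
actions:
  - action_greet

# data/rules.yml — map the greet intent to the action
rules:
  - rule: Greet the user
    steps:
      - intent: greet
      - action: action_greet
```

After updating these files, retrain the model with rasa train so the changes take effect.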

Step 6: Run the RASA Action Server and API

Run the RASA action server in the background:

rasa run actions &

Run the RASA API server:

rasa run --enable-api &

Step 7: Start the FastAPI Server

Start the FastAPI server using Uvicorn:

uvicorn rasa_server:app --host 0.0.0.0 --port 8000 --reload &

This command starts the server on 0.0.0.0:8000. The --reload flag enables auto-reload, which is convenient during development; drop it in production.

Step 8: Install WebSocat for WebSocket Testing

WebSocat is a command-line tool for testing WebSocket connections. Download and install it:

wget https://github.com/vi/websocat/releases/download/v1.14.0/websocat.x86_64-unknown-linux-musl
mv websocat.x86_64-unknown-linux-musl /usr/bin/websocat
chmod 755 /usr/bin/websocat

Step 9: Test the Chatbot Server

1. Test the Root Endpoint
Ensure the FastAPI server is running by testing the root endpoint:

curl http://localhost:8000

You should see the following JSON response:

{"message":"Welcome to the Chatbot Server!"}

The FastAPI server log will also record the request:

INFO:     127.0.0.1:52546 - "GET / HTTP/1.1" 200 OK

2. Test the WebSocket Connection
Use WebSocat to test the WebSocket connection:

websocat ws://localhost:8000/chat

3. Start a conversation with the chatbot:

> Hello
< Hey! How are you? 
> I am fine. What are you doing?
< I am a bot, powered by Rasa.

Conclusion

You have successfully set up a scalable chatbot using RASA on an Ubuntu 24.04 GPU server. The integration with FastAPI enables real-time communication, making the chatbot suitable for various applications. With additional steps like containerization and orchestration, you can further enhance the scalability and reliability of your chatbot in a production environment.