Build a ChatGPT Alternative with Qwen3 + vLLM + Custom UI

Introduction: Your Own AI, No Subscriptions

With Qwen3 and open-source tools like vLLM, it's easy to create a powerful ChatGPT alternative:

  • 100% local or private cloud

  • Full chat history, streaming, markdown

  • Use any Qwen3 model (e.g., 8B, 14B, 32B, or the larger MoE variants)

  • Build with Python, JS, or Gradio

This guide shows how to deploy your own chatbot with:

  • Qwen3 (LLM)

  • vLLM (OpenAI-style API)

  • Simple frontend (HTML/JS or Gradio)


1. Set Up Qwen3 with vLLM

Install vLLM:

bash
pip install vllm

Start the API server:

bash
python -m vllm.entrypoints.openai.api_server \
  --model Qwen/Qwen3-14B \
  --port 8000

(There is no separate flag needed for streaming: the server supports it out of the box whenever a request includes "stream": true.)

Now your local API is running at:

text
http://localhost:8000/v1/chat/completions
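
Before wiring up a UI, you can sanity-check the endpoint from Python. Here is a minimal sketch using only the standard library; the model name and the "your-key" token are placeholders matching the server command above:

```python
# Minimal smoke test for the local vLLM endpoint (stdlib only).
# Assumes the server from the previous step is running on port 8000.
import json
import urllib.request

API_URL = "http://localhost:8000/v1/chat/completions"

def build_payload(prompt, model="Qwen/Qwen3-14B"):
    """Assemble an OpenAI-style chat-completions request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

def ask(prompt):
    """POST the prompt and return the assistant's reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # vLLM accepts any token unless you start it with --api-key
            "Authorization": "Bearer your-key",
        },
    )
    with urllib.request.urlopen(req) as resp:
        data = json.loads(resp.read())
    return data["choices"][0]["message"]["content"]

# Example (requires the server to be running):
# print(ask("Say hello in one sentence."))
```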

2. Create a Basic HTML Chat UI

Save this as index.html:

html
<!DOCTYPE html>
<html>
<head><title>Qwen3 Chat</title></head>
<body>
  <h2>Chat with Qwen3</h2>
  <div id="chat"></div>
  <textarea id="input" rows="4" cols="50"></textarea><br>
  <button onclick="send()">Send</button>
  <script>
    async function send() {
      const input = document.getElementById("input").value;
      const chat = document.getElementById("chat");
      chat.innerHTML += "<p><b>You:</b> " + input + "</p>";
      const res = await fetch("http://localhost:8000/v1/chat/completions", {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "Authorization": "Bearer your-key"
        },
        body: JSON.stringify({
          model: "Qwen/Qwen3-14B",
          messages: [{ role: "user", content: input }]
        })
      });
      const data = await res.json();
      chat.innerHTML += "<p><b>Qwen3:</b> " + data.choices[0].message.content + "</p>";
      document.getElementById("input").value = "";
    }
  </script>
</body>
</html>

✅ Open this in your browser and chat away: no cloud or account needed!


3. Optional: Use Gradio UI for Zero-Code Interface

python
import gradio as gr
from openai import OpenAI

# Point the OpenAI client at the local vLLM server (any non-empty key works
# unless the server was started with --api-key)
client = OpenAI(base_url="http://localhost:8000/v1", api_key="qwen-key")

def chat(message, history):
    messages = [{"role": "system", "content": "You are a helpful assistant."}]
    messages += history  # prior turns, already in {"role", "content"} form
    messages.append({"role": "user", "content": message})
    response = client.chat.completions.create(
        model="Qwen/Qwen3-14B",
        messages=messages,
    )
    return response.choices[0].message.content

gr.ChatInterface(chat, type="messages").launch()

Use this if you'd rather skip HTML/JS entirely: just run the script, and Gradio serves a full chat interface in your browser.
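
The feature list above promises streaming. With vLLM you enable it by adding "stream": true to the request; the server then returns Server-Sent Events, one "data: {...}" JSON chunk per token delta. A small parsing helper, as a sketch (the field names follow the OpenAI streaming chunk format that vLLM emits):

```python
# Parse one line of a streaming (SSE) chat-completions response.
import json

def parse_sse_line(line):
    """Return the text delta carried by one SSE line, or None for
    comments, keep-alives, and the [DONE] sentinel."""
    line = line.strip()
    if not line.startswith("data:"):
        return None
    payload = line[len("data:"):].strip()
    if payload == "[DONE]":  # end-of-stream marker
        return None
    chunk = json.loads(payload)
    delta = chunk["choices"][0]["delta"]
    return delta.get("content")  # may be None for role-only deltas

# Example chunk as the server would send it:
# parse_sse_line('data: {"choices":[{"delta":{"content":"Hel"}}]}')  -> "Hel"
```

Feed each line of the HTTP response through this helper and append the deltas to your chat window as they arrive.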


4. Why This Beats Hosted LLMs

Feature            | ChatGPT                  | Your Qwen3 Bot
-------------------|--------------------------|--------------------
Cost per use       | 💸 Paid subscription     | ✅ Free after setup
Privacy            | ❌ Prompts sent to cloud | ✅ 100% private
Custom fine-tuning | ❌ Not supported         | ✅ Fully supported
Run offline        | ❌ No                    | ✅ Yes (local)
API rate limits    | ❌ Set by the provider   | ✅ Full control

5. Bundle as an App (Optional)

You can bundle your chatbot with:

  • Electron.js (turn the HTML UI into a desktop app)

  • Tauri (lightweight desktop apps; mobile support in Tauri 2.0)

  • Docker (for private cloud deployments)
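
For the Docker route, a minimal Dockerfile might look like the sketch below. The image tag, model name, and port are assumptions to adapt to your setup; the official vllm/vllm-openai image already runs the OpenAI-compatible API server as its entrypoint, so you only pass arguments:

```dockerfile
# Sketch: serve Qwen3 behind vLLM's OpenAI-compatible API in a container.
# Requires an NVIDIA GPU and the NVIDIA Container Toolkit on the host.
FROM vllm/vllm-openai:latest

# The base image's entrypoint starts the API server;
# CMD supplies its arguments.
CMD ["--model", "Qwen/Qwen3-14B", "--port", "8000"]
```

Build with "docker build -t qwen3-chat ." and run with "docker run --gpus all -p 8000:8000 qwen3-chat".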


Conclusion: Your Own AI Assistant, Fully Open

With just:

  • Qwen3

  • vLLM

  • Chat UI (HTML/Gradio)

You've built a ChatGPT-level assistant: no API keys, no monthly fees, and no limits.

✅ Customize, fine-tune, or scale however you want. Qwen3 gives you full control.

