Build a ChatGPT Alternative with Qwen3 + vLLM + Custom UI
Introduction: Your Own AI, No Subscriptions
With Qwen3 and open-source tools like vLLM, it's easy to create a powerful ChatGPT alternative:

- 100% local or private cloud
- Full chat history, streaming, and markdown rendering
- Use any Qwen3 model (8B, 14B, or even 480B)
- Build with Python, JS, or Gradio
This guide shows how to deploy your own chatbot with:

- Qwen3 (the LLM)
- vLLM (OpenAI-compatible API server)
- A simple frontend (HTML/JS or Gradio)
1. Set Up Qwen3 with vLLM
Install vLLM:
```bash
pip install vllm
```
Start the API server:
```bash
python -m vllm.entrypoints.openai.api_server \
  --model Qwen/Qwen3-14B \
  --port 8000
```

Streaming is supported out of the box; clients request it by sending `"stream": true`.
Now your local API is running at:
```
http://localhost:8000/v1/chat/completions
```
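Before wiring up a UI, you can sanity-check the endpoint with any HTTP client. A minimal sketch of the request shape (the helper name is ours; vLLM accepts any bearer token unless the server was started with `--api-key`):

```python
import json

def build_chat_request(prompt, model="Qwen/Qwen3-14B", stream=False):
    """Return (url, headers, JSON body) for an OpenAI-style chat call."""
    url = "http://localhost:8000/v1/chat/completions"
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer your-key",  # placeholder token
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    })
    return url, headers, body
```

Send it with any client, e.g. `curl` or `urllib.request`, once the server above is running.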
2. Create a Basic HTML Chat UI
Save this as index.html:
```html
<!DOCTYPE html>
<html>
<head><title>Qwen3 Chat</title></head>
<body>
  <h2>Chat with Qwen3</h2>
  <div id="chat"></div>
  <textarea id="input" rows="4" cols="50"></textarea><br>
  <button onclick="send()">Send</button>
  <script>
    async function send() {
      const input = document.getElementById("input").value;
      const res = await fetch("http://localhost:8000/v1/chat/completions", {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "Authorization": "Bearer your-key"
        },
        body: JSON.stringify({
          model: "Qwen/Qwen3-14B",
          messages: [{ role: "user", content: input }]
        })
      });
      const data = await res.json();
      const chat = document.getElementById("chat");
      chat.innerHTML += "<p><b>You:</b> " + input + "</p>";
      chat.innerHTML += "<p><b>Qwen3:</b> " + data.choices[0].message.content + "</p>";
    }
  </script>
</body>
</html>
```
✅ Open this in your browser and chat away. No cloud or account needed!
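One caveat: the page above injects raw strings with `innerHTML`, so any `<` or `&` in your prompt or the model's reply will be parsed as markup. A small escaping helper fixes that (the function name is ours, not part of any library):

```javascript
// Escape the HTML-significant characters so user and model text
// renders literally instead of being interpreted as markup.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Usage inside send():
//   chat.innerHTML += "<p><b>You:</b> " + escapeHtml(input) + "</p>";
```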
3. Optional: Use Gradio UI for Zero-Code Interface
```python
import gradio as gr
from openai import OpenAI

# Point the client at the local vLLM server; the key is a placeholder.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="qwen-key")

def chat(message, history):
    messages = [{"role": "system", "content": "You are a helpful assistant."}]
    for user, bot in history:
        messages.append({"role": "user", "content": user})
        messages.append({"role": "assistant", "content": bot})
    messages.append({"role": "user", "content": message})
    response = client.chat.completions.create(
        model="Qwen/Qwen3-14B",
        messages=messages,
    )
    # gr.ChatInterface expects the reply string; it manages history itself.
    return response.choices[0].message.content

gr.ChatInterface(chat).launch()
```
Use this if you prefer to skip HTML/JS entirely. Just run the script and open the printed link in your browser.
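The history-to-messages conversion is worth factoring out if you extend the UI (custom system prompts, history trimming, etc.). A standalone sketch, with a function name of our choosing:

```python
def history_to_messages(history, system_prompt="You are a helpful assistant."):
    """Convert Gradio-style (user, bot) pairs into OpenAI chat messages."""
    messages = [{"role": "system", "content": system_prompt}]
    for user, bot in history:
        messages.append({"role": "user", "content": user})
        messages.append({"role": "assistant", "content": bot})
    return messages
```

Unit-testing this pure function is much easier than testing a live chat loop.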
4. Why This Beats Hosted LLMs
| Feature | ChatGPT | Your Qwen3 Bot |
|---|---|---|
| Cost per use | 💸 Paid subscription | ✅ Free after setup |
| Privacy | ❌ Cloud logs | ✅ 100% private |
| Custom fine-tuning | ❌ Not allowed | ✅ Fully supported |
| Runs offline | ❌ No | ✅ Yes (local) |
| API rate limits | ❌ Imposed by provider | ✅ Full control |
5. Bundle as an App (Optional)
You can bundle your chatbot with:

- Electron.js (turn the HTML page into a desktop app)
- Tauri (for desktop/mobile hybrid apps)
- Docker (for private cloud deployments)
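For the Docker route, here is a minimal sketch. It builds on the official `vllm/vllm-openai` image, which already launches the API server as its entrypoint; we only pin the model and port (both illustrative):

```dockerfile
# Sketch: serve Qwen3 behind vLLM's OpenAI-compatible API in a container.
FROM vllm/vllm-openai:latest

EXPOSE 8000

# Arguments passed to vLLM's entrypoint; the model is downloaded from
# Hugging Face on first start unless a cache volume is mounted.
CMD ["--model", "Qwen/Qwen3-14B", "--port", "8000"]
```

Run it with `docker run --gpus all -p 8000:8000 <your-image>` (requires the NVIDIA Container Toolkit for GPU access).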
Conclusion: Your Own AI Assistant—Fully Open
With just:

- Qwen3
- vLLM
- a chat UI (HTML or Gradio)
You’ve built a ChatGPT-level assistant—no API keys, no monthly fees, and no limits.
✅ Customize, fine-tune, or scale however you want. Qwen3 gives you full control.
Resources
- Qwen3 Coder - Agentic Coding Adventure: Step into a new era of AI-powered development with Qwen3 Coder, the world's most agentic open-source coding model.