Host Qwen3 on Hugging Face Spaces with UI (Free + Fast Setup)
Introduction: Go From Model to Live Demo in Minutes
Want to make your own Qwen3 chatbot or coding assistant available online, for free?
Hugging Face Spaces lets you:
- Deploy Qwen3 (7B, 14B) easily
- Build a frontend with Gradio
- Share a live web app with a public link
- Skip server and GPU setup entirely
In this tutorial, you’ll:
- Set up a Qwen3 model backend
- Create a Gradio chat interface
- Launch it on Spaces (in one click!)
1. Requirements
✅ Hugging Face account (free)
✅ A Qwen model (we’ll use Qwen/Qwen1.5-7B-Chat)
✅ Basic Python + Git knowledge
2. Create Your Space
Go to https://huggingface.co/spaces and:
- Click “Create new Space”
- Choose:
  - SDK: Gradio
  - Visibility: Public or Private
  - Name: e.g. qwen3-chat-ui
3. Space File Structure
Upload or clone these 3 files:
```
/ (root)
├── app.py
├── requirements.txt
└── README.md
```
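If you prefer working locally, you can create this structure with Git and push it to your Space (the folder name `qwen3-chat-ui` and username `yourname` below are placeholders for your own):

```shell
# After creating the Space, clone its (initially empty) repo:
# git clone https://huggingface.co/spaces/yourname/qwen3-chat-ui
# cd qwen3-chat-ui

# Create the three files the Space expects
mkdir -p qwen3-chat-ui && cd qwen3-chat-ui
touch app.py requirements.txt README.md
ls -1

# Then commit and push to trigger a build:
# git add . && git commit -m "Initial Space files" && git push
```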
4. Sample app.py
```python
import gradio as gr
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-7B-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", trust_remote_code=True
)

def chat(user_input, history=None):
    prompt = (
        "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
        "<|im_start|>user\n" + user_input + "<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
    output = model.generate(input_ids, max_new_tokens=256)
    # Decode only the newly generated tokens, not the echoed prompt
    response = tokenizer.decode(
        output[0][input_ids.shape[-1]:], skip_special_tokens=True
    )
    return response.strip()

iface = gr.Interface(fn=chat, inputs="text", outputs="text", title="Qwen3 Chatbot")
iface.launch()
```
5. requirements.txt
```txt
transformers
torch
gradio
accelerate
```
Hugging Face Spaces will auto-install these dependencies.
6. Optional README.md
```markdown
# Qwen3 Chat UI

This Space uses Qwen/Qwen1.5-7B-Chat with Gradio to create a lightweight chatbot.

- Model: [Qwen1.5-7B-Chat](https://huggingface.co/Qwen/Qwen1.5-7B-Chat)
- UI: Built with [Gradio](https://gradio.app)

Try asking it anything!
```
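Note that Spaces also reads its configuration from a YAML block at the very top of README.md; a minimal header (the title, emoji, and color values here are illustrative) looks like:

```yaml
---
title: Qwen3 Chat UI
emoji: 💬
colorFrom: blue
colorTo: purple
sdk: gradio
app_file: app.py
---
```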
7. Deploy and Share
Once uploaded:
- Hugging Face automatically builds and provisions your Space
- After ~2–5 min, your app is live
- Share your link: https://huggingface.co/spaces/yourname/qwen3-chat-ui
Bonus: Add Chat History & Markdown
Want to improve the experience?
```python
iface = gr.ChatInterface(chat, title="Qwen3 Chatbot", theme="soft")
```
You can also:
- Add Markdown output
- Use ChatInterface with state
- Extend to Qwen Code / tool use
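To make the bot history-aware: `gr.ChatInterface` passes `(message, history)` to your function, where `history` is a list of (user, assistant) turn pairs. A minimal sketch of building a multi-turn ChatML prompt from that history (the generation call itself is elided; reuse the model and tokenizer from app.py):

```python
def build_chatml_prompt(message, history, system="You are a helpful assistant."):
    """Assemble a multi-turn ChatML prompt from Gradio-style history pairs."""
    parts = [f"<|im_start|>system\n{system}<|im_end|>\n"]
    for user_turn, assistant_turn in history:
        parts.append(f"<|im_start|>user\n{user_turn}<|im_end|>\n")
        parts.append(f"<|im_start|>assistant\n{assistant_turn}<|im_end|>\n")
    parts.append(f"<|im_start|>user\n{message}<|im_end|>\n<|im_start|>assistant\n")
    return "".join(parts)

def chat(message, history):
    prompt = build_chatml_prompt(message, history)
    # ...tokenize `prompt` and call model.generate() as in app.py...
    return prompt  # placeholder: return the decoded model output instead
```

In practice, `tokenizer.apply_chat_template` produces this format for you and is less error-prone than hand-rolling the tags.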
Alternative: Use vLLM for Faster Backend
If latency is an issue, deploy Qwen3 on your own GPU or server with vLLM, then proxy Hugging Face UI to it.
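A sketch of that setup, assuming a vLLM server exposing its OpenAI-compatible API (the host `your-gpu-host` is a placeholder; start the server with e.g. `vllm serve Qwen/Qwen1.5-7B-Chat`):

```python
import json
import urllib.request

VLLM_URL = "http://your-gpu-host:8000/v1/chat/completions"  # placeholder host

def build_request(user_input):
    """Build an OpenAI-style chat completion request for the vLLM server."""
    payload = {
        "model": "Qwen/Qwen1.5-7B-Chat",
        "messages": [{"role": "user", "content": user_input}],
        "max_tokens": 256,
    }
    return urllib.request.Request(
        VLLM_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def chat(user_input, history=None):
    """Drop-in replacement for the local chat() in app.py."""
    with urllib.request.urlopen(build_request(user_input)) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

The Gradio UI on Spaces stays unchanged; only the `chat` function now forwards requests to your GPU backend.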
Conclusion: Qwen3 Live in Minutes
You’ve now deployed a public Qwen3 chatbot:
- No complex hosting
- No API keys
- Just Python + Hugging Face Spaces
From zero to open-source LLM demo in under 15 minutes.
Resources
Qwen3 Coder - Agentic Coding Adventure
Step into a new era of AI-powered development with Qwen3 Coder, the world’s most agentic open-source coding model.