Host Qwen3 on Hugging Face Spaces with UI (Free + Fast Setup)

Introduction: Go From Model to Live Demo in Minutes

Want to make your own Qwen3 chatbot or coding assistant available online—free?

Hugging Face Spaces lets you:

  • Deploy Qwen3 (7B, 14B) easily

  • Build a frontend with Gradio

  • Share a live web app with a public link

  • No server or GPU setup needed

In this tutorial, you’ll:

  • Set up a Qwen3 model backend

  • Create a Gradio chat interface

  • Launch it on Spaces (in one click!)


1. Requirements

✅ Hugging Face account (free)
✅ A Qwen model (we’ll use Qwen/Qwen1.5-7B-Chat)
✅ Basic Python + Git knowledge


2. Create Your Space

Go to:
https://huggingface.co/spaces

  • Click “Create new Space”

  • Choose:

    • SDK: Gradio

    • Visibility: Public or Private

    • Name: e.g. qwen3-chat-ui


3. Space File Structure

Upload or clone these 3 files:

/ (root)
├── app.py
├── requirements.txt
└── README.md

4. Sample app.py

```python
import gradio as gr
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-7B-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, device_map="auto", trust_remote_code=True
)

def chat(user_input, history=None):
    # Qwen chat models expect the ChatML prompt format
    prompt = (
        "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
        "<|im_start|>user\n" + user_input + "<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
    output = model.generate(input_ids, max_new_tokens=256)
    # Decode only the newly generated tokens, not the echoed prompt
    response = tokenizer.decode(
        output[0][input_ids.shape[1]:], skip_special_tokens=True
    )
    return response.strip()

iface = gr.Interface(fn=chat, inputs="text", outputs="text", title="Qwen3 Chatbot")
iface.launch()
```

5. requirements.txt

```txt
torch
transformers
gradio
accelerate
```

Hugging Face Spaces will auto-install these dependencies.
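If you want reproducible builds, you can pin each dependency to a version instead of letting Spaces resolve the latest release on every rebuild. The version numbers below are illustrative only; check your local working versions with `pip freeze` before pinning:

```txt
torch==2.1.2
transformers==4.37.2
gradio==4.19.2
accelerate==0.27.2
```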


6. Optional README.md

```markdown
# Qwen3 Chat UI

This Space uses Qwen/Qwen1.5-7B-Chat with Gradio to create a lightweight chatbot.

- Model: [Qwen1.5-7B-Chat](https://huggingface.co/Qwen/Qwen1.5-7B-Chat)
- UI: Built with [Gradio](https://gradio.app)

Try asking it anything!
```

7. Deploy and Share

Once uploaded:

  • Push or upload your files → Hugging Face will automatically build and provision your Space

  • After ~2–5 min, your app is live

  • Share your link: https://huggingface.co/spaces/yourname/qwen3-chat-ui
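If you prefer working locally, the upload step can also be done with plain Git. A sketch, assuming your Space is named `yourname/qwen3-chat-ui` (substitute your own username and Space name):

```shell
# Clone the (initially near-empty) Space repo
git clone https://huggingface.co/spaces/yourname/qwen3-chat-ui
cd qwen3-chat-ui

# Copy in your three files, then commit and push
git add app.py requirements.txt README.md
git commit -m "Add Qwen chat UI"
git push   # pushing triggers an automatic rebuild of the Space
```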


Bonus: Add Chat History & Markdown

Want to improve the experience?

```python
iface = gr.ChatInterface(chat, title="Qwen3 Chatbot", theme="soft")
```

You can also:

  • Add Markdown output

  • Use ChatInterface with state

  • Extend to Qwen Code / Tool use
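To make the bot actually use that chat history, prior turns need to be folded into the prompt. A minimal sketch of a history-aware prompt builder (`build_prompt` is a hypothetical helper; it assumes the classic `gr.ChatInterface` behavior of passing history as a list of (user, assistant) pairs):

```python
def build_prompt(message, history):
    """Fold prior (user, assistant) turns into a ChatML prompt.

    `history` is assumed to be a list of (user, assistant) pairs,
    the default format gr.ChatInterface passes to your function.
    """
    prompt = "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    for user_msg, bot_msg in history:
        prompt += f"<|im_start|>user\n{user_msg}<|im_end|>\n"
        prompt += f"<|im_start|>assistant\n{bot_msg}<|im_end|>\n"
    # Current turn, left open for the model to complete
    prompt += f"<|im_start|>user\n{message}<|im_end|>\n<|im_start|>assistant\n"
    return prompt
```

Your `chat` function can then call `build_prompt(user_input, history)` instead of formatting a single-turn prompt.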


Alternative: Use vLLM for Faster Backend

If latency is an issue, deploy Qwen3 on your own GPU or server with vLLM, then point the Gradio UI on Spaces at that backend's API.
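In that setup, the Space only runs the UI: vLLM's server exposes an OpenAI-compatible `/v1/chat/completions` endpoint, and the Gradio `chat` function becomes a thin HTTP client. A sketch, where `VLLM_URL` is a placeholder for your own server's address:

```python
import requests

# Placeholder endpoint for your own vLLM server,
# e.g. started with: vllm serve Qwen/Qwen1.5-7B-Chat
VLLM_URL = "http://your-gpu-host:8000/v1/chat/completions"

def build_messages(message, history):
    """Convert Gradio-style (user, assistant) history into OpenAI-format messages."""
    messages = [{"role": "system", "content": "You are a helpful assistant."}]
    for user_msg, bot_msg in history:
        messages.append({"role": "user", "content": user_msg})
        messages.append({"role": "assistant", "content": bot_msg})
    messages.append({"role": "user", "content": message})
    return messages

def chat(message, history):
    resp = requests.post(
        VLLM_URL,
        json={
            "model": "Qwen/Qwen1.5-7B-Chat",
            "messages": build_messages(message, history),
        },
        timeout=120,
    )
    return resp.json()["choices"][0]["message"]["content"]
```

This `chat` function drops straight into `gr.ChatInterface(chat, ...)` from the bonus section above.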


Conclusion: Qwen3 Live in Minutes

You’ve now deployed a public Qwen3 chatbot:

  • No complex hosting

  • No API keys

  • Just Python + Hugging Face Spaces

From zero to open-source LLM demo in under 15 minutes.

