How to access Qwen - chat, apps & API
Three ways to use Alibaba's Qwen AI: a free browser chat, native desktop and mobile apps, and an OpenAI-compatible API. This guide walks through each path and includes a step-by-step API setup.
No credit card to start chatting · ~5 minutes to a working API key · OpenAI-compatible
Three ways to access Qwen
Which door you walk through depends on one question: are you here to chat, or here to build?
Qwen is Alibaba Cloud's family of large language models, and there is no single "Qwen app" you have to install before you can use it. Instead there are three distinct front doors, each suited to a different kind of person. Most people only ever need one of them, so the first job is figuring out which one is yours.
Qwen Chat
The free web chatbot at chat.qwen.ai. No install, no payment, no API key. Open it in any browser and start typing - upload PDFs, paste screenshots, search the web.
Desktop & mobile apps
Native apps for Windows, macOS, Linux, iOS, and Android. Same account and chat history as the web, plus voice input and camera-based vision on mobile.
The Qwen API
Programmatic access via Alibaba Cloud Model Studio (DashScope). OpenAI-compatible, so existing code works with a base-URL swap. This is what the install guide below covers.
Access Qwen Chat in your browser
The fastest way to use Qwen needs no download, no install, and no payment - you can be chatting in under thirty seconds.
Qwen Chat is the official conversational interface for the Qwen models, built and maintained by Alibaba's Tongyi Lab. It is a free, web-based workspace where you can hold long multi-turn conversations, upload documents and images, generate and run code, search the live web with citations, and even produce interactive artifacts - all in the same window, with no setup required. For the overwhelming majority of people, this is the only access method they will ever need.
Open chat.qwen.ai
Go to chat.qwen.ai in any modern browser. It works on desktop, tablet, and phone - no app required. You can begin typing immediately as a guest.
Sign in (optional, but recommended)
Continue as a guest, or sign in with Google, GitHub, Apple, or email. Signing in is free and unlocks conversation history that syncs across devices, file uploads, and higher usage limits.
Pick a model
The default is the latest flagship Qwen model - the smartest all-rounder. Use the model picker at the top to switch to a specialist when it helps: a coding-tuned model for programming, a vision model for images, or a math model for proofs and equations.
Start working
Type a question, drag in a PDF, paste a screenshot, or enable web search for anything time-sensitive. Everything happens in one continuous conversation, so you can mix text, files, and images freely.
Desktop & mobile apps
The same account, chats, and prompts sync automatically across every platform - so you can start on your laptop and finish on your phone.
If you use Qwen often, a native app is more convenient than keeping a browser tab open. The desktop apps give you a dedicated window, keyboard shortcuts, and quicker file drops; the mobile apps add voice input and camera-based vision Q&A, so you can photograph a whiteboard or a document and ask about it on the spot. Whichever platform you choose, signing in with the same account keeps your full conversation history in sync.
Installing the desktop app is the usual three-step affair: download the installer for your operating system from the official Qwen site, run it (on macOS, drag the app into your Applications folder; on Linux, mark the AppImage executable and launch it), then sign in with your Qwen account. On mobile, search for the official Qwen app on the App Store or Google Play, or sideload the APK on Android if you prefer. Always download from the official source rather than a third-party mirror, since unofficial builds can be tampered with.
Set up the Qwen API, step by step
From zero to a working API call in about five minutes - get a key, install the SDK, point it at the right endpoint, and send your first request.
The Qwen API is delivered through Alibaba Cloud Model Studio, the unified developer platform for everything Qwen. Under the hood the API surface is called DashScope - you will see both names in the documentation, and they refer to the same service. The key thing for developers is that the API is OpenAI-compatible: it works with the standard openai Python package, the OpenAI JS SDK, LangChain, LiteLLM, and anything else that speaks the OpenAI protocol. Migrating existing code usually means changing just three things - the base URL, the API key, and the model name.
Part 1 · Get your API key
Create an Alibaba Cloud account
Go to alibabacloud.com and sign up with a valid email and phone number for verification. Use the international site unless you are explicitly targeting mainland China deployment.
Activate Model Studio
Open the Model Studio product page and click Activate, then accept the Terms of Service. This step also enables your free quota: 1 million input tokens plus 1 million output tokens, valid for 90 days in the Singapore (International) region.
Open the API Keys page
In the Model Studio console, find the sidebar item labelled API Keys (sometimes shown as Key Management).
Create and copy your key
Click Create API Key, optionally add a description to track which app it belongs to, then copy the key immediately. It starts with sk-. Store it in a password manager, a .env file, or your platform's secrets manager - and never commit it to a public Git repository.
Set your key as an environment variable so you never have to hardcode it:
    # macOS / Linux - current session only
    export DASHSCOPE_API_KEY="sk-your-key-here"

    # Make it permanent - add to ~/.bashrc or ~/.zshrc
    echo 'export DASHSCOPE_API_KEY="sk-your-key-here"' >> ~/.bashrc

    # Windows PowerShell
    $env:DASHSCOPE_API_KEY = "sk-your-key-here"
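Once the variable is set, it can help to check it from Python before sending any request. The sketch below is one way to fail fast with a readable message. The helper name `load_api_key` is hypothetical (it is not part of any SDK), and the `sk-` prefix check relies only on the key format described above.

```python
import os


def load_api_key(env_var: str = "DASHSCOPE_API_KEY") -> str:
    """Return the API key from the environment, or raise with a clear message.

    `load_api_key` is a hypothetical helper, not part of any SDK.
    """
    raw = os.environ.get(env_var)
    if raw is None:
        raise RuntimeError(f"{env_var} is not set in this shell session")
    key = raw.strip()  # stray spaces or newlines from copy-paste break auth
    if not key.startswith("sk-"):
        raise RuntimeError(f"{env_var} does not look like a key (expected an 'sk-' prefix)")
    return key
```

Calling `load_api_key()` at startup surfaces a missing or mangled variable immediately, rather than as an opaque authentication failure on the first request.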
Part 2 · Pick your endpoint
Model Studio is deployed in four regions, each with its own endpoint and its own keys. Choose the one closest to your users - or whichever your compliance team approves - and use that base URL everywhere.
| Region | Base URL | Best for |
|---|---|---|
| Singapore (International) | dashscope-intl.aliyuncs.com/compatible-mode/v1 | Default for non-China teams · has the free quota |
| US (Virginia) | dashscope-us.aliyuncs.com/compatible-mode/v1 | Lowest latency for US teams |
| China (Beijing) | dashscope.aliyuncs.com/compatible-mode/v1 | Mainland China deployments |
| Hong Kong | cn-hongkong.dashscope.aliyuncs.com/compatible-mode/v1 | Hong Kong region |
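Because each region has its own endpoint and keys, it may be worth keeping the base URLs in one place instead of scattering them through the code. The following is a minimal sketch: the region labels and the `base_url_for` helper are hypothetical, the hosts and paths are copied from the table above, and the `https://` scheme is assumed for every row.

```python
# Base URLs copied from the table above; the "https://" scheme is assumed
# for every region (the table lists hosts and paths only).
BASE_URLS = {
    "singapore": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
    "us": "https://dashscope-us.aliyuncs.com/compatible-mode/v1",
    "beijing": "https://dashscope.aliyuncs.com/compatible-mode/v1",
    "hongkong": "https://cn-hongkong.dashscope.aliyuncs.com/compatible-mode/v1",
}


def base_url_for(region: str) -> str:
    """Look up the base URL for a region label (the labels are hypothetical)."""
    try:
        return BASE_URLS[region.strip().lower()]
    except KeyError:
        valid = ", ".join(sorted(BASE_URLS))
        raise ValueError(f"unknown region {region!r}; expected one of: {valid}") from None
```

Reading the region from configuration and passing `base_url_for(region)` as the client's base URL keeps the key and the endpoint in the same region.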
Part 3 · Install the SDK
For almost everyone, the OpenAI-compatible interface is the right choice - install the official openai package and you are done. Use a virtual environment to keep things tidy:
    # (optional) create and activate a virtual environment
    python -m venv venv
    source venv/bin/activate   # Windows: venv\Scripts\activate

    # install the OpenAI-compatible SDK
    pip install openai

    # OR - only if you need DashScope-specific features
    # (batch invocation, advanced multimodal, real-time speech)
    pip install dashscope
Install the dashscope package only if you specifically need its extras, like 50%-discounted batch invocation or real-time speech.
Your first Qwen API calls
With a key, an endpoint, and the SDK installed, every standard feature works exactly as it does with OpenAI.
Your first request
    import os
    from openai import OpenAI

    client = OpenAI(
        api_key=os.getenv("DASHSCOPE_API_KEY"),
        base_url="https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
    )

    completion = client.chat.completions.create(
        model="qwen-plus",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Who are you?"},
        ],
    )
    print(completion.choices[0].message.content)
Streaming responses
For chatbot-style apps, stream tokens as they are generated:
    # Reuses the `client` created in the first request.
    stream = client.chat.completions.create(
        model="qwen-plus",
        messages=[{"role": "user", "content": "Explain async I/O simply."}],
        stream=True,
    )
    for chunk in stream:
        # Skip chunks that carry no choices or no text delta.
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)
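Printing tokens as they arrive covers the display side, but a chat app usually also needs the complete reply, for example to append it to the conversation history. The sketch below shows one way to accumulate it. `collect_stream` and `fake_chunk` are hypothetical helpers, and the chunk shape (`chunk.choices[0].delta.content`) is taken from the loop above. The guard for chunks with an empty `choices` list is a precaution, not documented Qwen behaviour.

```python
from types import SimpleNamespace
from typing import Iterable


def collect_stream(chunks: Iterable) -> str:
    """Join the text deltas of a chat-completions stream into one string."""
    parts = []
    for chunk in chunks:
        if not chunk.choices:  # precaution: a chunk may carry no choices
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # None or "" on role-only and final chunks
            parts.append(delta)
    return "".join(parts)


def fake_chunk(text):
    """Build a stand-in chunk with the same attribute shape, for offline tests."""
    delta = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


demo = [fake_chunk("Hel"), fake_chunk(None), SimpleNamespace(choices=[]), fake_chunk("lo")]
print(collect_stream(demo))  # prints "Hello"
```

With a real stream, `collect_stream(stream)` consumes the iterator once, so print inside the loop as well if the tokens should also appear live.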
Vision input (multimodal)
For image tasks, switch to a vision-capable model and add an image block to the message:
    # Reuses the `client` created in the first request.
    response = client.chat.completions.create(
        model="qwen-vl-plus",
        messages=[{
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
                {"type": "text", "text": "Extract this chart's data as JSON."},
            ],
        }],
    )
    print(response.choices[0].message.content)
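The example above points at a public image URL. For a file on disk, one common approach with OpenAI-style APIs is to embed the image as a base64 data URL. The sketch below builds such a message. Treat it as an assumption to verify against the current Qwen documentation: it presumes the endpoint accepts data URLs in `image_url`, and `image_message` is a hypothetical helper.

```python
import base64
import mimetypes
from pathlib import Path


def image_message(image_path: str, prompt: str) -> dict:
    """Build a user message that pairs a local image with a text prompt.

    Assumption: the endpoint accepts base64 data URLs in `image_url`,
    as the OpenAI API does. Verify against the current Qwen docs.
    """
    path = Path(image_path)
    # Fall back to a generic type when the extension is unknown.
    mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}},
            {"type": "text", "text": prompt},
        ],
    }
```

The returned dictionary has the same shape as the message in the request above, so it can be passed as `messages=[image_message("chart.png", "Extract this chart's data as JSON.")]`.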
Choosing a model & what it costs
One of Qwen's strongest selling points is price - often an order of magnitude cheaper than GPT-4-class endpoints, with a free tier large enough to actually prototype on.
You access the whole Qwen catalogue through the same key and endpoint - you simply change the model string. A good default is a balanced general model; reach for cheaper high-volume models when latency and cost matter more than peak quality, a reasoning flagship for the hardest problems, and the specialists for code, vision, or math. Here is a snapshot of popular options on the International endpoint:
| Model | Input $/M | Output $/M | Context | Use for |
|---|---|---|---|---|
| Qwen-Max | $1.04 | $4.16 | 262K | Reasoning-heavy work |
| Qwen-Plus | $0.26 | $0.78 | 1M | Best general default |
| Qwen-Turbo | $0.05 | $0.20 | 1M | High-volume, cheap |
| Qwen-Flash | from $0.033 | from $0.13 | 1M | Tiered, lowest cost |
| Qwen-Coder | ~$0.30 | ~$1.50 | 128K | Programming |
| Qwen-VL | varies by model | varies by model | large | Vision / images |
Two billing nuances are worth knowing. First, several models are priced in tiers based on the input size of each request - a short request and a near-maxed-out one fall into different rate brackets, so keeping requests lean saves money. Second, batch invocation gets a 50% discount on both input and output tokens, ideal for non-real-time workloads like overnight document processing or dataset labelling, at the cost of asynchronous (non-instant) results.
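To make the arithmetic concrete, here is a rough cost estimator. It is only a sketch: the prices are copied from the table above and treated as flat rates, which is an assumption, since tiered models would need per-bracket rates. `estimate_cost` is a hypothetical helper, and current prices should be checked before relying on the numbers.

```python
# Prices in USD per million tokens, copied from the table above.
# Assumption: treated as flat rates; tiered pricing is not modelled.
PRICES = {
    "Qwen-Max": (1.04, 4.16),
    "Qwen-Plus": (0.26, 0.78),
    "Qwen-Turbo": (0.05, 0.20),
}
BATCH_DISCOUNT = 0.5  # the 50% batch discount described above


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  batch: bool = False) -> float:
    """Estimate the USD cost of a workload (hypothetical helper, flat rates only)."""
    input_price, output_price = PRICES[model]
    cost = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
    return cost * BATCH_DISCOUNT if batch else cost
```

For example, 2 million input tokens and 500,000 output tokens on Qwen-Plus come to 2 × $0.26 + 0.5 × $0.78 = $0.91 at these rates, or about $0.46 with `batch=True`.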
Troubleshooting common errors
Most access problems trace back to one of a small handful of causes - here's how to fix them fast.
If a request is rejected even though your code looks right, the cause is almost always one of three things: the wrong base URL for your key's region (Singapore keys fail on Beijing endpoints), an environment variable that isn't actually loaded in your shell, or a sub-workspace key without model permissions. Check the URL first.
You've hit a rate limit - DashScope applies both requests-per-minute and requests-per-second caps. Implement exponential backoff in production, and request a quota increase from the Model Studio console once you have a billing history.
If the key itself is rejected, confirm that it comes from the same region as your base URL, that Model Studio is actually activated on the account, and that you copied the full key without trailing spaces. Regenerate the key if in doubt.
The model string may be wrong or unavailable in your region - check the official model catalogue for exact names. Preview models in particular can have names that change, so verify against current docs.
If the free quota runs out sooner than expected, remember that the 1M + 1M free allowance is a combined cap across most models and expires after 90 days. Reasoning models also emit far more output tokens than you might expect, which burns the output half quickly. Check usage in the console.
The API can be called from anywhere - AWS, GCP, Azure, your laptop - so a connection failure is usually a local network or firewall issue, or a region your network can't reach. Try a different network to isolate it.
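The exponential backoff recommended for rate limits above can be sketched with the standard library alone. This is an illustration, not the SDK's own retry logic: `call_with_backoff` is a hypothetical helper, the delays and attempt count are arbitrary defaults, and `retry_on` should be narrowed to whatever rate-limit exception your client raises.

```python
import random
import time


def call_with_backoff(func, *, retry_on=(Exception,), max_attempts=5,
                      base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call func(), retrying with exponential backoff and jitter.

    `retry_on` should be narrowed to the rate-limit exception of the SDK
    in use; catching Exception is only a placeholder default.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay * 0.1))  # small jitter
```

With the openai package, a call might look like `call_with_backoff(lambda: client.chat.completions.create(...), retry_on=(openai.RateLimitError,))`. The `sleep` parameter exists so tests can inject a stub instead of actually waiting.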