Alibaba Cloud, the digital technology and intelligence firm of Alibaba Group, announced Monday its support for the release of Qwen-SEA-LION-v4, the latest version of a large language model developed by AI Singapore (AISG), to address the linguistic, cultural, and commercial needs of Southeast Asia.
The firm said in a statement that the model, built on Alibaba’s Qwen3-32B foundation model, marks a significant step in AISG’s efforts to deliver increasingly capable and accessible artificial intelligence (AI) solutions for the region.
Qwen-SEA-LION-v4 delivers significant improvements in multilingual accuracy and cultural and contextual understanding, while being efficient enough to run on a consumer-grade laptop with 32GB of RAM.
It currently ranks first on the leaderboard for Southeast Asian Holistic Evaluation of Language Models (SEA-HELM) among open-source models under 200B parameters, thanks to its advanced reasoning, multilingual support, and long-context understanding tailored for Southeast Asian languages.
The base Qwen3-32B model has been further trained on over 100 billion Southeast Asian language tokens to enhance its ability to interpret local expressions, conversational nuances and regional knowledge domains.
“Our collaboration with Alibaba on Qwen-SEA-LION-v4 is an important milestone in advancing AI inclusivity and making AI more representative of Southeast Asia,
“It embodies our shared vision of accelerating AI innovation across the region and ensuring that developers, enterprises, and public institutions have access to AI that is open, affordable, and locally relevant, and is designed to truly understand the languages, cultures, and communities of this region,” said Dr Leslie Teo, Senior Director of AI Products, AI Singapore.
Hon Keat Choong, General Manager of Singapore, Alibaba Cloud Intelligence, said that through this partnership with AI Singapore, the company is proud to see the Qwen foundation model empowering the next wave of AI innovation in Southeast Asia.
“By combining our model’s multilingual and reasoning strengths with AI Singapore’s deep regional expertise, Qwen-SEA-LION-v4 demonstrates how open collaboration can make advanced AI more inclusive and locally relevant,
“We look forward to enabling more developers, enterprises and public-sector partners to build applications that truly understand the languages and cultures of this region,” he added.
Through this collaboration, Alibaba provided the Qwen3-32B foundation model and technical support for advanced post-training, while AI Singapore contributed its open-source, region-specific data curation, optimization, and evaluation across Southeast Asian language tasks.
Despite rapid global advancements in generative AI, many commercially available models remain predominantly English-centric, creating accessibility gaps in a region with more than 1,200 languages.
The base model of Qwen3, the latest iteration of the Qwen family, was pre-trained on a large and diverse dataset spanning 119 languages and dialects and totaling 36 trillion tokens. This gives it broader linguistic exposure from the outset, including Southeast Asian languages that are typically under-represented in mainstream AI models.
To further improve the model’s performance on low-resource languages, the Qwen team increased the proportion of translation and cross-lingual training tasks during post-training, allowing the model to better handle real-world multilingual input, such as code-switched speech, informal chat, and mixed English-local language usage that is common across Southeast Asia.
The latest version of SEA-LION also introduces several major upgrades designed to boost both linguistic performance and developer accessibility.
The model adopts byte-pair encoding (BPE) instead of the earlier SentencePiece tokenizer, allowing for more efficient and accurate multilingual text processing across Southeast Asian languages.
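For readers unfamiliar with the term, BPE builds its vocabulary by repeatedly merging the most frequent adjacent symbol pair in a corpus, so common substrings (including non-English morphemes) become single tokens. The toy sketch below, in plain Python, illustrates that merge step on a few Malay/Indonesian words; it is a conceptual illustration only, not the SEA-LION tokenizer or its actual vocabulary:

```python
from collections import Counter

def most_frequent_pair(words):
    """Count adjacent symbol pairs across all words (each word is a tuple of symbols)."""
    pairs = Counter()
    for word, freq in words.items():
        for a, b in zip(word, word[1:]):
            pairs[(a, b)] += freq
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    """Replace every occurrence of the pair with a single merged symbol."""
    merged = {}
    for word, freq in words.items():
        out, i = [], 0
        while i < len(word):
            if i < len(word) - 1 and (word[i], word[i + 1]) == pair:
                out.append(word[i] + word[i + 1])
                i += 2
            else:
                out.append(word[i])
                i += 1
        merged[tuple(out)] = merged.get(tuple(out), 0) + freq
    return merged

# Tiny corpus: word -> frequency, with each word split into characters.
corpus = {tuple("makan"): 5, tuple("makanan"): 3, tuple("minum"): 2}
for _ in range(3):  # perform three merge steps
    corpus = merge_pair(corpus, most_frequent_pair(corpus))
print(sorted(corpus))
```

After a few merges, frequent fragments such as "an" and "mak" have fused into single symbols, which is how a BPE vocabulary comes to cover a language's common word pieces.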
Its post-training has also been expanded to cover more regional datasets in Southeast Asian languages, including Burmese, Filipino, Indonesian, Malay, Tamil, Thai, and Vietnamese, giving it greater contextual understanding and cultural fluency.
With a native 32k-token context length, Qwen-SEA-LION-v4 can now handle complex interactions such as document-level reasoning and summarization.
It is also available in 4-bit and 8-bit quantized versions, making it easier and more cost-effective for developers and enterprises to deploy on local infrastructure without significant performance trade-offs.
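To give a sense of what quantization means in principle, the sketch below shows symmetric 8-bit quantization of a small list of weights in plain Python: each float is mapped to an integer in [-127, 127] plus a shared scale factor, cutting storage to a quarter of 32-bit floats at the cost of a small rounding error. This is a conceptual illustration with made-up function names, not the scheme used in the released checkpoints:

```python
def quantize_int8(values):
    """Symmetric 8-bit quantization: map floats to integers in [-127, 127]."""
    scale = max(abs(v) for v in values) / 127.0
    q = [round(v / scale) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from the integer codes."""
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
print(q, max_err)
```

In real deployments the error is kept small by quantizing weights in groups with per-group scales, which is why 4-bit and 8-bit variants can run with little performance loss.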