Vietnam’s technology giant FPT Corporation and global chipmaker NVIDIA have released the Nemotron-Personas-Vietnam dataset, an open-source synthetic dataset designed to help developers build AI systems.

In a statement last Friday, FPT said the dataset reflects Vietnam’s language, culture, workforce, and economic realities, and is available for commercial use by developers, researchers, and enterprises.

The dataset comprises 900,000 synthetic personas grounded in Vietnam’s official statistics and geographic structure, with each record containing 31 fields, including persona attributes and contextual data. It is available on Hugging Face and is compatible with NVIDIA NeMo libraries across the full AI development lifecycle, from data curation and fine-tuning through to deployment.

The release extends to the Vietnamese market NVIDIA’s Nemotron-Personas methodology, which builds population-scale synthetic datasets that are auditable and demographically grounded. NVIDIA contributed the open model framework, the NeMo Data Designer synthetic data library, and the Nemotron-Personas methodology.

FPT contributed local expertise, validation methodologies, data infrastructure, and AI research capabilities through three entities: FPT Smart Cloud, which provides NVIDIA-accelerated GPU cloud services; the Quantum AI and Cyber Security Institute, which led the technical methodology and validation; and FPT DC5, which contributed survey-collected persona data.

FPT said its sovereign AI stack covers NVIDIA-accelerated GPU cloud services, inference-ready AI platforms, and ready-to-use AI applications, forming an end-to-end infrastructure for building and deploying AI within regional boundaries.
