Tsinghua team open-sources large model inference engine "Chitu", helping domestic chips overcome the cost and efficiency problems of deploying FP8 models and DeepSeek
Professor Zhai Jidong's team from the Institute of High Performance Computing at Tsinghua University and the Tsinghua-affiliated technology company Qingcheng Jizhi have jointly announced that they are open-sourcing the large model inference engine "Chitu". It is the first engine to natively run FP8-precision models on NVIDIA GPUs other than the Hopper architecture and on various domestic chips, a breakthrough for the wider adoption of domestic AI chips and the growth of their ecosystem.
Breaking the "hardware binding" dilemma: FP8 model deployment is no longer restricted
The rise of DeepSeek has pushed FP8-precision models into the industry mainstream. As DeepSeek's popularity continues, enterprise demand for private deployment of large models has surged.
However, today's world-leading FP8 models have long relied on NVIDIA's H-series high-end GPUs, and this has constrained domestic companies that want to deploy large models. On the one hand, imports of NVIDIA's H-series chips are restricted, making it difficult for domestic companies to obtain high-performance hardware; on the other hand, most domestic chips do not support the FP8 data type and therefore cannot take full advantage of the new generation of AI models, which keeps enterprise deployment costs high.
To break this dilemma, the Tsinghua University team and Qingcheng Jizhi jointly launched the open-source "Chitu" inference engine. Through innovation in the underlying technology, the engine is the first to achieve efficient deployment of native FP8 models on non-H-series devices (including NVIDIA GPUs that predate the Hopper architecture and various domestic accelerator cards). This removes the dependence on specific hardware and greatly lowers the barrier and cost for enterprises deploying AI models.
Professor Zhai Jidong of Tsinghua University emphasized that Chitu distills the team's years of accumulated work in parallel computing and compiler optimization, and that its goal is to "bridge the gap between advanced models and diversified hardware, so that domestic computing power can truly 'run' and provide key support for the implementation of China's large model industry." Qingcheng Jizhi CEO Tang Xiongchao said: "Chitu is positioned to become a bridge connecting diversified computing power and large model applications. We not only support NVIDIA's full range of GPUs, but also optimize deeply for domestic chips. In the future, we will gradually open-source the adapted versions."