What is the debate over open-source and closed-source large models about?


The debate does not negate either side's market value; demand for both will coexist for a long time.

This year, practitioners, investors, and entrepreneurs in the AI (artificial intelligence) industries of China and the United States have simultaneously taken up the same debate: should large models be open-source or closed-source?

In China, the debate centers on Baidu founder Robin Li. In April this year, he said publicly, "In the past, people thought open-source was cheap, but in fact, in the context of large models, open-source is the most expensive. Open-source models will increasingly fall behind." This view has met opposition. Opponents include Alibaba Cloud CTO (Chief Technology Officer) Zhou Jingren, Baichuan Intelligence CEO (Chief Executive Officer) Wang Xiaochuan, and Cheetah Mobile CEO Fu Sheng. In May this year, Zhou Jingren said in a group interview with the media, "The contribution of open-source to global technology and ecology is undeniable. This has been proven many times worldwide, and there is no need to discuss it again."

In the United States, the debate is more intense. Tesla CEO Elon Musk has sued the AI startup OpenAI. Musk was one of OpenAI's co-founders and main early investors in 2015. He believes that OpenAI, led by current CEO Sam Altman, has violated its commitment to "operate as a non-profit organization and make AI open and open-source." Two famous Silicon Valley investors, a16z founder Marc Andreessen and Khosla Ventures founder Vinod Khosla, have clashed over multiple rounds on social media. The former believes that closed-source models will lead to monopolies by giants and destroy academic research. The latter believes that large models are economic weapons and should not be open-source.

Open source is a software development model in which the source code is published freely and the project relies on community contributions. Developers can freely download, modify, and distribute the software, report bugs (software defects or errors), and suggest optimizations. This collective innovation accelerates software iteration. Open-source models are models that can be used for free and whose parameters and other technical details have been published; closed-source models are models that require payment and whose technical details have not been published. Put simply, open-source is roughly equivalent to free, but you have to buy your own groceries and cook; closed-source is roughly equivalent to paid, like eating at a restaurant, where you get better service.

Should large models be open-source or closed-source? The question involves commercial interests, technical views, and other factors, which muddies many of the facts. Still, several things about this debate are certain.

First, different business strategies have led companies to choose different technical routes. Companies such as Baidu and OpenAI, which hope to commercialize their large model businesses quickly, have chosen closed-source; companies such as Alibaba Cloud and Meta, which profit from cloud computing or advertising, have chosen open-source in order to grow the pie.

Second, market demand for open-source and closed-source will coexist for a long time, and it is impossible to say simply which is better. Each has its own applicable scenarios, and the choice between them depends on market demand. This will not change with the will of the model manufacturers.

Third, there is an essential difference between open-source models and open-source software. Open-source software publishes its source code and most of its technical details. Open-source models are more like a free technical black box: the model parameters are open, but the source code, training data, training process, and other technical details rarely are.

In addition, the debate between open-source and closed-source in China's AI industry is more about business competition. Open-source has no borders, and this concept has been widely recognized. However, as competition in AI between China and the United States intensifies, the voices opposing open-source in the U.S. industry are getting louder.

Who is open-source, and who is closed-source?

The development of large models is still in its early stages and still requires exploration through trial and error. The line between open-source and closed-source is not clear-cut. Faced with the choice, companies have taken three different paths.

The most extreme path is to do only open-source models. Few companies take it, and Meta is one of them. The advantage is that it attracts more users; the problem is that there is no profit model, so only large companies can afford it. Meta's Llama 3 is the most widely used open-source model in the world. Meta's main business is social media (such as Facebook and Instagram), and its net profit in 2023 was as high as $39 billion. Meta has the drive to explore new businesses and is under no pressure to profit from models, so it can afford to do only open-source models and set profit aside for now.

A middle route is to run open-source and closed-source models in parallel, which is very flexible. Companies can attract users through open-source and earn income through closed-source; developers get room to choose, and the companies themselves get room for error. Companies that have chosen this path include Microsoft, Google, Alibaba Cloud, and Tencent Cloud, as well as AI startups such as Mistral AI, Zhipu AI, and Baichuan Intelligence.

The common practice when running open-source and proprietary models in parallel is to attract users with free open-source models and then guide them toward larger, more powerful proprietary models. For instance, Microsoft's main commercial model is the GPT-4 series from OpenAI, but it has also open-sourced the smaller Phi-3 Mini; Alibaba Cloud has open-sourced more than a dozen models with parameters ranging from 500 million to 110 billion, while also offering proprietary foundation and industry-specific models; Google has open-sourced the Gemma series of smaller models and provides the proprietary Gemini series of foundation models; startups such as Mistral AI have open-sourced less capable previous-generation models, guiding users to pay for the more powerful current generation.
The issue with running open-source and proprietary models in parallel is that the two can end up competing with each other commercially. Some customers, after using the free open-source models, may never move to the paid proprietary models, which costs model vendors revenue.

A technical professional from a Chinese AI software service provider told Caijing in July that the company had recently used Alibaba Cloud's Tongyi Qianwen open-source model (Qwen2) for secondary training and fine-tuning in a project for a city's tourism bureau. The order exceeded ten million yuan. The service provider was the beneficiary, while Alibaba Cloud earned no revenue from it. Caijing checked the license agreement for Qwen2 on GitHub (the world's largest code hosting platform). The agreement states "no need to submit a commercial use request," meaning that no payment is required once Qwen2 has been trained and fine-tuned for commercial use.

The long-term value of open-source is to expand the model market. A person from Alibaba Cloud told Caijing that it is normal for users to modify open-source models for commercial use, and anyone who does open-source must be prepared for this. Although Alibaba Cloud has not yet captured all the benefits, it has expanded the industry's market, and in the long run it will ultimately benefit. The large model industry needs to build an ecosystem; a growth flywheel forms once models are widely used by different customers such as governments, large and small enterprises, and developers.

This trend is visible in ModelScope, Alibaba Cloud's open-source AI community. As of July this year, ModelScope had more than 5.6 million developers, over 5,500 high-quality models, and thousands of datasets, making it the largest open-source model community in China.

A more optimistic view is that open-source and proprietary models can even form an upstream and downstream relationship.
Open-source sits upstream on the technical side, responsible for community participation, technological iteration, and attracting customers, which keeps the technology in the lead. Proprietary models sit downstream, responsible for commercial monetization.

Lanzhou Technology is a Chinese large model startup. Li Jingmei, a partner and co-CEO of Lanzhou Technology, told Caijing that open-source is both a technical strategy and a business strategy. It can influence the developer community and win mindshare among the technical teams of potential customers. There is no contradiction between open-source and proprietary models. The customer feedback cycle for proprietary models is relatively long, whereas community developers working with open-source models can give feedback quickly, which helps the company iterate its products faster.

An AI strategy planner from a top Chinese technology company believes that for leading cloud manufacturers like Alibaba Cloud, running open-source and proprietary models in parallel is better than doing only proprietary models. Alibaba Cloud's revenue mainly comes from the four major public cloud components (computing, storage, networking, and databases). Free open-source models will drive customers' consumption of business data, which in turn drives sales of these basic cloud products.

The third path is to do only proprietary models, and its logic is simple, direct, and clear. Large companies that follow this route believe that large models must be proprietary to be commercialized; otherwise the business loop cannot be closed. The AI startup OpenAI (with the GPT-4 series), Amazon (which has invested in the AI startup Anthropic, maker of the Claude 3.5 series), Huawei (with the PanGu large model), and Baidu (with the Wenxin large model) have all chosen this path.
Enterprises usually pay for large models by the number of API (Application Programming Interface) calls, much as they pay for utilities such as water, electricity, and gas based on usage. In theory, the business model of proprietary models is the healthiest. The revenue growth rates of Microsoft Azure, Amazon AWS, and Google Cloud have risen by about 5 percentage points in the past year, and their profit levels have also improved slightly. This is widely attributed to demand pulled in by large models.

However, in China, it is difficult for proprietary models to make a real profit in the short term. In May this year, a price war began in the Chinese model market. The purpose of the price cuts is to stimulate customer demand and expand the market. ByteDance's cloud service Volcano Engine, Alibaba Cloud, Tencent Cloud, and Baidu Smart Cloud have successively cut the price of large model calls by more than 90%. The gross profit margin on large model calls has dropped from over 60% to below zero.

A person in charge of the large model business at a Chinese cloud manufacturer believes that large model calls have entered the "negative gross profit era": the more the models are used, the greater the loss. The difference is that big companies like Alibaba, ByteDance, and Baidu can afford the loss, while small and medium-sized enterprises and startups cannot.

He and a senior executive of a large model startup expressed a similar view: different companies have different genes, so their model business strategies differ too. Cloud is Alibaba Cloud's core business, and the ultimate goal of open-sourcing its models is to sell more cloud services. Volcano Engine is backed by ByteDance, whose advertising business can keep it funded. Volcano Engine's share of the cloud computing market is far lower than Alibaba Cloud's; as the saying goes, "the barefoot do not fear those wearing shoes," and with little to lose it hopes to seize more market share through the price war. AI is Baidu's core business, and Baidu hopes to make a profit from large models, so it emphasizes the value of proprietary models.
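The arithmetic behind the "negative gross profit era" can be sketched in a few lines. The sketch below assumes, purely for illustration, that the unit cost of serving a call stays unchanged while the price is cut by exactly 90% from a level that carried a 60% gross margin. The article reports only "more than 90%" and "over 60%", so the normalized price and the resulting figures are assumptions rather than reported data.

```python
def gross_margin(price: float, unit_cost: float) -> float:
    """Gross margin as a fraction of price: (price - cost) / price."""
    return (price - unit_cost) / price


def cost_cut_to_break_even(price: float, unit_cost: float) -> float:
    """Fraction by which unit cost must fall for the gross margin to reach zero."""
    return max(0.0, 1 - price / unit_cost)


# Illustrative assumptions (not reported figures): price normalized to 1.0.
old_price = 1.0
old_margin = 0.60                          # "over 60%" before the price war
unit_cost = old_price * (1 - old_margin)   # cost per call implied by that margin
new_price = old_price * (1 - 0.90)         # a 90% price cut, cost held constant

print(f"margin after cut: {gross_margin(new_price, unit_cost):.0%}")
print(f"cost cut needed to break even: {cost_cut_to_break_even(new_price, unit_cost):.0%}")
```

Under these assumptions the margin falls to about -300%, and unit cost would have to drop by about 75% before calls stop losing money. This is one way to read why only vendors with other sources of income can sustain the price war.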

What is the debate? What is the consensus?

The debate over open-source and proprietary models in China has several focal points. First, is there a difference between open-source models and open-source software? Second, which is stronger, open-source models or proprietary models? Third, which is more expensive, open-source models or proprietary models?

The first debate: is there a difference between open-source models and open-source software? The answer is that there is a big difference. The vast majority of open-source models are not fully open-source. They are more like free black boxes that can be used than like the transparent boxes of open-source software.

Open-source software publishes its source code, and developers can master most of the software's technical details through that code. The core logic behind free open-source software is that developers from all over the world can help the software maker find bugs and suggest optimizations. This community-based development not only reduces R&D costs but also speeds up iteration. The mobile operating system Android and the database software MySQL both succeeded this way.

Open-source models are far more complex than open-source software, and the items that could be open-sourced include source code, parameter weights, model structure, training data, and the training process. Two scholars from Radboud University in the Netherlands, Liesenfeld and Dingemanse, published a paper in March this year comparing how open various models are. The paper shows that the best-performing open-source models usually open-source only their parameter weights. One explanation is that model manufacturers cannot disclose the full "recipe" if they want their models to stay ahead on performance.
Take Llama 3, the world's best-performing open-source model, as an example: it only partially open-sourced its parameter weights and model structure, while the source code, training data, and training process have not been open-sourced.

The value of the open-source concept to the industrial ecosystem is beyond doubt. Xin Zhou, General Manager of Baidu Smart Cloud's AI and Large Model Platform, told Caijing in July that open-source models will enrich model applications and industry models. However, he opposes conflating open-source models with open-source software, because there is an essential difference between the two: unlike open-source software, open-source models cannot improve product performance or reduce R&D costs through the participation of community developers. The base model can only be improved through the model manufacturer's own training, and the fine-tuning and inference optimization of open-source models are not as good as those of commercial models. The technical requirements for developers are high, and the actual cost of use is not low.

The second debate: which is stronger, open-source models or proprietary models? The fact is that proprietary models usually perform better than open-source models, but the gap between them is narrowing.

Stanford University's Center for Research on Foundation Models (CRFM) has long run global rankings of large model tests. In the MMLU test rankings published on July 24, Llama 3.1 is the only open-source model in the top ten; Claude 3.5 (from Amazon-backed Anthropic), GPT-4o (from Microsoft-backed OpenAI), Gemini 1.5 Pro (developed by Google), and the rest are all proprietary.

Li Jingmei believes that within the same company, the proprietary models must be stronger than the open-source ones. Across the industry, however, proprietary models are not necessarily stronger than open-source models.
Because large models iterate every 6 to 12 months, some open-source models may evolve faster.

Evaluation rankings show this trend. LMSYS Org (the Large Model Systems Organization), initiated at the University of California, Berkeley, also runs a long-term global evaluation and ranking of model performance. Meta's Llama 3.1 and Alibaba Cloud's Qwen2 are rising rapidly in its rankings, and Llama 3.1 has even surpassed most proprietary models.

A person in charge of the large model business at a Chinese cloud manufacturer gave two reasons for the narrowing gap. First, over the past year, foundation models have generally hit a bottleneck in performance improvement. Second, open-source models have attracted a large number of developers. Although these developers cannot directly improve a model's performance through code feedback, they have raised the overall level of model research, which indirectly helps open-source models improve.

The third debate: which is more expensive, open-source models or proprietary models? The conclusion is that performance is the decisive factor. The cost of using a model is directly related to its performance. The stronger the performance, the lower the long-term usage cost, because fewer calls are needed to complete a task.

Open-source models are free and usually give people the impression of being cheaper. Xin Zhou explained, however, that applying large models is a comprehensive solution of "technology + service," and enterprises need to work out the total cost. Proprietary model manufacturers provide not only complete models and toolchains but also training and technical services that help enterprises get started quickly.
Open-source models appear to be free, but achieving the same results as proprietary models requires a large later investment of manpower, funds, and time, so the overall cost is actually higher.

In the long run, the decisive factor in the application cost of open-source and proprietary models is inference cost. At the same parameter scale, proprietary models usually perform better than open-source models, and their overall cost is also lower. Xin Zhou walked through the numbers: suppose an enterprise deploys an open-source model for free, while deploying a proprietary model costs 500,000 yuan. In the early stage, the open-source model is cheaper. Later on, if the proprietary model is 20% stronger in overall performance, it can save some heavy-usage enterprises tens of thousands of yuan a day. In the end, by his account, its long-term usage cost will be far lower than that of the open-source model.
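Xin Zhou's comparison is essentially a break-even calculation, which can be sketched as follows. The 500,000 yuan upfront difference comes from his example; the daily saving of 20,000 yuan is a hypothetical value picked from his "tens of thousands of yuan a day" range, and the function names are illustrative only.

```python
import math


def break_even_days(upfront_gap_yuan: float, daily_saving_yuan: float) -> int:
    """First day on which the higher-upfront option is no more expensive in total."""
    if daily_saving_yuan <= 0:
        raise ValueError("daily saving must be positive for a break-even point to exist")
    return math.ceil(upfront_gap_yuan / daily_saving_yuan)


def cumulative_cost(upfront_yuan: float, daily_cost_yuan: float, days: int) -> float:
    """Total spend after a number of days: upfront cost plus running cost."""
    return upfront_yuan + daily_cost_yuan * days


# 500,000 yuan upfront gap (from the example); 20,000 yuan/day saving is assumed.
days = break_even_days(500_000, 20_000)
print(f"break-even after {days} days")
```

With these inputs the upfront difference is recovered in 25 days. The conclusion is only as strong as the assumed daily saving, which depends on call volume and on whether the performance gap actually translates into fewer calls.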

Who is using open-source models? Who is using proprietary models?

Which is better, open-source or proprietary? The answer is determined not by the model manufacturers on the supply side but by the enterprise customers on the demand side.

In public, companies argue constantly. However, technicians at many cloud manufacturers told Caijing that these disputes cannot negate either side's market value. The two kinds of demand will coexist for a long time. Seen from another angle, the disputes more likely raise the profile of the whole market.

In fact, most enterprise customers do not care whether a model is open-source. Xin Zhou said that, based on conversations with many large enterprise customers, the head of an IT department weighs many factors when deciding whether to use a model, usually in this order of priority: effect, performance, price, and security. Open-source versus proprietary is not a decisive factor.

In the "toolbox" of most enterprises, open-source models and proprietary models are complementary. Large enterprises usually implement large models in different stages.

In the early stage, the IT department surveys the performance and characteristics of the open-source and proprietary models on the market. Different models have different strengths: some are strong in language and voice, others in data statistics. At this stage, free open-source models are used for POC (Proof of Concept) tests to verify the business effect.

In the middle stage, the first phase of the project is carried out in business scenarios with low difficulty and fast results, such as marketing, customer service, and knowledge bases. The enterprise needs to purchase proprietary models and also train and fine-tune a set of its own open-source models. It lets internal and external models run a "horse race," compares the effects and costs of different models, and shifts usage between them at any time.

In the later stage, depending on how the deployment has gone, the enterprise plans the second and third phases step by step in business scenarios with high difficulty and slow results. At this point, it may even spend tens of millions of yuan to establish its own controllable foundation model or industry model.

Open-source models are free, but they are not plug-and-play: they take time and effort to set up, and no one takes responsibility for any issues that arise. Proprietary models, on the other hand, offer a mature product with full-service support from pre-sale to post-sale. To simplify, open-source models are like buying groceries and cooking at home, while proprietary models are like dining out at a restaurant, where you pay for convenience and service.

Xin Zhou's view is that open-source models are suitable for academic research, for small and medium-sized enterprises (SMEs) with extremely limited IT budgets, and for internally controlled self-research projects at some large enterprises. However, they are not suitable for large-scale commercial projects facing the public.
In serious commercial projects that can cost millions or even tens of millions of dollars, proprietary models remain the best choice.

Open-source models are not a free lunch. Large enterprises face many hidden costs when using them, such as purchasing computing power and adapting software. The technical head of a Chinese provider of intelligent marketing services for companies expanding overseas told Caijing in July that his company relies heavily on cloud services, with annual R&D spending above 80 million yuan. Over the past two years, the company has used more than ten models, all of them proprietary and none open-source. In his view, open-source models take time and manpower to work with, most cannot be used out of the box, and no one takes responsibility for them, so they can only be considered "toys." He prefers to manage more than ten proprietary models and switch among them at any time according to price and performance, which he considers the most cost-effective approach.

The IT head of a large joint-stock commercial bank believes that the inability to use open-source models out of the box is not a big problem. He told Caijing in December 2023 that his team has been using models from Alibaba (the Tongyi open-source model), Meta (the Llama open-source model), Baidu (the Wenxin series), and Zhipu (the GLM series) in a self-developed application for auditing compliance reports. Open-source models suit such small projects, allowing free POC testing and modification as needed. His IT team has dozens of people and is supported by outsourced IT service companies, so there is enough manpower to handle these issues. However, he also believes that in large projects worth millions or tens of millions of dollars, proprietary models are more appropriate because they are stable and reliable, and there are model companies that can take responsibility.
Training a complete industry model on top of open-source models requires tens of millions of dollars, and it also means purchasing AI chips and building your own data centers. The technical professional from the AI software service provider mentioned above summed it up this way: open-source models suit some central and state-owned enterprises that have high requirements for data security and self-control and are not especially sensitive to cost. They use open-source models to train their own industry models, because "open-source models + private cloud" meets their demands for data security and self-control.
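The "horse race" practice described in this section, in which an enterprise compares models on effect and cost and shifts usage between them, can be sketched as a simple selection rule. All model names, quality scores, and prices below are hypothetical placeholders, and the rule itself (cheapest model that clears a quality bar) is only one possible policy, not one attributed to any company in the article.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str                  # hypothetical model identifier
    quality: float             # score on the enterprise's own evaluation set, 0..1
    cost_per_1k_calls: float   # yuan per thousand calls, including hosting if self-run


def pick_model(candidates: Sequence[Candidate], min_quality: float) -> Optional[Candidate]:
    """Return the cheapest candidate whose quality meets the bar, or None if none does."""
    eligible = [c for c in candidates if c.quality >= min_quality]
    if not eligible:
        return None
    return min(eligible, key=lambda c: c.cost_per_1k_calls)


# Hypothetical line-up: one fine-tuned open-source model and two proprietary APIs.
candidates = [
    Candidate("finetuned-open-model", quality=0.78, cost_per_1k_calls=2.0),
    Candidate("proprietary-api-a", quality=0.85, cost_per_1k_calls=6.0),
    Candidate("proprietary-api-b", quality=0.90, cost_per_1k_calls=9.0),
]

choice = pick_model(candidates, min_quality=0.80)
print(choice.name if choice else "no model meets the bar")
```

Re-running the selection whenever prices or evaluation scores change is what lets an enterprise switch at any time. In practice the cost of a self-hosted open-source model would also need to include the computing power and adaptation work that the interviewees describe as hidden costs.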

How to move forward in the future?

The debate between open-source and proprietary models in the Chinese market is purely a commercial issue. In the international market, however, it also involves factors such as antitrust and national interests.

After the price war that began in May this year, large model calls in China entered the "negative gross profit era." Open-source and proprietary models face the same problem: large models cannot turn a profit directly.

"The knockout round in the large model market has begun," said a person in charge of the large model business at a Chinese cloud manufacturer. Negative gross profit on large model calls means that, in the short term, the more calls there are, the greater the loss for cloud manufacturers. Chinese cloud manufacturers are betting that after call prices are cut by 90%, the number of calls will grow exponentially over the next 1-2 years. In the long run, cloud manufacturers' computing costs will be spread across growing customer demand, and the business will eventually turn a profit. Even if this bet does not pay off, a group of model manufacturers will die in the price war, and the survivors will pick up what remains of the market.

Many people in the industry expressed the same view to Caijing: this knockout round will last 1-2 years, and only 3-5 foundation model companies will survive. An Xiaopeng, head of the Alibaba Cloud Intelligence Technology Research Center and a member of the Chinese Informationization Hundred People's Committee, said in July this year that there is no "war of a hundred models" in China, and not even a war of ten. Large models require continuous investment, clusters of ten thousand or even a hundred thousand accelerator cards, and commercial returns. Many companies do not have such capabilities. In the future, there will be only three to five foundation model manufacturers in the Chinese market.

Who benefits from the price war? Who will have the last laugh? The AI strategy planner from the top Chinese technology company mentioned above believes that in this round of the price war, Alibaba Cloud and ByteDance's Volcano Engine have the deepest reserves. Alibaba Cloud can profit from its cloud business, and Volcano Engine has ByteDance's advertising business to fund it. In a price war, Baidu is not as well placed as Alibaba and ByteDance. However, Baidu's Wenxin large model is technically strong, and there will be a group of customers willing to pay for the technology, which helps Baidu withstand the price war.

He added that in the next 1-2 years, several large model startups in the Chinese market will face a severe test. They will either become project-based model development companies or turn to vertical industry models.

The overall competition in the Chinese large model market is far more important than the local competition between open-source and proprietary models. The direction of overall competition will directly determine the results of local competition.

A person from Alibaba Cloud said frankly that open-source and proprietary models each have their own advantages, and Alibaba Cloud hopes to make AI more inclusive. Whether open-source or proprietary, the core purpose is to give developers more choices. Alibaba Cloud has chosen to walk on both legs, offering open-source models in all sizes and modalities alongside proprietary models.

Another person in charge of the large model business at a Chinese cloud manufacturer believes that open-source has no business model. In the Chinese model market, only the top enterprises, or the very few startups that can keep raising money, can persist with open-source. The Chinese market may eventually have only 1-2 open-source models.

Model manufacturers train a new generation of models every 6-12 months. In the Chinese model market, as the pressure to make a profit increases, open-sourcing may become more and more "strategic": companies will tend to open-source previous-generation, technically older models with fewer parameters, and guide users to pay for newer proprietary models with more parameters.

The competition between open-source and proprietary models will not end in the short term. Some companies can even pursue both at the same time. In the IT industry, this is not without precedent. Databases have existed for more than 60 years, and the first open-source database appeared more than 50 years ago. The database market is still busy with different proprietary and open-source databases, and new database brands continue to emerge. The database giant Oracle even owns both proprietary RDBMS products and the open-source MySQL database.

Many technical personnel at cloud manufacturers believe that open-source and proprietary models will coexist for a long time.
The large model market will gradually grow in the competition of different technical routes.