Quant investing originated in the United States, which is why almost all of the founding teams behind China’s leading quant funds have, to some extent, experience working at U.S. or European hedge funds. High-Flyer, however, is an exception: it was founded entirely by a local team and has grown independently through its own exploration. The founder of a leading quant fund remarked that High-Flyer "has never followed conventional paths" and does things "in their own way." Even when its approach is unorthodox or controversial, the team will "boldly articulate their views and act accordingly."

On Research and Exploration

"Do the most important and difficult things."

Waves: High-Flyer recently announced its entry into the large-model space. Why is a quant fund undertaking such an endeavor?

Liang Wenfeng: Our large-model project is unrelated to our quant and financial activities. We’ve established an independent company called DeepSeek to focus on this.

Many in our High-Flyer team come from an AI background. Years ago, we experimented with various applications before entering the complex domain of finance. AGI may be one of the next most challenging frontiers, so for us, the question is not "why" but "how".

Waves: Are you training a general-purpose model, or focusing on vertical domains like finance?

Liang: We’re working on AGI—Artificial General Intelligence. Language models are likely a prerequisite for AGI and already exhibit some AGI characteristics. So we’ll start there and later expand into areas like computer vision.

Waves: Due to the entry of tech giants, many startup companies have abandoned the pursuit of solely developing general-purpose large models.

Liang: We won’t prematurely focus on applications. Our focus is solely on the large model itself.

Waves: Some say it’s too late for startups to enter this space after tech giants have reached a consensus.

Liang: Currently, neither tech giants nor startups have an unassailable lead. With OpenAI paving the way, everyone is working with published papers and open-source code. By next year, both groups will likely have their own large language models.

Both major corporations and startups have their own opportunities. Existing vertical scenarios are not controlled by startups, making this phase less favorable for them. However, as these scenarios involve dispersed and fragmented niche demands, they are actually better suited to the flexibility of entrepreneurial organizations. In the long term, as the barriers to applying large models continue to lower, startups will have opportunities to enter the field at any time over the next 20 years.

Our goal is clear: to focus on research and exploration rather than vertical domains and applications.

Waves: Why do you define your goal as "to focus on research and exploration"?

Liang: It’s driven by curiosity. From a broader perspective, we want to validate certain hypotheses. For example, we hypothesize that the essence of human intelligence might be language, and human thought could essentially be a linguistic process. What you think of as "thinking" might actually be your brain weaving language. This suggests that human-like AGI could potentially emerge from large language models.

From a closer perspective, GPT-4 still holds many mysteries waiting to be unraveled. While reproducing it, we are also conducting research to uncover these secrets.

Waves: But research comes at a higher cost.

Liang: Reproduction alone is relatively cheap: working from public papers and open-source code, a few training runs, or even just fine-tuning, are enough. Research, however, involves extensive experiments and comparisons, and it places higher demands on both computing power and talent.

Waves: How do you fund research?

Liang: High-Flyer is one of our investors, with ample R&D budgets. Additionally, we have several hundred million RMB allocated annually for philanthropy, which we could redirect if necessary.

Waves: However, building foundational large models requires at least two to three hundred million dollars just to get a seat at the table. How can you sustain such continuous investment?

Liang: We're in discussions with different funding sources. From our interactions so far, many VCs seem hesitant about investing in research. They have exit requirements and prioritize rapid product commercialization, which makes it difficult to secure funding from them given our research-first approach. But we already have computing power and an engineering team, which is equivalent to already holding half the chips.

Waves: What analyses and projections have been made regarding the business model?

Liang: What we’re considering now is to make most of our training results publicly available in the future, which could also align with commercialization efforts. We hope that more people, even small app developers, can access large models at a low cost, rather than the technology being controlled by only a few individuals or companies, leading to monopolization.

Waves: Tech giants will also offer services at later stages. What differentiates you from them?

Liang: Giants may integrate their models with their platforms or ecosystems. Our offering is entirely open and independent.

Waves: After all, a commercial company embarking on limitless research seems irrational.

Liang: It might be hard if we must find a commercial justification, because it’s not cost-effective.

From a business perspective, fundamental research has a very low return on investment. When early investors backed OpenAI, their motivation was certainly not about how much return they would get, but a genuine desire to pursue the mission.

What we are sure of now is that we want to do this and have the ability to do it, so we’re among the best-suited candidates to tackle it at this moment.

Ten Thousand GPUs and Their Cost

"An exciting pursuit can’t always be measured in money."

Waves: GPUs are a scarce commodity in this wave of ChatGPT-related startups, yet you had the foresight to stockpile 10,000 of them as early as 2021. Why?

Liang: It was a gradual process—from a single card in the early days to 100 cards in 2015, 1,000 cards in 2019, and then 10,000 cards. Up to a few hundred cards, we relied on external Internet data centers. When the scale expanded, we began building our own facilities.

People may think there’s some hidden business logic behind this, but it’s mainly driven by curiosity.

Waves: What kind of curiosity?

Liang: Curiosity about the boundaries of AI capabilities. For many outsiders, the wave triggered by ChatGPT has been particularly disruptive; for those within the field, however, it was AlexNet in 2012 that ushered in a new era. AlexNet’s error rate was significantly lower than that of other models at the time, reviving neural network research that had been dormant for decades.

While specific technical directions have constantly evolved, the combination of models, data, and computing power has remained a constant. Especially after OpenAI released GPT-3 in 2020, the direction became clear: massive computing power would be essential. Yet even in 2021, when we were investing in the construction of Yinghuo Two, most people still couldn’t grasp the rationale.

Waves: So you started paying attention to computational power back in 2012?

Liang: Researchers have an insatiable hunger for computational resources. Small experiments often lead to a desire for larger-scale trials, prompting us to continuously expand our capacity.

Waves: Some assumed your clusters were primarily for financial market predictions.

Liang: If it were purely for quant investing, even a small number of GPUs would suffice. Our broader research aims to understand what kind of paradigms can fully describe the entire financial market, whether there are simpler ways to express it, where the boundaries of these paradigms’ capabilities lie, and whether they have broader applicability, among other questions.

Waves: But this process is also a money-burning endeavor.

Liang: An exciting endeavor perhaps cannot be measured purely in monetary terms. It’s like a family buying a piano: first, they can afford it, and second, there are people in the house eager to play beautiful music on it.

Waves: GPUs typically depreciate at about 20% annually.

Liang: We haven’t calculated precisely, but it’s likely less. NVIDIA GPUs hold their value well, and older cards still find buyers. Our previously retired GPUs still held decent value when sold second-hand, so we didn’t lose too much.

Waves: Clusters entail significant expenses: maintenance, labor, and even electricity.

Liang: Electricity and maintenance are relatively inexpensive, constituting about 1% of hardware costs annually. Labor is more significant, but it represents an investment in our future and a key asset for the company. The people we choose tend to be relatively humble and driven by curiosity, and here they have the opportunity to conduct research.

Waves: In 2021, High-Flyer was one of the first companies in the Asia-Pacific region to obtain A100 GPUs. How did you manage to acquire them earlier than some cloud providers?

Liang: We proactively tested and planned for new GPUs early on. Cloud providers historically catered to fragmented demands. It wasn’t until 2022 that some cloud providers began building the infrastructure, with the rise of autonomous driving and the need for rented machines to support training—along with the ability to pay for it. It is typically challenging for tech giants to focus purely on research or training, as their efforts are more driven by their business needs.

Waves: What’s your view of the large-model competition?

Liang: Giants certainly have their advantages. However, without rapid application deployment, they may struggle to sustain the effort, as they are driven more by the need to see results.

Leading startups also have solid technical foundations, but like the earlier wave of AI startups, they still face significant challenges in commercialization.

Waves: Some think High-Flyer’s AI emphasis is PR for its other businesses as a quant fund.

Liang: In reality, our quant fund has mostly stopped external fundraising.

Waves: How do you distinguish AI believers from opportunists?

Liang: Believers were here before and will remain after the hype. They’re the ones buying GPUs in bulk or signing long-term agreements, not just renting short-term resources.

Enabling True Innovation

"Innovation often arises naturally; it is not orchestrated, nor can it be taught."

Waves: How is DeepSeek’s recruitment progressing?

Liang: The initial team is in place. We are borrowing temporary support from High-Flyer due to a shortage of human resources in the early stages. Since ChatGPT-3.5’s surge last year, we’ve been hiring actively, but we still need more people.

Waves: Talent in large-model startups is scarce. Investors say top talent is often confined to AI labs at giants like OpenAI and Facebook AI Research. Will you recruit from overseas AI labs?

Liang: For short-term goals, hiring experienced individuals makes sense. But long-term success does not depend that much on past experiences. Rather, it depends more on foundational skills, creativity, and passion. In this sense, domestic candidates are abundant.

Waves: Why does experience matter less?

Liang: The right person doesn’t always need prior experience. High-Flyer prioritizes capability over credentials. Core technical roles are primarily filled by recent grads or those 1–2 years out.

Waves: Is experience sometimes a hindrance to innovation?

Liang: Experienced people will tell you how something should be done without hesitation, while those without experience will explore repeatedly, think carefully, and find a solution that fits the current situation.

Waves: High-Flyer went from an outsider to a top-tier quant fund within several years. Is this hiring philosophy a secret to its success?

Liang: Our core team, including myself, initially lacked quant experience, which is unique. It’s not necessarily a "secret" but part of our culture. We don’t deliberately avoid experienced individuals, but we focus more on ability.

For example, our top two salespeople were outsiders—one came from exporting German machinery, and the other wrote backend code at a securities firm. When they entered this field, they had no experience, no resources, and no prior connections.

Today, we might be the only large private equity firm primarily relying on direct sales. We don't need to share fees with intermediaries, which yields higher profit margins at the same scale and performance. Many firms have tried to imitate us, but none have succeeded.

Waves: Why hasn’t this model been successfully replicated by others?

Liang: Because this alone isn’t enough to drive innovation. It requires alignment with the company’s culture and management.

In fact, our sales team achieved nothing in their first year, and it was only in the second year that they started to see some results. But our evaluation standards are quite different from those of most companies. We don’t have KPIs or so-called quotas.

Waves: So, what are your evaluation standards for them?

Liang: Unlike most companies that focus on order volume, we don’t predefine commissions based on sales figures. Instead, we encourage our salespeople to build their own networks, connect with more people, and create greater influence.

We believe that an honest and trustworthy salesperson may not immediately drive orders in the short term, but they can make clients see them as reliable and dependable.

Waves: After selecting the right person, how do you help them get into the groove?

Liang: Assign them important tasks and avoid interfering. Let them figure things out and unleash their potential.

In reality, a company’s core essence is incredibly difficult to replicate. For example, hiring inexperienced individuals requires judging their potential and figuring out how to help them grow after they join—none of which can be directly copied.

Waves: What do you think are the necessary conditions for building an innovative organization?

Liang: In our experience, innovation requires as little intervention and management as possible, giving everyone the space to explore and the freedom to make mistakes. Innovation often arises naturally—it’s not something that can be deliberately planned or taught.

Waves: This is unconventional. How do you ensure that people work efficiently and head in the desired direction under such circumstances?

Liang: We ensure value alignment when hiring and rely on culture to maintain direction. There’s no written corporate culture, as rules can stifle innovation. More often, it’s about leadership setting an example—how you make decisions can become an unspoken guideline.

Waves: In this AI wave, could such an innovative structure of startups be a decisive edge against tech giants?

Liang: Conventional wisdom often concludes that startups with such ambitions can't survive. However, in an ever-changing market, true success hinges on adaptability and the ability to adjust, rather than on fixed rules or conditions. Many giants struggle with inertia and can’t respond quickly to change, and this wave of AI will undoubtedly birth new companies.

True Madness

"Innovation is expensive, inefficient, and sometimes wasteful."

Waves: What excites you most about this endeavor?

Liang: Verifying whether our hypotheses are correct. If they are, that’s immensely satisfying.

Waves: What are your must-have criteria when hiring talent for large models this time?

Liang: Passion and solid foundational skills. Everything else is secondary.

Waves: Are such individuals easy to find?

Liang: Their passion usually shows—they genuinely want to do this and they are often the ones actively seeking you out as well.

Waves: Large models may require endless investment. Does the cost make you hesitant?

Liang: Innovation is inherently expensive and inefficient, often accompanied by waste. That’s why it only emerges when economic development reaches a certain level. When resources are scarce or in industries not driven by innovation, cost and efficiency become essential. Even OpenAI only succeeded after burning through substantial funding.

Waves: Do you see your endeavor as madness?

Liang: I’m unsure if it’s madness, but many inexplicable phenomena exist in this world. Take many programmers, for example—they’re passionate contributors to open-source communities. Even after an exhausting day, they still dedicate time to contributing code.

Waves: There is a sense of spiritual reward in it.

Liang: It's like walking 50 kilometers—your body is completely exhausted, but your spirit feels deeply fulfilled.

Waves: Do you think curiosity-driven madness can last in the long term?

Liang: Not everyone can stay passionate their entire life. But most people, in their younger years, can wholeheartedly dedicate themselves to something without any materialistic aims.