Data Depletion: Neural scaling laws establish that model performance improves predictably with data quantity, so training datasets must keep growing by roughly 0.38 orders of magnitude (about 2.4×) per year. Despite the internet's vast resources, the stock of high-quality human-generated text is bounded at approximately 4×10¹⁴ tokens, and researchers predict exhaustion of public text data by 2028 (potentially as early as 2026 if data is reused aggressively).
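A back-of-envelope projection illustrates this timeline. The sketch below assumes a starting training-set size of roughly 15 trillion tokens in 2024 (on the order of recent frontier training runs) and applies the 2.4× annual growth rate quoted above; both the starting point and the strict exponential growth are illustrative assumptions rather than figures from the cited projections.

```python
# Back-of-envelope projection of when dataset demand exceeds the stock of
# high-quality public text. Starting size and strict 2.4x annual growth are
# illustrative assumptions, not figures from the cited studies.
STOCK = 4e14                  # ~4x10^14 tokens of high-quality public text
tokens, year = 15e12, 2024    # assumed frontier training-set size in 2024

while tokens < STOCK:
    tokens *= 2.4             # ~0.38 orders of magnitude per year
    year += 1

print(f"Projected exhaustion around {year} (~{tokens:.1e} tokens demanded)")
# -> Projected exhaustion around 2028 (~5.0e+14 tokens demanded)
```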
Computational Monopoly: The AI computing landscape is dominated by a few tech giants (OpenAI, Google, Microsoft, Meta) that control the most advanced hardware resources, creating significant barriers for smaller organizations. Since 2010, AI training compute demand has grown by roughly 3.9× per year, accelerating to 13.4× per year since 2022 with the rise of large language models (See Figure 1). The challenge is compounded by the slowing of Moore's Law as silicon-based chip technology approaches its physical limits, while semiconductor manufacturing bottlenecks constrain infrastructure build-out against exponentially growing deployment needs.
Figure 1: AI computational demands have grown exponentially, reaching an unprecedented 13.4× annual growth rate since 2022, pushing against physical and economic limits.
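To put these rates in perspective, a short compounding calculation using only the growth figures quoted above shows how quickly demand outpaces a Moore's-law-style doubling of hardware every two years (the two-year doubling baseline is an illustrative assumption):

```python
# Compounding of LLM-era training-compute demand vs. an assumed
# Moore's-law-style hardware baseline of ~2x every two years.
years = 3
demand_growth = 13.4 ** years        # 13.4x per year since 2022
hardware_growth = 2 ** (years / 2)   # assumed ~2x transistor density per 2 years

print(f"Compute demand after {years} years:       ~{demand_growth:,.0f}x")
print(f"Hardware improvement after {years} years: ~{hardware_growth:.1f}x")
# -> demand grows ~2,406x while the hardware baseline improves ~2.8x
```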
Untapped Edge Data: Edge devices offer a crucial way around the data exhaustion problem. By 2025, global data volume is projected to reach 182 ZB, with IoT devices contributing a rapidly growing share, increasing from 13.6 ZB in 2019 to 79.4 ZB in 2025 (See Figure 2). Smartphone data volume specifically is forecast to grow from 5 EB in 2018 to 8 EB by 2028, and the smartphone data accumulated over the past 5 years (before 2025) is estimated at approximately 33.1 EB (See Figure 3).
Figure 2: Global data volume is growing exponentially, with IoT devices contributing significantly to this growth (from 33.2% to 43.6% over 2015-2025).
Figure 3: Smartphone data volume is projected to grow from 5 EB in 2018 to 8 EB by 2028, while the edge computing market is forecasted to surge from $5.5 billion to $87.9 billion in the same period.
Key Insight: The smartphone data accumulated over the past 5 years (before 2025) is estimated at approximately 33.1 EB. Because this data stays close to where it is generated, it offers distinct privacy and real-time advantages, demonstrating the enormous potential of edge devices for AI training.
Collective Computing Power: Computing capabilities across edge devices have shown sustained growth, with desktop, laptop, and mobile devices demonstrating annual growth rates of 1.29×, 1.20×, and 1.20× respectively (See Figure 4). This growth trajectory presents a stark contrast to the unsustainable computational demands of centralized AI training. In the smartphone sector specifically, the collective computing power has increased dramatically from 817 EFLOPS in 2020 to 2,758 EFLOPS in 2024, with Samsung (27.5%), Xiaomi (25.5%), and Apple (17.2%) contributing significantly to this growth through their advanced mobile processors (See Figure 5).
Figure 4: The yearly growth in compute requirements shows the unsustainable trajectory of centralized AI training, highlighting the need for more efficient and distributed approaches to advance AI capabilities.
Figure 5: The collective computing capabilities of smartphones have grown dramatically from 817 EFLOPS in 2020 to 2,758 EFLOPS in 2024, providing a formidable distributed training resource.
Key Insight: The cumulative smartphone computing power over the past 5 years totals 9,278 EFLOPS. Modern flagship devices achieve over 2 TFLOPS per unit, so roughly 30 current-generation smartphones working in parallel can match the computational capacity of an H100 GPU (59.30 TFLOPS), demonstrating the potential for distributed AI training.
This vast distributed computing resource presents a practical path to democratizing AI development: training a state-of-the-art model comparable to DeepSeek-v3 would require approximately 60,723 edge devices working in parallel for one week. Given the billions of smartphones in use globally, this represents a highly feasible approach to distributed AI training, potentially breaking the computational monopoly of large tech companies.
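Both estimates reduce to simple throughput arithmetic. The sketch below reproduces the smartphone-to-H100 comparison from the per-device figures quoted above and states the general device-count formula; the total-FLOPs value in the example call is a hypothetical placeholder, not a reported compute budget for DeepSeek-v3.

```python
# Throughput arithmetic behind the device-count comparisons above.
H100_TFLOPS = 59.30     # H100 figure quoted above
PHONE_TFLOPS = 2.0      # modern flagship smartphone, per the figure quoted above

# How many phones match one H100?
print(f"~{H100_TFLOPS / PHONE_TFLOPS:.0f} phones per H100")   # -> ~30

def devices_needed(total_flops: float, per_device_flops: float, days: float) -> float:
    """Devices required to finish a run of `total_flops` within `days`."""
    return total_flops / (per_device_flops * days * 24 * 3600)

# Example with a placeholder 7e22-FLOP budget on 2-TFLOPS devices over one week.
print(f"{devices_needed(7e22, 2e12, 7):,.0f} devices")
```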
Small Language Models at Edges: Deploying compact language models on edge devices significantly reduces computational and memory requirements while maintaining acceptable performance. Recent advancements in model compression, knowledge distillation, and quantization have enabled increasingly powerful yet efficient models on resource-constrained devices.
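As one concrete example of such techniques, post-training dynamic quantization in PyTorch stores linear-layer weights as 8-bit integers with a single call; the toy model below is an illustrative stand-in, not a specific edge deployment from the works discussed.

```python
import torch
import torch.nn as nn

# Toy stand-in for a compact language-model block (illustrative only).
model = nn.Sequential(
    nn.Embedding(32000, 256),
    nn.Linear(256, 1024),
    nn.ReLU(),
    nn.Linear(1024, 256),
)

# Post-training dynamic quantization: nn.Linear weights are stored as int8
# and dequantized on the fly, cutting their memory footprint roughly 4x.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)
print(quantized)  # Linear layers are replaced by DynamicQuantizedLinear
```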
Collaborative Inference: Distributing inference across multiple devices allows more complex models to run without requiring any single device to handle the entire computational load. This approach enables larger models than possible on individual devices while maintaining low latency and reducing bandwidth requirements.
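A minimal way to picture this is a layer-wise (pipeline) split, in which one device runs the first part of the network and ships only the intermediate activations to a second device; the two-stage split and tensor sizes below are illustrative assumptions rather than a published partitioning scheme.

```python
import torch
import torch.nn as nn

# Toy two-stage split; in a real deployment each stage would live on a
# different edge device and only the activation tensor would cross the network.
stage_a = nn.Sequential(nn.Linear(512, 1024), nn.ReLU())                  # device A
stage_b = nn.Sequential(nn.Linear(1024, 512), nn.ReLU(),
                        nn.Linear(512, 32000))                            # device B

def collaborative_forward(x: torch.Tensor) -> torch.Tensor:
    hidden = stage_a(x)        # computed on device A
    payload = hidden.detach()  # only activations are transmitted, not weights
    return stage_b(payload)    # completed on device B

logits = collaborative_forward(torch.randn(1, 512))  # placeholder input features
print(logits.shape)            # torch.Size([1, 32000])
```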
Collaborative Training: Federated learning enables model training across distributed devices without requiring data to leave the device, preserving privacy while leveraging collective computational power. Recent initiatives like Prime Intellect's INTELLECT-1 project (10B parameter model) utilize the OpenDiLoCo framework to significantly reduce inter-node communication costs, achieving 83% compute utilization across 112 H100 GPUs in five countries. Similarly, Flower Lab's FlowerLLM has successfully trained a 1.3B parameter model using novel federated learning methods.
Figure 6: Federated learning enables distributed training across edge devices while preserving privacy, allowing collaborative model development without compromising sensitive data.
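At the core of such training is a FedAvg-style aggregation step: each device updates the model on its local data, and only the parameters (never the raw data) are averaged by a coordinator. The sketch below is a minimal, generic version of that averaging step under toy shapes, not the specific protocol used by OpenDiLoCo or FlowerLLM.

```python
from typing import Dict, List
import torch

def fedavg(client_states: List[Dict[str, torch.Tensor]],
           client_sizes: List[int]) -> Dict[str, torch.Tensor]:
    """Weighted average of client parameters (FedAvg aggregation step).

    Only model parameters leave the devices; the raw training data never does.
    """
    total = sum(client_sizes)
    return {
        name: sum(state[name] * (n / total)
                  for state, n in zip(client_states, client_sizes))
        for name in client_states[0]
    }

# Usage with toy local models and client data-set sizes (illustrative only).
clients = [torch.nn.Linear(8, 2).state_dict() for _ in range(3)]
global_state = fedavg(clients, client_sizes=[100, 400, 250])
print({k: v.shape for k, v in global_state.items()})
```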
A key limitation of traditional federated learning is that each participant must maintain a complete copy of the model locally, which is impractical for large language models. Novel approaches are being developed to relax this constraint, enabling distributed training across devices with varying computational capabilities and showing promise for large-scale collaborative training in privacy-sensitive domains where compute is unevenly distributed.
AI Democratization: By leveraging edge resources, we can create a more inclusive environment where diverse participants collectively develop and train powerful language models. This reduces dependence on major tech companies and promotes a more diverse AI research and applications ecosystem.
Key Insight: Distributed edge training significantly lowers barriers to AI development participation, enabling smaller organizations, academic institutions, and individual developers to contribute to large language model training and innovation.
Privacy and Data Ownership: Federated learning allows data to remain on user devices, reducing privacy risks and giving users greater control over their data—increasingly important with stringent global privacy regulations.
Environmental Sustainability: Edge computing utilizes idle computing capacity of existing devices, reducing energy consumption and the need for dedicated data centers. Leveraging billions of devices already in operation lowers the carbon footprint associated with AI training infrastructure.
Distributed edge devices represent a vast untapped resource that can drive the next generation of large language model development. By leveraging these resources, we can overcome data depletion and computational monopoly challenges while enabling more efficient, private, and inclusive AI advancement.
Looking ahead, we anticipate:
Key Outlook: The distributed capacity of edge devices will foster a democratized AI ecosystem where developers worldwide can participate in training and applying large language models, addressing broader societal needs and unlocking new possibilities for AI innovation.