TLDR
- Nebius (NBIS) has entered into an agreement to purchase model optimization and inference specialist Eigen AI in a deal valued at roughly $643 million, structured as a combination of cash and Class A shares.
- The acquisition will bring Eigen AI’s optimization capabilities into the Nebius Token Factory, the company’s enterprise-focused managed inference solution.
- MIT HAN Lab alumni who founded Eigen AI will launch Nebius’s inaugural engineering and research facility in the San Francisco Bay Area.
- Collaborative optimization efforts between both organizations have already achieved top rankings on Artificial Analysis performance benchmarks.
- Following the announcement, NBIS shares climbed 8.51% to reach $150.00, recovering from a 6.07% downturn in the previous week.
On May 1, 2026, Nebius (NBIS) revealed its intention to purchase Eigen AI in a transaction worth approximately $643 million. The acquisition will be financed through a combination of cash and Nebius Class A shares, calculated using the company’s 30-day volume-weighted average share price at the time of signing. Market response was immediate, with NBIS climbing 8.51% to $150.00.
The deal is expected to close in the coming weeks, subject to antitrust regulatory approval and other customary closing conditions.
Eigen AI specializes in inference optimization and model performance enhancement. The company’s solutions enable AI development teams to deploy open-source models more efficiently and cost-effectively in live environments, eliminating the need for custom-built optimization infrastructure.
Nebius intends to integrate Eigen AI’s technology seamlessly into Token Factory, its managed inference offering. Token Factory delivers autoscaling API endpoints and fine-tuning capabilities for prominent open-source models such as Llama, DeepSeek, Qwen, Gemma, and additional architectures.
The partnership between the two organizations predates this acquisition. Prior to the announcement, they collaborated on optimized model deployments that achieved leading positions on Artificial Analysis, a prominent AI performance evaluation platform.
What Eigen AI Brings to the Table
Eigen AI emerged from MIT’s HAN Lab research group. The company’s co-founders, Ryan Hanrui Wang and Wei-Chen Wang, developed two highly influential methodologies in production AI infrastructure.
Ryan’s research on Sparse Attention (SpAtten) has become the most-referenced HPCA publication since 2020. Wei-Chen’s development of Activation-aware Weight Quantization (AWQ) earned the MLSys 2024 Best Paper Award and has established itself as the industry standard for 4-bit model deployment.
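AWQ's core observation is that a small fraction of weight channels matter disproportionately because they meet large activations, so those channels are scaled up before quantization to preserve their precision, and the scaling is undone at dequantization. The sketch below is a minimal illustration of that idea, not Eigen AI's implementation; the function name, the fixed 0.5 scaling exponent (AWQ searches over this), and the group size are assumptions for illustration.

```python
import numpy as np

def quantize_4bit_awq_sketch(W, act_scale, group_size=128):
    """Illustrative activation-aware 4-bit weight quantization.

    W:         (out_features, in_features) floating-point weights.
    act_scale: (in_features,) per-channel activation magnitudes.
    """
    # Per-channel scale from activation statistics; AWQ searches the
    # exponent, here fixed at 0.5 for simplicity.
    s = act_scale ** 0.5
    s = s / s.mean()          # normalize so scales average to 1
    Ws = W * s                # amplify salient (high-activation) channels

    # Group-wise symmetric 4-bit quantization along the input dimension
    out, d = Ws.shape
    Wg = Ws.reshape(out, d // group_size, group_size)
    scale = np.abs(Wg).max(axis=-1, keepdims=True) / 7.0 + 1e-8  # int4 range [-8, 7]
    q = np.clip(np.round(Wg / scale), -8, 7)

    # Dequantize and fold the channel scaling back out
    return (q * scale).reshape(out, d) / s
```

In a real deployment the 4-bit integers and group scales are stored packed, and the inverse channel scale is fused into the preceding layer rather than applied at runtime; the sketch only shows the round trip.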
Third co-founder Di Jin earned his doctorate from MIT CSAIL and played a direct role in developing Meta’s Llama 3 and Llama 4 post-training processes. His work includes co-authoring the CGPO reinforcement learning from human feedback methodology.
Upon completion of the transaction, the Eigen AI team will establish operations in the San Francisco Bay Area, creating Nebius’s first American engineering and research center.
The Inference Market Context
Inference has emerged as the most rapidly expanding segment within the AI compute landscape. Current projections indicate it will account for approximately two-thirds of overall AI computational requirements throughout 2026.
Efficient inference deployment presents significant technical challenges. The process encompasses model representation, GPU kernel optimization, and dynamic workload management — capabilities that most organizations lack internally.
Open-source models compound these difficulties, as they are generally released without optimization. Contemporary architectures including Mixture-of-Experts and Compressed Sparse Attention present additional obstacles related to memory utilization and computational efficiency that demand specialized expertise.
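The memory pressure described above is easy to quantify at the level of weight storage alone. A rough back-of-the-envelope estimate (ignoring KV cache and activations, which add substantially more) shows why low-bit quantization such as AWQ changes what hardware a model fits on; the 70B figure below is an illustrative Llama-class size, not a claim about any specific deployment.

```python
def weight_memory_gb(n_params_billions, bits):
    """Approximate weight memory for a model, weights only
    (excludes KV cache, activations, and runtime overhead)."""
    return n_params_billions * 1e9 * bits / 8 / 1e9

fp16 = weight_memory_gb(70, 16)  # 140.0 GB -> needs multiple GPUs
int4 = weight_memory_gb(70, 4)   #  35.0 GB -> fits far smaller hardware
```

The 4x reduction in weight memory is what makes single-node or even single-GPU serving of large open-source models plausible, which is the practical payoff of the quantization work discussed here.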
Eigen AI’s comprehensive optimization methodology encompasses post-training refinement, fine-tuning procedures, and production-grade inference across all leading open-source model families. The company’s kernel-level and model-specific techniques are engineered to maximize performance from current hardware infrastructure without requiring additional development resources.