Untether AI: Record-Breaking MLPerf Benchmarks
Untether AI has excelled in MLPerf® Inference v4.1 benchmarks, achieving top performance and energy efficiency with its At-Memory architecture and speedAI®240 Slim accelerator cards, alongside the imAIgine® SDK for deployment.
Untether AI has achieved record-breaking results in the MLPerf® Inference v4.1 benchmarks, posting the highest performance, lowest latency, and best energy efficiency in the Image Classification benchmark (ResNet-50 v1.5). The company’s At-Memory architecture, scalable accelerators, and automated software tool flow let it meet the growing compute demands of AI while minimizing energy consumption. Its speedAI®240 Slim accelerator cards are highlighted for their performance and cost efficiency on demanding inference workloads. The imAIgine® Software Development Kit (SDK) handles deployment of neural networks onto Untether AI’s acceleration solutions and includes a model garden and advanced analysis tools. MLPerf® benchmarks, developed by the MLCommons Association, serve as an industry standard for measuring the performance of AI hardware and software, helping developers assess advances in AI technology.
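For context on what the Image Classification benchmark measures, the sketch below times ResNet-50 inference and reports tail latency and throughput. This is a simplified illustration only, not the official MLPerf harness (real submissions use MLCommons' LoadGen library under strict run rules), and the batch size and iteration counts here are arbitrary assumptions.

```python
# Simplified ResNet-50 inference timing sketch; NOT the MLPerf harness.
import time
import torch
from torchvision.models import resnet50

model = resnet50(weights=None).eval()   # untrained weights: we time compute, not accuracy
batch = torch.randn(8, 3, 224, 224)     # hypothetical batch size of 8

with torch.no_grad():
    for _ in range(10):                 # warm-up iterations, excluded from timing
        model(batch)

    latencies = []
    for _ in range(50):                 # timed iterations
        start = time.perf_counter()
        model(batch)
        latencies.append(time.perf_counter() - start)

latencies.sort()
p99 = latencies[min(int(len(latencies) * 0.99), len(latencies) - 1)]
throughput = len(latencies) * batch.shape[0] / sum(latencies)
print(f"p99 latency: {p99 * 1e3:.1f} ms, throughput: {throughput:.0f} images/s")
```

An actual MLPerf Inference submission additionally fixes the input pipeline and accuracy targets and runs under defined query scenarios, while the power category pairs these measurements with wall-power readings to yield the energy-efficiency figures cited above.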
- Untether AI leads in MLPerf® Inference v4.1 benchmarks for performance and efficiency.
- The company's At-Memory architecture supports high performance with low energy consumption.
- speedAI®240 Slim accelerator cards offer significant power and cost efficiency for AI tasks.
- The imAIgine® SDK simplifies neural network deployment on Untether AI's devices.
- MLPerf® benchmarks provide a standardized measure for AI hardware and software performance.
Related
Geekbench AI 1.0
Geekbench AI 1.0 has been released as a benchmarking suite for AI workloads, offering three performance scores, accuracy measurements, and support for multiple frameworks across various platforms, with future updates planned.
Cerebras Inference: AI at Instant Speed
Cerebras launched its AI inference solution, claiming to process 1,800 tokens per second, outperforming NVIDIA by 20 times, with competitive pricing and plans for future model support.
Cerebras reaches 1800 tokens/s for 8B Llama3.1
Cerebras Systems is deploying Meta's LLaMA 3.1 model on its wafer-scale chip, achieving faster processing speeds and lower costs, while aiming to simplify developer integration through an API.
Tenstorrent's Blackhole chips boast 768 RISC-V cores and almost as many FLOPS
Tenstorrent launched Blackhole AI accelerators with 768 RISC-V cores, achieving 24 petaFLOPS of performance. Its TT-Metalium programming model supports common AI frameworks, improving deployment and usability compared to Nvidia's systems.
Cerebras Launches the Fastest AI Inference
Cerebras Systems launched Cerebras Inference, billed as the fastest AI inference solution, claiming 20 times the performance of NVIDIA GPUs at up to 1,800 tokens per second, with significant cost advantages and multiple service tiers.