Nvidia noted that cost per token went from 20 cents on the older Hopper platform to 10 cents on Blackwell. Moving to ...
Adding big blocks of SRAM to collections of AI tensor engines, or, better still, a wafer-scale collection of such engines, ...
These speed gains are substantial. At 256K context lengths, Qwen 3.5 decodes 19 times faster than Qwen3-Max and 7.2 times ...
The startup Taalas wants to deliver a hardwired Llama 3.1 8B running at almost 17,000 tokens/s on its HC1 – almost 10 times ...
The shadow technology problem is getting worse. Over the past few years, organizations have scaled microservices, ...
Achieving that 10x cost reduction is challenging, though: it requires a huge up-front expenditure on Blackwell hardware.
OpenAI has spent the past year systematically reducing its dependence on Nvidia. The company signed a massive multi-year deal ...
Check out Codex-Spark, a new AI model that Sam Altman said ‘sparks joy for me.’ ...
OpenAI plans to spend about $600 billion on computing infrastructure by 2030 as it eyes an IPO and rapid AI growth.
Speechify's Voice AI Research Lab launches SIMBA 3.0 voice model to power the next generation of voice AI. SIMBA 3.0 represents a major step forward in production voice AI. It is built voice-first for ...
Asking an engineer to refactor a large, tightly coupled AI pipeline to test an idea is almost guaranteed to fail. Monoliths don’t optimize well either. You’ll spend more time (and money) iterating on ...
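The iteration-cost argument can be made concrete with a minimal sketch (all names here are hypothetical, not from any specific pipeline): when a pipeline is decomposed into swappable stages, testing a new idea means replacing one function rather than refactoring the whole system.

```python
# Hypothetical illustration: a pipeline built from small, swappable stages.
# Trying a new idea = swapping one stage, not rewriting a monolith.

from typing import Callable, List

Stage = Callable[[List[str]], List[str]]

def clean(docs: List[str]) -> List[str]:
    # Stage 1: normalize whitespace and case.
    return [" ".join(d.split()).lower() for d in docs]

def dedupe(docs: List[str]) -> List[str]:
    # Stage 2: drop exact duplicates, preserving order.
    seen = set()
    out = []
    for d in docs:
        if d not in seen:
            seen.add(d)
            out.append(d)
    return out

def run_pipeline(docs: List[str], stages: List[Stage]) -> List[str]:
    # Compose the stages in order; an experiment is a one-line edit
    # to this list, and each stage can be unit-tested in isolation.
    for stage in stages:
        docs = stage(docs)
    return docs

result = run_pipeline(["Hello  World", "hello world", "Foo"], [clean, dedupe])
print(result)  # ['hello world', 'foo']
```

In a monolith, the equivalent experiment would touch code interleaved with every other concern, which is exactly where the iteration time (and money) goes.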