Elastic (NYSE: ESTC), the Search AI Company, today announced the availability of jina-embeddings-v5-text, a family of two small, Elasticsearch-native multilingual embedding models at 0.2B and 0.6B ...
AI safety tests found to rely on 'obvious' trigger words; with easy rephrasing, models labeled 'reasonably safe' suddenly fail, with attacks succeeding up to 98% of the time. New corporate research ...
AI API calls are expensive. After our always-on bot burned through tokens, we found seven optimization levers that cut costs ...
Office Productivity: The Apex Agents benchmark, which evaluates productivity in office-like environments, saw Gemini 3.1 Pro ...
Backboard.io announced it has achieved state-of-the-art performance across both leading AI memory benchmarks, a first ...
Google just released its most capable Gemini 3.1 Pro AI model that beats all frontier models on Humanity's Last Exam and ...
Aquant today released The 2026 Field Service KPI Benchmark Report, an industry-wide analysis of anonymized performance data from 161 service organizations. The report spans nearly 30 million service ...
Speechify's Voice AI Research Lab Launches SIMBA 3.0 Voice Model to Power Next Generation of Voice AI. SIMBA 3.0 represents a major step forward in production voice AI. It is built voice-first for ...
The most significant advancement in Gemini 3.1 Pro lies in its performance on rigorous logic benchmarks. Most notably, the model achieved a verified score of 77.1% on ARC-AGI-2.
The second edition of the research firm’s InferenceMAX benchmark, now known as InferenceX, saw Nvidia’s GB300 NVL72 system as ...
OpenAI introduces EVMbench to measure AI crypto security. Benchmark evaluates detection, patching and exploit skills. OpenAI has launched a benchmarking system called EVMbench to evaluate how ...