AI News — Sunday, June 14, 2026

WeaveBench: A Long-Horizon, Real-World Benchmark for Computer-Use Agents with Hybrid Interfaces

Researchers introduce WeaveBench, a new benchmark designed to evaluate computer-use agents on complex, long-horizon tasks in real-world environments using hybrid interfaces.

Hugging Face · research

Meta reportedly moves to unwind $2B Manus deal after Beijing’s demand

Meta is reportedly unwinding its $2 billion acquisition of AI startup Manus following demands from Beijing, highlighting increasing geopolitical tensions in tech mergers.

TechCrunch · industry

KPMG pulls report on AI usage due to apparent hallucinations

KPMG has withdrawn a recent report on AI usage after apparent AI hallucinations were found in it, raising concerns about AI reliability in professional services.

TechCrunch · industry

Amazon CEO reportedly raised Anthropic model concerns before government crackdown

Amazon's CEO reportedly expressed concerns about Anthropic's AI models prior to a government crackdown, suggesting internal awareness of potential issues before public intervention.

TechCrunch · industry

PRC-linked influence operations are targeting AI debates in the US

OpenAI reports that influence operations linked to the People's Republic of China are actively targeting AI-related discussions and debates within the United States.

OpenAI Blog · industry

HYDRA-X: Native Unified Multimodal Models with Holistic Visual Tokenizers

A new research paper introduces HYDRA-X, a framework for native unified multimodal models that utilize holistic visual tokenizers to enhance understanding across different data types.

Hugging Face · research

Access OpenAI models and Codex through your Oracle cloud commitment

OpenAI announces that its models, including Codex, are now accessible to customers through their existing Oracle cloud commitments, expanding enterprise integration options.

OpenAI Blog · product

How Preply combines AI and human tutors to personalize learning

Preply is leveraging a combination of AI and human tutors to create personalized learning experiences, demonstrating a hybrid approach to educational technology.

OpenAI Blog · product

N-GRPO: Embedding-Level Neighbor Mixing for Enhanced Policy Optimization

Researchers propose N-GRPO, a method that uses embedding-level neighbor mixing to enhance policy optimization in reinforcement learning.

Hugging Face · research

Demystifying Hidden-State Recurrence: Switchable Latent Reasoning with On-Policy Reinforcement Learning

This paper explores hidden-state recurrence, introducing switchable latent reasoning combined with on-policy reinforcement learning to improve understanding and control of agent behavior.

Hugging Face · research

VideoMDM: Towards 3D Human Motion Generation From 2D Supervision

VideoMDM presents a new approach for generating realistic 3D human motion directly from 2D video supervision, simplifying the process of creating complex animations.

Hugging Face · research

DyCo-RL: Dynamic Cross-Modal Coordination for Visual Reasoning

DyCo-RL introduces a dynamic cross-modal coordination framework for visual reasoning, enabling AI systems to better integrate information from different sensory modalities.

Hugging Face · research

VIA-SD: Verification via Intra-Model Routing for Speculative Decoding

VIA-SD proposes a novel method for speculative decoding that uses intra-model routing for verification, potentially improving the efficiency and accuracy of language model outputs.

Hugging Face · research

The Most Powerful Model on the Market Got Pulled by the Government in 3 Days. Is It Real, or a Hype Bubble?

An article discusses the rapid government intervention to pull a supposedly powerful new AI model from the market, questioning whether it signifies genuine risk or an overhyped bubble.

Dev.to · industry

Why Testing MCP Servers With Real AI Models Matters (2026)

This article argues that MCP (Model Context Protocol) servers should be tested against actual AI models to ensure robust performance and reliability in real-world scenarios.

Dev.to · industry
