AI News — Friday, May 15, 2026

OpenAI's Codex AI Assistant Expands to Mobile Devices

OpenAI announces that its advanced coding assistant, Codex, will soon be available on mobile phones, enabling developers to work from anywhere.

TechCrunch | product

MinT: Managed Infrastructure for Training and Serving Millions of LLMs

A new paper introduces MinT, a managed infrastructure designed to efficiently train and serve millions of large language models, addressing scalability challenges in AI deployment.

Hugging Face | research

Elon Musk vs. Sam Altman: Jury to Decide Key Aspects of AI Lawsuit

A TechCrunch report details the key points the jury will deliberate in the high-profile legal battle between Elon Musk and Sam Altman.

TechCrunch | industry

Listen Labs Secures $69M to Scale AI Customer Interview Platform After Viral Stunt

Listen Labs raises $69 million in funding, following a viral hiring campaign, to expand its AI-powered platform for conducting and analyzing customer interviews.

VentureBeat | industry

SpaceXAI Experiences Significant Staff Departures Post-Merger

Reports indicate that Elon Musk's SpaceXAI has seen a substantial exodus of staff since its recent merger, raising questions about the company's stability and future direction.

TechCrunch | industry

MulTaBench: Benchmarking Multimodal Tabular Learning with Text and Image

Researchers introduce MulTaBench, a new benchmark for evaluating multimodal tabular learning models that integrate both text and image data.

Hugging Face | research

AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation

A new video diffusion model called AnyFlow is presented, capable of generating video at any step count using an on-policy flow map distillation technique, improving efficiency and quality.

Hugging Face | research

OpenAI Enhances ChatGPT's Context Recognition for Sensitive Conversations

OpenAI announces improvements to ChatGPT, enabling it to better understand and maintain context in sensitive discussions, aiming for more nuanced and appropriate responses.

OpenAI Blog | product

Training Long-Context Vision-Language Models Effectively with Generalization Beyond 128K Context

A new study demonstrates effective methods for training vision-language models to handle extremely long contexts, achieving generalization beyond 128K tokens.

Hugging Face | research

Old PC vs New AI: Can a 2015 Desktop Actually Run Gemma 4? (2B vs 4B Benchmark)

A practical benchmark tests whether a 2015 desktop PC can run Google's Gemma 4 (2B and 4B parameter models), offering insights into AI accessibility on older hardware.

Dev.to | product

EVA-Bench: A New End-to-end Framework for Evaluating Voice Agents

Researchers introduce EVA-Bench, an end-to-end framework for evaluating the performance of voice agents across various scenarios.

Hugging Face | research

Predicting Decisions of AI Agents from Limited Interaction through Text-Tabular Modeling

A new paper explores a method for predicting the decisions of AI agents from limited interaction, using a text-tabular modeling approach.

Hugging Face | research

Qwen-Image-VAE-2.0 Technical Report

The technical report for Qwen-Image-VAE-2.0 is released, detailing advancements in image generation and compression through an improved Variational Autoencoder.

Hugging Face | research

AI Isn't Replacing Developers - It's Turning Us Into Underpaid Bot Babysitters

An opinion piece argues that AI is not replacing developers but rather shifting their roles to managing and overseeing AI agents, potentially leading to new challenges in compensation and job satisfaction.

Dev.to | industry

Do Androids Dream of Breaking the Game? Systematically Auditing AI Agent Benchmarks with BenchJack

Researchers introduce BenchJack, a new tool for systematically auditing AI agent benchmarks to uncover potential vulnerabilities and ensure robust evaluation of agent capabilities.

arXiv | research
