amjad
heroali
AI & ML interests
None yet
Recent Activity
updated a collection 2 days ago
benchmarks
updated a collection 2 days ago
benchmarks
updated a collection 2 days ago
Dataset
Organizations
None yet
benchmarks
- ReVSI: Rebuilding Visual Spatial Intelligence Evaluation for Accurate Assessment of VLM 3D Reasoning
  Paper • 2604.24300 • Published • 67
- Rewarding the Scientific Process: Process-Level Reward Modeling for Agentic Data Analysis
  Paper • 2604.24198 • Published • 22
- KernelBench-X: A Comprehensive Benchmark for Evaluating LLM-Generated GPU Kernels
  Paper • 2605.04956 • Published • 7
- Edit-Compass & EditReward-Compass: A Unified Benchmark for Image Editing and Reward Modeling
  Paper • 2605.13062 • Published • 33
Dataset
- MuSS: A Large-Scale Dataset and Cinematic Narrative Benchmark for Multi-Shot Subject-to-Video Generation
  Paper • 2604.23789 • Published • 6
- OmniGUI: Benchmarking GUI Agents in Omni-Modal Smartphone Environments
  Paper • 2605.18758 • Published • 16
- GGT-100K: Generative Ground Truth for Generalizable Real-World Image Restoration
  Paper • 2605.31039 • Published • 38
models 0
None public yet
datasets 0
None public yet