Abstract: With the increasing importance of computational thinking in the digital age, Scratch programming programs have performed well in developing students' computational thinking skills.
SecCodeBench is a benchmark suite for evaluating the security of AI-generated code, specifically designed for modern Agentic Coding Tool. It is jointly developed by Alibaba Group in collaboration with ...
We present Open3D-VQA, a novel benchmark for evaluating MLLMs' ability to reason about complex spatial relationships from an aerial perspective.The QAs are automatically generated from spatial ...