Fujitsu develops video analytics AI agent to support safe, secure, and efficient frontline workplaces
The AI agent is based on a multimodal large language model (LLM). It trains itself to recognize 3D images of the workplace using information from written documentation (e.g., safety rules). Context memory technology uses that written information to selectively retain only the relevant data, enabling the analysis of long-duration video content with what Fujitsu describes as world-leading accuracy.
The AI agent will be evaluated by FieldWorkArena, an evaluation environment newly developed by Fujitsu, under the supervision of Carnegie Mellon University. FieldWorkArena will be made available to the research community in December 2024, with tasks being added to GitHub and the Fujitsu Research Portal.
This technology augments the AI agent's video data comprehension capabilities with information from written documentation, helping the LLM understand what it cannot infer from video content alone.
This technology lets the user provide a prompt naming a specific type of behavior to focus on in a video, e.g., "safe behavior in humans."
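The idea of keeping only the video data that written documentation and a user prompt make relevant can be illustrated with a toy sketch. This is not Fujitsu's implementation, which the announcement does not describe in detail. It assumes each frame already has a text caption, scores captions by simple word overlap with the documents and the prompt, and keeps the best-scoring frames within a fixed budget. The names `Frame`, `relevance`, `retain_relevant`, and `budget` are hypothetical.

```python
import re
from dataclasses import dataclass

# Small illustrative stop-word list (an assumption, not from the source).
STOP_WORDS = {"a", "an", "the", "of", "in", "at", "all", "any", "to"}


@dataclass
class Frame:
    timestamp_s: float
    caption: str  # hypothetical text description of what the frame shows


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its distinct words, minus stop words."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOP_WORDS


def relevance(caption: str, context_terms: set[str]) -> float:
    """Fraction of caption words that also appear in the written context."""
    words = tokenize(caption)
    if not words:
        return 0.0
    return len(words & context_terms) / len(words)


def retain_relevant(frames: list[Frame], documents: list[str],
                    prompt: str, budget: int) -> list[Frame]:
    """Keep at most `budget` frames with non-zero relevance, in time order."""
    context_terms = tokenize(" ".join(documents) + " " + prompt)
    scored = [(relevance(f.caption, context_terms), f) for f in frames]
    relevant = [sf for sf in scored if sf[0] > 0]
    best = sorted(relevant, key=lambda sf: sf[0], reverse=True)[:budget]
    return sorted((f for _, f in best), key=lambda f: f.timestamp_s)


frames = [
    Frame(0, "worker walks down empty corridor"),
    Frame(5, "worker lifts box without helmet"),
    Frame(9, "forklift passes worker near shelf"),
    Frame(12, "empty warehouse aisle"),
]
documents = [
    "Workers must wear a helmet at all times.",
    "Keep clear of any moving forklift.",
]
kept = retain_relevant(frames, documents, "safe behavior in humans", budget=2)
print([f.timestamp_s for f in kept])
```

In this sketch only the frames whose captions mention terms from the safety rules (the helmet and the forklift) are retained; a real system would presumably use learned multimodal representations rather than word overlap.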
Under the supervision of Carnegie Mellon University's Associate Professor Graham Neubig and Assistant Professor Yonatan Bisk, Fujitsu has developed FieldWorkArena, an evaluation environment for its video analytics AI agent service. FieldWorkArena includes a bank of images and video content from actual frontline workplaces, including plants and warehouses; documents such as rules and instruction manuals; simulations of business systems; and sets of tasks for the AI agent to solve.
View original content: https://www.prnewswire.co.uk/news-releases/fujitsu-develops-video-analytics-ai-agent-to-support-safe-secure-and-efficient-frontline-workplaces-302330070.html