Scene-Adaptive Online Multi-View Fusion Video Summarization: Related Work Analysis (1)

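I. Video Summarization Applications

Benefits of video summarization

1. Compresses the original video content in time, greatly reducing browsing time

2. Compresses the original video content in space, greatly reducing storage requirements

3. Retains the main information of the original video, making browsing and retrieval easier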

【Video Summarization with Long Short-Term Memory】

Video has rapidly become one of the most common sources of visual information. The amount of video data is daunting: it takes over 82 years to watch all videos uploaded to YouTube per day! Automatic tools for analyzing and understanding video contents are thus essential. [Note: necessity]

【Discovering Important People and Objects for Egocentric Video Summarization】

Its main value is in turning hours of video into a short summary that can be interpreted by a human viewer in a matter of seconds. Automatic video summarization methods would be useful for a number of practical applications, such as analyzing surveillance data, video browsing, action recognition, or creating a visual diary. [Note: value, applications]

【Unsupervised Video Summarization with Adversarial LSTM Networks】

A wide range of applications require automated summarization of videos, e.g., for saving time of human inspection, or enabling subsequent video analysis. [Note: value]

【Deep Attentive Video Summarization With Distribution Consistency Learning】

It can be widely used in applications of online video management, interactive browsing and searching, and intelligent video surveillance. Due to its great significance, video summarization has been a crucially urgent task, especially in the era of big video data. [Note: applications]

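Applications of video summarization

1. Movie highlight generation: filmmakers produce short clips or trailers to attract audiences and generate revenue

2. Sports video: build retrieval and indexing systems, and use supplementary information to present augmented reality to users, so that viewers feel they are part of the game

3. Social media video

4. Surveillance video: the data is highly redundant, and a summary lets important information be obtained in a short time

【Conclusion】Applications built on video summarization are pervasive

【Four tasks】:

(1) Sports events (multiple cameras, distinctive characteristics)

(2) Movie clips

(3) Security surveillance

(4) Trending events filmed by the public on their phones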

II. Deep Learning in Mobile & Edge Devices (mobile and edge video processing)

Deep-learning-based video applications are ubiquitous on mobile devices:

1. DL in augmented reality (AR) devices

2. Video super-resolution

3. DL for video analytics

【DeepDecision: A Mobile Deep Learning Framework for Edge Video Analytics】

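Proposes a measurement-driven framework that decides where to run (edge or device) and which deep learning model to run, based on application requirements such as accuracy, frame rate, energy, and network data usage.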
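
As a rough illustration of this kind of requirement-driven choice (not DeepDecision's actual algorithm), the sketch below picks the most accurate (placement, model) option that satisfies frame-rate, energy, and network constraints. The `Candidate` fields, the `choose_candidate` helper, the model names, and every number are hypothetical and made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """One (placement, model) option with hypothetical measured metrics."""
    placement: str        # "device" or "edge"
    model: str            # illustrative model name, not a real network
    accuracy: float       # measured accuracy in [0, 1]
    fps: float            # achievable frames per second
    energy_mw: float      # average device power draw in milliwatts
    network_kbps: float   # uplink data usage in kilobits per second


def choose_candidate(
    candidates: Sequence[Candidate],
    min_fps: float,
    max_energy_mw: float,
    max_network_kbps: float,
) -> Optional[Candidate]:
    """Return the most accurate candidate that meets every requirement."""
    feasible = [
        c for c in candidates
        if c.fps >= min_fps
        and c.energy_mw <= max_energy_mw
        and c.network_kbps <= max_network_kbps
    ]
    if not feasible:
        return None  # no option satisfies the application requirements
    return max(feasible, key=lambda c: c.accuracy)


# Made-up measurements, for illustration only.
CANDIDATES = [
    Candidate("device", "small-cnn", 0.60, 25.0, 1800.0, 0.0),
    Candidate("device", "large-cnn", 0.75, 4.0, 2600.0, 0.0),
    Candidate("edge", "large-cnn", 0.85, 15.0, 900.0, 4000.0),
]
```

With these made-up numbers and a 10 FPS requirement, a 5000 kbps network budget selects the edge option, while a 1000 kbps budget falls back to the small on-device model.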

【Real-Time Video Super-Resolution on Smartphones with Deep Learning, Mobile AI 2021 Challenge: Report】

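Develops an end-to-end deep-learning-based video super-resolution solution that achieves real-time performance on mobile GPUs. The proposed solutions are fully compatible with any mobile GPU and can upscale videos to HD resolution at up to 80 FPS while demonstrating high-fidelity results.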

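【FastVA: Deep Learning Video Analytics Through Edge Processing and NPU in Mobile】

Proposes the FastVA framework, which supports deep learning video analytics through edge processing and the mobile Neural Processing Unit (NPU). The main challenge is deciding when to offload computation and when to use the NPU.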

【Utilizing Mobile-based Deep Learning Model for Managing Video in Knowledge Management System】

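Proposes an intelligent framework for knowledge management systems with an embedded deep learning model, reducing the burden of managing video material in such systems.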

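【Conclusion】Feasibility of deep learning on mobile & edge devices; deep reinforcement learning

Questions to consider: retrospective tracing, whether real-time processing on the end device is reasonable, and whether data should be sent to the edge.

For reference:

Most existing techniques address this problem through computation offloading. Offloading the data or video to an edge server and letting the edge server run the deep learning models can save energy and processing time. However, this holds only when network conditions are good and the input data is small. In many cases, when network conditions are poor or when an application such as video analytics must process large amounts of data, offloading may take longer and therefore may not be the best choice.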
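
The trade-off described above can be sketched as a simple latency comparison. This is only a minimal model under stated assumptions: offloading time is taken to be upload time plus edge inference time plus round-trip delay, and the function names, parameters, and example values are all hypothetical rather than taken from any cited system.

```python
def offload_latency_s(
    frame_kbits: float,
    uplink_kbps: float,
    edge_infer_s: float,
    rtt_s: float = 0.0,
) -> float:
    """Estimated time to upload one frame and run the model on the edge server."""
    if uplink_kbps <= 0:
        return float("inf")  # no usable link: offloading never completes
    return frame_kbits / uplink_kbps + edge_infer_s + rtt_s


def should_offload(
    frame_kbits: float,
    uplink_kbps: float,
    edge_infer_s: float,
    local_infer_s: float,
    rtt_s: float = 0.0,
) -> bool:
    """Offload only when the estimated edge path beats local inference."""
    edge_path = offload_latency_s(frame_kbits, uplink_kbps, edge_infer_s, rtt_s)
    return edge_path < local_infer_s
```

Under this model, an 800 kbit frame over an 8000 kbps uplink costs about 0.1 s to upload, so offloading wins against a 0.3 s local inference. Over a 1000 kbps uplink the upload alone takes 0.8 s and local processing is the better choice.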

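III. Video Summarization Methods

Long-video analysis: why use RL

Deep reinforcement learning is an effective way to obtain an optimal decision policy and maximize long-term reward. [Qiu X, Liu L, Chen W, et al. Online deep reinforcement learning for computation offloading in blockchain-empowered mobile edge computing[J]. IEEE Transactions on Vehicular Technology, 2019, 68(8): 8050-8062.]

RL can optimize its policy based on the actual performance of past choices, and it can discover policies that outperform algorithms relying on fixed heuristics or inaccurate models. [Mao H, Chen S, Dimmery D, et al. Real-world video adaptation with reinforcement learning[J]. arXiv preprint arXiv:2008.12858, 2020.]

Frame selections are interdependent, because selecting one frame affects the selection of other frames. RL's exploration-exploitation strategy can better guide the summarization network as it explores different combinations of frames, while constraining the interdependence among frames. [Zhou K, Xiang T, Cavallaro A. Video summarisation by classification with deep reinforcement learning[J]. arXiv preprint arXiv:1807.03089, 2018.]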
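
To make the interdependence point concrete, here is a toy sketch of policy-gradient (REINFORCE) frame selection. It is not the method of Zhou et al.: the six-frame video, the scene labels, the importance scores, the per-frame cost, and the reward itself are all made-up assumptions. The reward counts only the best selected frame in each scene, so the value of a frame depends on which other frames are chosen.

```python
import math
import random

# Hypothetical toy video: six frames in two scenes with made-up importance scores.
SCENE = [0, 0, 0, 1, 1, 1]
IMPORTANCE = [0.9, 0.5, 0.4, 0.3, 0.8, 0.5]
FRAME_COST = 0.25  # assumed penalty per selected frame, keeps summaries short


def summary_reward(selected):
    """Best selected frame per scene, minus a cost for every selected frame."""
    best = {}
    for pick, scene, score in zip(selected, SCENE, IMPORTANCE):
        if pick:
            best[scene] = max(best.get(scene, 0.0), score)
    count = sum(1 for pick in selected if pick)
    return sum(best.values()) - FRAME_COST * count


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_policy(steps=1000, batch=32, lr=0.5, seed=0):
    """Learn per-frame Bernoulli selection probabilities with REINFORCE."""
    rng = random.Random(seed)
    n = len(SCENE)
    logits = [0.0] * n  # every frame starts with selection probability 0.5
    for _ in range(steps):
        probs = [sigmoid(v) for v in logits]
        # Exploration: sample a batch of candidate summaries from the policy.
        samples = [[1 if rng.random() < p else 0 for p in probs]
                   for _ in range(batch)]
        rewards = [summary_reward(s) for s in samples]
        baseline = sum(rewards) / batch  # batch-mean baseline to reduce variance
        for i in range(n):
            # For a Bernoulli policy with a logit parameter,
            # d log pi(a) / d logit_i = a_i - p_i.
            grad = sum((r - baseline) * (s[i] - probs[i])
                       for s, r in zip(samples, rewards)) / batch
            logits[i] += lr * grad  # gradient ascent on expected reward
    return [sigmoid(v) for v in logits]
```

In this toy setting the highest-scoring frame of each scene is always worth its cost, while the other frames add nothing once that frame is selected, so the learned probabilities tend toward keeping one frame per scene. A real summarization network would replace the free logits with a model conditioned on frame features.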

Published 2023-02-22 15:49
Category: Edge-Device Collaborative Deep Computing