You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
limiao edited this page Nov 22, 2017
·
5 revisions
本页主要介绍一下Reinforcement Training的使用方法。总体基于文章"Self-critical Sequence Training for Image Captioning" 的方法,使用cider作为reward,进行policy gradient训练。 (https://arxiv.org/pdf/1612.00563.pdf)