Generating accurate natural language descriptions from images remains a challenging task, particularly when input images are captured under poor lighting conditions such as dim indoor environments, ni...
Image captioning
Recurrent Interface Networks
Feature-wise Linear Modulation
Visual feature robustness
Self-critical sequence training
Attention mechanism
SinoXiv