The Vision, Language and Learning Lab at the University of Virginia pursues fundamental research at the intersection of computer vision, natural language processing and machine learning. We aim to create intelligent systems that can learn from vast amounts of visual and textual information, that can integrate with and enhance human experiences, and that can resolve complex tasks that typically require human intelligence.
Read about some of our work on bias in visual recognition in WIRED and Glamour, our recent work on analyzing movies in TechXplore, and our work on generating images from text on the IBM and NVIDIA blogs.
This demo attempts to make it difficult for a model to predict gender from an image by modifying the image so that this prediction task becomes harder while retaining most of the image's information.
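As a rough illustration of the idea behind this demo, the sketch below applies a small, bounded adversarial perturbation that pushes a classifier's prediction toward its loss-increasing direction while capping per-pixel change. This is a toy FGSM-style example against a hypothetical linear classifier written for this page; the demo's actual model and method are not shown here.

```python
import numpy as np

def obfuscating_perturbation(image, weights, bias, label, epsilon=0.05):
    """Perturb a flattened image so a (hypothetical) linear gender
    classifier's prediction degrades, changing each pixel by at most
    epsilon so most image information is retained.

    Toy FGSM-style sketch; not the demo's actual method.
    """
    x = image.flatten()
    # Logit and sigmoid probability of the linear classifier.
    logit = x @ weights + bias
    prob = 1.0 / (1.0 + np.exp(-logit))
    # Gradient of the binary cross-entropy loss w.r.t. the pixels.
    grad = (prob - label) * weights
    # Step in the loss-increasing direction, bounded by epsilon.
    x_adv = np.clip(x + epsilon * np.sign(grad), 0.0, 1.0)
    return x_adv.reshape(image.shape)
```

The epsilon bound is what preserves image content: each pixel moves by at most epsilon, so the image stays visually close to the original while the classifier's confidence drops.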
Search images by text in the SBU Captions Dataset, which contains 1 million captioned images from Flickr and has been used in numerous projects.