Massimiliano Mancini
Published: 2022-11-30
Total Pages: 285
Get eBook
Despite being the leading paradigm in computer vision, deep neural networks are inherently limited by the visual and semantic information contained in their training set. In this thesis, we aim to design deep models operating with previously unseen visual domains and semantic concepts. We first describe different solutions for generalizing to new visual domains, applying variants of normalization layers to multiple challenging settings e.g. where new domain data is not available but arrives online or is described by metadata. In the second part, we incorporate new semantic concepts into pretrained deep models. We propose specific solutions for different problems such as multi-task/incremental learning and open-world recognition. Finally, we merge the two challenges: given images of multiple domains and categories, can we recognize unseen concepts in unseen domains? We propose an approach that is the first, promising step, towards solving this problem. Winner of the Competition “Prize for PhD Thesis 2020” arranged by Sapienza University Press.