Interpretability of deep learning models is fundamental to human-centric processes that require human consumption of machine-generated inferences, especially inferences about inherently subjective phenomena emerging in cyber-human systems. Interpretability is also needed to make deep models truly useful in interdisciplinary work with psychology or the humanities. The project will develop a framework to systematically assess the output of convolutional neural networks (CNNs), combining online crowdsourcing techniques with state-of-the-art CNN methods that generate visual explanations. We propose to cast deep network visual interpretability as a process of collective evaluation by non-experts. Two case studies will be used to demonstrate the framework: (1) recognition of place ambiance from social media images, i.e., learning to recognize an indoor scene as artsy, formal, romantic, or old-fashioned; and (2) recognition of complex Maya hieroglyphic categories, i.e., learning to recognize complex glyphs from over 200 glyph categories in ancient Maya codices. Both case studies are interdisciplinary in nature and serve as relevant testbeds for interpretable deep learning methods.
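
As an illustration of the kind of CNN visual-explanation method the framework could build on, the sketch below computes a Grad-CAM-style saliency map that highlights the image regions driving a CNN prediction; such maps are the sort of output that non-expert crowd workers could collectively evaluate. The choice of Grad-CAM, the ResNet-18 backbone, and the hooked layer are assumptions made for this sketch, not the project's specified method.

# Minimal sketch, assuming a Grad-CAM-style explanation on a torchvision ResNet-18.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

activations, gradients = {}, {}

def fwd_hook(module, inp, out):
    # Store feature maps of the hooked convolutional block.
    activations["feat"] = out.detach()

def bwd_hook(module, grad_in, grad_out):
    # Store gradients of the prediction w.r.t. those feature maps.
    gradients["feat"] = grad_out[0].detach()

# Hook the last convolutional block (an illustrative choice).
model.layer4.register_forward_hook(fwd_hook)
model.layer4.register_full_backward_hook(bwd_hook)

def grad_cam(image, class_idx=None):
    """Return a (H, W) heatmap for one normalized image tensor of shape (1, 3, 224, 224)."""
    logits = model(image)
    if class_idx is None:
        class_idx = logits.argmax(dim=1).item()
    model.zero_grad()
    logits[0, class_idx].backward()
    acts, grads = activations["feat"], gradients["feat"]      # (1, C, h, w)
    weights = grads.mean(dim=(2, 3), keepdim=True)            # channel importance
    cam = F.relu((weights * acts).sum(dim=1, keepdim=True))   # weighted combination
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear", align_corners=False)
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize to [0, 1]
    return cam[0, 0], class_idx

In the proposed setting, heatmaps like these would be rendered over the original social media or codex images and presented to crowd workers, whose aggregated judgments assess whether the highlighted regions plausibly support the predicted ambiance or glyph category.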