Humor refers to the affective responses and laughing behavior that follow the successful resolution of an incongruity in meaning between two messages encountered by the reader. According to this definition, humor has three (ABC) components: the affective response in humor appreciation, the behavior of laughing, and the cognition in humor comprehension. Scholars agree that humor comprehension is a process of incongruity resolution, but they debate whether theory of mind is necessary within this process. In other words, researchers have not reached a consensus on whether theory of mind, which lets the reader understand the intentions of the characters in the material, contributes to the behavior that follows humor comprehension. In addition, previous studies have neither clearly classified their experimental materials as text or picture nor compared the effects of the two. Thus, Experiment 1 (N = 33) of the present study directly compared the effects of stimulus modality (2 levels: text alone or picture alone) × theory of mind (2 levels: present in the content or absent) on humor comprehension and appreciation. Results showed that readers needed longer processing time in the picture-alone condition and more cognitive resources to process material involving theory of mind. Experiment 2 (N = 35) investigated the effect of mixed text-picture material on humor comprehension and appreciation. Results showed higher subjective funniness ratings and larger pupil size in the theory-of-mind condition. In addition, readers took less time to process material involving theory of mind. Taken together, readers take longer to process humorous pictures and theory-of-mind content when each is presented alone, but less time when text and picture are presented together.