Photorealistic stylization aims to transfer the style of a reference photo onto a content photo so that the stylized image still looks like a real photograph taken by a camera. State-of-the-art methods stylize the image locally within each matched semantic region and are therefore prone to global color inconsistency across semantic objects and parts, which makes the stylized image less photorealistic. To address this issue, we propose a non-local representation scheme constrained by a mutual affine-transfer network (NL-MAT). Through a dictionary-based decomposition, NL-MAT decouples the matched non-local representations from the color information of the image pair, so that the context correspondence between the two images is incorporated naturally, which greatly facilitates local style transfer in a globally consistent fashion. To the best of our knowledge, this is the first attempt to address photorealistic stylization with a non-local representation scheme, and it requires no additional models or steps for semantic matching during stylization. Experimental results demonstrate that the proposed method generates photorealistic results with local style transfer while preserving both the spatial structure and global color consistency of the content image.
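
As an illustration only, the idea of decoupling per-pixel representations from color information can be sketched with a toy dictionary model. The sketch below is not NL-MAT. It assumes a k-means color dictionary in place of the learned mutual affine-transfer network, it matches content and style dictionary entries crudely by sorting them on brightness, and all names (`fit_dictionary`, `soft_codes`, `stylize`) and parameters (`k`, `temperature`) are hypothetical.

```python
import numpy as np

def soft_codes(pixels, dictionary, temperature=0.05):
    """Soft-assign each pixel to the dictionary colors (each row sums to 1)."""
    # Squared distance from every pixel to every dictionary color: shape (N, K).
    d2 = ((pixels[:, None, :] - dictionary[None, :, :]) ** 2).sum(axis=2)
    logits = -d2 / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

def fit_dictionary(pixels, k=4, iters=20, seed=0):
    """Toy k-means color dictionary (an assumed stand-in for a learned one)."""
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)].copy()
    for _ in range(iters):
        d2 = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = pixels[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    # Assumption: order atoms by brightness so the two dictionaries line up.
    return centers[np.argsort(centers.sum(axis=1))]

def stylize(content, style, k=4):
    """content, style: float arrays in [0, 1] with shape (H, W, 3)."""
    c = content.reshape(-1, 3)
    s = style.reshape(-1, 3)
    d_content = fit_dictionary(c, k)
    d_style = fit_dictionary(s, k)
    codes = soft_codes(c, d_content)   # representation of the content image
    out = codes @ d_style              # recombine it with the style colors
    return out.reshape(content.shape).clip(0.0, 1.0)
```

In this toy setting the codes carry where each pixel sits relative to the content colors, while the dictionary carries the colors themselves, so swapping in the style dictionary recolors the image without moving any pixel. Calling `stylize(content, style)` on two float RGB arrays returns an array with the shape of `content`.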