Since multiple MRI contrasts of the same anatomy contain redundant information, one contrast can be used as a prior for guiding the reconstruction of an undersampled subsequent contrast. To this end, several learning-based guided reconstruction methods have been proposed. However, two key challenges remain - (a) the requirement of large paired training datasets and (b) the lack of intuitive understanding of the model's internal representation and utilization of the shared information. We propose a modular two-stage approach for guided reconstruction, addressing these challenges. A content/style model of two-contrast image data is learned in a largely unpaired manner and is subsequently applied as a plug-and-play operator in iterative reconstruction. The disentanglement of content and style allows explicit representation of contrast-independent and contrast-specific factors. Based on this, incorporating prior information into the reconstruction reduces to simply replacing the aliased reconstruction content with clean content derived from the reference scan. We name this novel approach PnP-MUNIT. Various aspects like interpretability and convergence are explored via simulations. Furthermore, its practicality is demonstrated on the NYU fastMRI DICOM dataset and two in-house raw datasets, obtaining up to 32.6% more acceleration over learning-based non-guided reconstruction for a given SSIM. In a radiological task, PnP-MUNIT allowed 33.3% more acceleration over clinical reconstruction at diagnostic quality.