The data-intensive nature of supervised classification drives the interest of the researchers towards unsupervised approaches, especially for problems such as medical image segmentation, where labeled data is scarce. Building on the recent advancements of Vision transformers (ViT) in computer vision, we propose an unsupervised segmentation framework using a pre-trained Dino-ViT. In the proposed method, we leverage the inherent graph structure within the image to realize a significant performance gain for segmentation in medical images. For this, we introduce a modularity-based loss function coupled with a Graph Attention Network (GAT) to effectively capture the inherent graph topology within the image. Our method achieves state-of-the-art performance, even significantly surpassing or matching that of existing (semi)supervised technique such as MedSAM which is a Segment Anything Model in medical images. We demonstrate this using two challenging medical image datasets ISIC-2018 and CVC-ColonDB. This work underscores the potential of unsupervised approaches in advancing medical image analysis in scenarios where labeled data is scarce. The github repository of the code is available on [https://github.com/mudit-adityaja/UnSegMedGAT].