Urban region profiling is important for smart cities and sustainable development. However, extracting fine-grained semantics and generating robust urban region embeddings from noisy and incomplete urban data is challenging. In response, we present EUPAC (Enhanced Urban Region Profiling with Adversarial Contrastive Learning), a novel framework that improves the robustness of urban region embeddings by jointly optimizing an attentive supervised module and an adversarial contrastive module. Specifically, heterogeneous region graphs encoding human mobility data, point-of-interest (POI) information, and geographic neighborhood details for each region are fed into our model, which uses graph convolutional networks and multi-head attention to generate region embeddings that preserve intra-region and inter-region dependencies. Meanwhile, we introduce a spatially learnable augmentation that generates positive samples that are semantically similar and spatially close to the anchor, in preparation for the subsequent contrastive learning. Furthermore, we propose an adversarial training method that constructs an effective pretext task by generating strong positive pairs and mining hard negative pairs for the region embeddings. Finally, we jointly optimize attentive supervised and adversarial contrastive learning to encourage the model to capture the high-level semantics of region embeddings while ignoring noisy and irrelevant details. Extensive experiments on real-world datasets demonstrate the superiority of our model over state-of-the-art methods.
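
As a rough illustration of the contrastive component described above, the following sketch shows how hard negatives can be mined for an anchor region embedding and scored with a contrastive loss, and how that loss might be combined with a supervised one. It is a minimal sketch under stated assumptions, not the EUPAC implementation: the InfoNCE-style loss, the cosine similarity, the temperature, the top-k mining rule, and the weighting term `weight` are all assumptions that this abstract does not specify, and every function name is hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def hardest_negatives(anchor, candidates, k):
    """Keep the k candidates most similar to the anchor (assumed mining rule)."""
    return sorted(candidates, key=lambda c: cosine(anchor, c), reverse=True)[:k]


def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss for one anchor (assumed form of the contrastive loss)."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


def joint_loss(supervised_loss, contrastive_loss, weight=0.5):
    """Weighted sum of both objectives; `weight` is a hypothetical hyperparameter."""
    return supervised_loss + weight * contrastive_loss


# Toy 2-D region embeddings (illustrative values only, not real data).
anchor = [1.0, 0.0]
positive = [0.9, 0.1]  # stands in for an augmented view of the anchor
candidates = [[0.8, 0.6], [-1.0, 0.0], [0.0, 1.0]]

hard = hardest_negatives(anchor, candidates, k=2)
loss = joint_loss(supervised_loss=1.0,
                  contrastive_loss=info_nce(anchor, positive, hard))
```

In this toy setting, negatives that lie closer to the anchor yield a larger contrastive loss than dissimilar ones, which is the usual motivation for mining hard negatives: they give the encoder a more informative training signal.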