Automotive Multiple-Input Multiple-Output (MIMO) radars have gained significant traction in Advanced Driver Assistance Systems (ADAS) and Autonomous Vehicles (AV) due to their cost-effectiveness, resilience to challenging operating conditions, and extended detection range. To fully leverage the advantages of MIMO radars, it is crucial to develop an Angle of Arrival (AOA) algorithm that delivers high performance with reasonable computational workload. This work introduces AAETR (Angle of Arrival Estimation with TRansformer) for high performance gridless AOA estimation. Comprehensive evaluations across various signal-to-noise ratios (SNRs) and multi-target scenarios demonstrate AAETR's superior performance compared to super resolution AOA algorithms such as Iterative Adaptive Approach (IAA). The proposed architecture features efficient, scalable, sparse and gridless angle-finding capability, overcoming the issues of high computational cost and straddling loss in SNR associated with grid-based IAA. AAETR requires fewer tunable hyper-parameters and is end-to-end trainable in a deep learning radar perception pipeline. When trained on large-scale simulated datasets then evaluated on real dataset, AAETR exhibits remarkable zero-shot sim-to-real transferability and emergent sidelobe suppression capability. This highlights the effectiveness of the proposed approach and its potential as a drop-in module in practical systems.