This paper addresses the medical imaging problem of joint detection in the upper limbs, viz. the elbow, shoulder, wrist and finger joints. Localizing joints in X-ray and Computed Tomography (CT) scans is an essential step in assessing bone-related medical conditions such as osteoarthritis and rheumatoid arthritis, and can also be used for automated bone fracture detection. Automated joint localization also detects the corresponding bones and can serve as input to deep learning-based models for the computerized diagnosis of these disorders. This increases prediction accuracy and helps radiologists analyze the scans, which is a complex and exhausting task. This paper provides a detailed comparative study of four Deep Learning (DL) models, namely YOLOv3, YOLOv7, EfficientDet and CenterNet, for detecting multiple bone joints in the upper limbs of the human body. The performance of the models is analyzed mathematically, graphically and visually. The models are trained and tested on a portion of the openly available MURA (musculoskeletal radiographs) dataset. The best mean Average Precision (mAP at 0.5:0.95) values of YOLOv3, YOLOv7, EfficientDet and CenterNet are 35.3, 48.3, 46.5 and 45.9 respectively. Moreover, YOLOv7 was found to predict the bounding boxes most accurately, while YOLOv3 performed the worst in the visual analysis test. Code is available at https://github.com/Sohambasu07/BoneJointsLocalization
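For readers unfamiliar with the "mAP at 0.5:0.95" metric reported above, the following is a minimal sketch of how such a score is commonly computed. It assumes the COCO-style convention (average precision averaged over ten Intersection-over-Union thresholds from 0.50 to 0.95 in steps of 0.05), which the abstract does not spell out. It also assumes a single class and a single image, and uses all-point interpolation of the precision-recall curve. The function names `iou`, `average_precision` and `map_50_95` are illustrative only and are not taken from the authors' repository.

```python
# Sketch only: single class, single image, boxes as (x1, y1, x2, y2).
# Assumed convention: COCO-style IoU thresholds 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [t / 100 for t in range(50, 100, 5)]


def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(preds, gts, thr):
    """AP at one IoU threshold.

    preds: list of (confidence, box); gts: list of ground-truth boxes.
    """
    if not gts:
        return 0.0
    matched, tp_flags = set(), []
    # Greedily match predictions to ground truth in order of confidence.
    for _, box in sorted(preds, key=lambda p: p[0], reverse=True):
        best_iou, best_j = 0.0, -1
        for j, gt in enumerate(gts):
            if j not in matched and iou(box, gt) > best_iou:
                best_iou, best_j = iou(box, gt), j
        is_tp = best_j >= 0 and best_iou >= thr
        if is_tp:
            matched.add(best_j)
        tp_flags.append(is_tp)
    # Running precision and recall after each prediction.
    precisions, recalls, tp = [], [], 0
    for k, flag in enumerate(tp_flags, start=1):
        tp += flag
        precisions.append(tp / k)
        recalls.append(tp / len(gts))
    # Make precision monotonically non-increasing (the "envelope").
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Area under the precision-recall envelope.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def map_50_95(preds, gts):
    """Mean of AP over the ten assumed IoU thresholds."""
    aps = [average_precision(preds, gts, t) for t in IOU_THRESHOLDS]
    return sum(aps) / len(aps)


if __name__ == "__main__":
    ground_truth = [(0, 0, 10, 10)]
    prediction = [(0.9, (0, 0, 10, 8.2))]  # IoU with ground truth is 0.82
    print(map_50_95(prediction, ground_truth))
```

In this toy case the single prediction counts as correct at the seven thresholds up to 0.80 and as a miss at 0.85, 0.90 and 0.95, so the averaged score rewards tight boxes far more than a single-threshold metric such as mAP at 0.5 would. A full evaluation would additionally aggregate over all images and all joint classes.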