Simultaneous mapping and localization (SLAM) in an real indoor environment is still a challenging task. Traditional SLAM approaches rely heavily on low-level geometric constraints like corners or lines, which may lead to tracking failure in textureless surroundings or cluttered world with dynamic objects. In this paper, a compact semantic SLAM framework is proposed, with utilization of both geometric and object-level semantic constraints jointly, a more consistent mapping result, and more accurate pose estimation can be obtained. Two main contributions are presented int the paper, a) a robust and efficient SLAM data association and optimization framework is proposed, it models both discrete semantic labeling and continuous pose. b) a compact map representation, combining 2D Lidar map with object detection is presented. Experiments on public indoor datasets, TUM-RGBD, ICL-NUIM, and our own collected datasets prove the improving of SLAM robustness and accuracy compared to other popular SLAM systems, meanwhile a map maintenance efficiency can be achieved.