Machine learning has considerably improved medical image analysis in recent years. Although data-driven approaches are intrinsically adaptive and thus generic, they often do not perform equally well on data from different imaging modalities. In particular, computed tomography (CT) data poses many challenges to medical image segmentation based on convolutional neural networks (CNNs), mostly due to the broad dynamic range of intensities and the varying number of recorded slices in CT volumes. In this paper, we address these issues with a framework that combines domain-specific data preprocessing and augmentation with state-of-the-art CNN architectures. The focus is not only on optimising the score but also on stabilising the prediction performance, since this is a mandatory requirement for use in automated and semi-automated workflows in the clinical environment. The framework is validated with an architecture comparison to show that its effects are independent of the CNN architecture. We compare a modified U-Net with a modified Mixed-Scale Dense Network (MS-D Net) to contrast dilated convolutions for parallel multi-scale processing with the U-Net approach based on traditional scaling operations. Finally, we propose an ensemble model combining the strengths of the individual methods. The framework performs well on a range of tasks such as liver and kidney segmentation, without significant differences in prediction performance across strongly differing volume sizes and varying slice thickness. Thus, our framework is an essential step towards robust segmentation of unknown real-world samples.
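To illustrate the dynamic-range issue mentioned above, the following minimal sketch shows intensity windowing, one common way to map CT intensities to a bounded range before they are passed to a CNN. It is not taken from the framework itself: the function name `window_ct` and the default window (centre 50, width 400, a typical soft-tissue setting in Hounsfield units) are assumptions chosen for illustration only.

```python
def window_ct(hu_values, center=50.0, width=400.0):
    """Clip intensities to a window and rescale them linearly to [0, 1].

    hu_values: iterable of intensities in Hounsfield units (assumed input).
    center, width: hypothetical window parameters, not from the paper.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    lower = center - width / 2.0
    upper = center + width / 2.0
    # Values outside [lower, upper] are clipped to the window limits,
    # then the window is mapped linearly onto [0, 1].
    return [(min(max(v, lower), upper) - lower) / width for v in hu_values]


# Air, window limits, window centre and dense bone in one sample row.
scaled = window_ct([-1000, -150, 50, 250, 3000])
```

Clipping discards contrast outside the chosen window, so the window would in practice have to be selected per target organ.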