Multi-task learning (MTL) is an efficient solution to solve multiple tasks simultaneously in order to get better speed and performance than handling each single-task in turn. The most current methods can be categorized as either: (i) hard parameter sharing where a subset of the parameters is shared among tasks while other parameters are task-specific; or (ii) soft parameter sharing where all parameters are task-specific but they are jointly regularized. Both methods suffer from limitations: the shared hidden layers of the former are difficult to optimize due to the competing objectives while the complexity of the latter grows linearly with the increasing number of tasks. To mitigate those drawbacks, this paper proposes an alternative, where we explicitly construct an auxiliary module to mimic the soft parameter sharing for assisting the optimization of the hard parameter sharing layers in the training phase. In particular, the auxiliary module takes the outputs of the shared hidden layers as inputs and is supervised by the auxiliary task loss. During training, the auxiliary module is jointly optimized with the MTL network, serving as a regularization by introducing an inductive bias to the shared layers. In the testing phase, only the original MTL network is kept. Thus our method avoids the limitation of both categories. We evaluate the proposed auxiliary module on pixel-wise prediction tasks, including semantic segmentation, depth estimation, and surface normal prediction with different network structures. The extensive experiments over various settings verify the effectiveness of our methods.