Recent progress in material data mining has been driven by high-capacity models trained on large datasets. However, collecting experimental data (real data) has been extremely costly since the amount of human effort and expertise required. Here, we develop a novel transfer learning strategy to address small or insufficient data problem. This strategy realizes the fusion of real and simulated data, and the augmentation of training data in data mining procedure. For a specific task of image segmentation, this strategy can generate synthetic images by fusing physical mechanism of simulated images and "image style" of real images. The result shows that the model trained with the acquired synthetic images and 35% of the real images outperforms the model trained on all real images. As the time required to generate synthetic data is almost negligible, this strategy is able to reduce the time cost of real data preparation by roughly 65%.
updated: Mon Oct 28 2019 04:12:43 GMT+0000 (UTC)
published: Sun May 12 2019 12:37:57 GMT+0000 (UTC)