The convolutional neural network (CNN) has become a basic model for solving many computer vision problems. In recent years, a new class of CNNs, recurrent convolution neural network (RCNN), inspired by abundant recurrent connections in the visual systems of animals, was proposed. The critical element of RCNN is the recurrent convolutional layer (RCL), which incorporates recurrent connections between neurons in the standard convolutional layer. With increasing number of recurrent computations, the receptive fields (RFs) of neurons in RCL expand unboundedly, which is inconsistent with biological facts. We propose to modulate the RFs of neurons by introducing gates to the recurrent connections. The gates control the amount of context information inputting to the neurons and the neurons' RFs therefore become adaptive. The resulting layer is called gated recurrent convolution layer (GRCL). Multiple GRCLs constitute a deep model called gated RCNN (GRCNN). The GRCNN was evaluated on several computer vision tasks including object recognition, scene text recognition and object detection, and obtained much better results than the RCNN. In addition, when combined with other adaptive RF techniques, the GRCNN demonstrated competitive performance to the state-of-the-art models on benchmark datasets for these tasks. The codes are released at https://github.com/Jianf-Wang/GRCNNhttps://github.com/Jianf-Wang/GRCNN.
updated: Sat Jun 05 2021 10:14:59 GMT+0000 (UTC)
published: Sat Jun 05 2021 10:14:59 GMT+0000 (UTC)