TY - GEN
T1 - Layer Removal for Transfer Learning with Deep Convolutional Neural Networks
AU - Zhi, Weiming
AU - Chen, Zhenghao
AU - Yeung, Henry Wing Fung
AU - Lu, Zhicheng
AU - Zandavi, Seid Miad
AU - Chung, Yuk Ying
N1 - Publisher Copyright:
© 2017, Springer International Publishing AG.
PY - 2017
Y1 - 2017
N2 - It is usually difficult to find datasets of sufficient size to train Deep Convolutional Neural Networks (DCNNs) from scratch. In practice, a neural network is often pre-trained on a very large source dataset and then transferred to a target dataset. This approach is a form of transfer learning, and allows very deep networks to achieve outstanding performance even when only a small target dataset is available. It is thought that the bottom layers of the pre-trained network contain general information, which is applicable to different datasets and tasks, while the upper layers of the pre-trained network contain abstract information relevant to a specific dataset and task. While studies have been conducted on the fine-tuning of these layers, the removal of these layers has not yet been considered. This paper explores the effect of removing the upper convolutional layers of a pre-trained network. We empirically investigated whether removing upper layers of a deep pre-trained network can improve performance for transfer learning. We found that removing upper pre-trained layers gives a significant boost in performance, but the ideal number of layers to remove depends on the dataset. We suggest removing pre-trained convolutional layers when applying transfer learning to off-the-shelf pre-trained DCNNs. The ideal number of layers to remove will depend on the dataset and remains a parameter to be tuned.
AB - It is usually difficult to find datasets of sufficient size to train Deep Convolutional Neural Networks (DCNNs) from scratch. In practice, a neural network is often pre-trained on a very large source dataset and then transferred to a target dataset. This approach is a form of transfer learning, and allows very deep networks to achieve outstanding performance even when only a small target dataset is available. It is thought that the bottom layers of the pre-trained network contain general information, which is applicable to different datasets and tasks, while the upper layers of the pre-trained network contain abstract information relevant to a specific dataset and task. While studies have been conducted on the fine-tuning of these layers, the removal of these layers has not yet been considered. This paper explores the effect of removing the upper convolutional layers of a pre-trained network. We empirically investigated whether removing upper layers of a deep pre-trained network can improve performance for transfer learning. We found that removing upper pre-trained layers gives a significant boost in performance, but the ideal number of layers to remove depends on the dataset. We suggest removing pre-trained convolutional layers when applying transfer learning to off-the-shelf pre-trained DCNNs. The ideal number of layers to remove will depend on the dataset and remains a parameter to be tuned.
KW - Convolutional neural networks
KW - Deep learning
KW - Transfer learning
UR - http://www.scopus.com/inward/record.url?scp=85035116486&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85035116486&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-70096-0_48
DO - 10.1007/978-3-319-70096-0_48
M3 - Conference paper
AN - SCOPUS:85035116486
SN - 9783319700953
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 460
EP - 469
BT - Neural Information Processing - 24th International Conference, ICONIP 2017, Proceedings
A2 - Zhao, Dongbin
A2 - El-Alfy, El-Sayed M.
A2 - Liu, Derong
A2 - Xie, Shengli
A2 - Li, Yuanqing
PB - Springer Verlag
T2 - 24th International Conference on Neural Information Processing, ICONIP 2017
Y2 - 14 November 2017 through 18 November 2017
ER -