International Journal of Engineering and Manufacturing (IJEM)

ISSN: 2305-3631 (Print), ISSN: 2306-5982 (Online)

Published By: MECS Press

IJEM Vol.12, No.5, Oct. 2022

High Accuracy Swin Transformers for Image-based Wafer Map Defect Detection

PP. 10-21


Thahmidul Islam Nafi, Erfanul Haque, Faisal Farhan, Asif Rahman

Index Terms

Wafer defects; Transformer models; Machine learning; Swin transformer model.


A wafer map depicts the location of each die on the wafer and indicates whether it is a Product, Secondary Silicon, or Reject. Detecting defects in wafer maps is crucial to ensuring the integrity of the chips processed on the wafer, as any defect can cause anomalies and thus decrease the overall yield. Given current advances in anomaly detection using computer vision techniques, vision models based on the Transformer architecture are prime candidates for identifying wafer defects. This paper discusses the performance of four such Transformer-based models in wafer map defect classification: BEiT (BERT Pre-Training of Image Transformers), FNet (Fourier Network), ViT (Vision Transformer) and Swin Transformer (Shifted Window based Transformer). Each model was individually trained, tested and evaluated on the “MixedWM38” dataset obtained from the online platform Kaggle. In the evaluation, the Swin Transformer achieved the highest overall accuracy, at 97.47%, followed closely by the Vision Transformer at 96.77%. The average recall of the Swin Transformer is 97.54%, which indicates a very low rate of false negatives (24,600 ppm) compared with true positives, making it less likely that defective products reach the market.
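
The false-negative figure in the abstract follows from the reported recall: a recall of 97.54% leaves 2.46% of defective cases missed, which is 24,600 parts per million. The sketch below only illustrates that arithmetic. The function names and the per-class counts in the usage note are hypothetical, and treating "average recall" as an unweighted mean over classes is an assumption, not something the abstract states.

```python
def false_negative_ppm(recall: float) -> float:
    """Convert a recall in [0, 1] to a miss rate in parts per million."""
    if not 0.0 <= recall <= 1.0:
        raise ValueError("recall must be between 0 and 1")
    # Miss rate = FN / (TP + FN) = 1 - recall, scaled to ppm.
    return (1.0 - recall) * 1_000_000


def macro_recall(per_class_counts):
    """Unweighted mean of per-class recall.

    per_class_counts is a list of (true_positives, false_negatives)
    pairs, one pair per class. This averaging scheme is an assumption.
    """
    recalls = [tp / (tp + fn) for tp, fn in per_class_counts]
    return sum(recalls) / len(recalls)


swin_recall = 0.9754  # average recall reported in the abstract
print(round(false_negative_ppm(swin_recall)))  # 24600
```

With hypothetical counts such as `macro_recall([(9, 1), (8, 2)])`, the per-class recalls are 0.9 and 0.8, so the average is about 0.85.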

Cite This Paper

Thahmidul Islam Nafi, Erfanul Haque, Faisal Farhan, Asif Rahman, "High Accuracy Swin Transformers for Image-based Wafer Map Defect Detection", International Journal of Engineering and Manufacturing (IJEM), Vol.12, No.5, pp. 10-21, 2022. DOI:10.5815/ijem.2022.05.02

