
Research Projects

Building a deep-learning micro-morphology classification dataset and developing recognition models to determine the production methods of mineral pigments

This project developed and validated an interpretable, high-accuracy workflow for the automated classification of pigment manufacturing processes (natural vs. industrial) using micrographs acquired by scanning electron microscopy (SEM). We assembled a balanced dataset of 2,654 SEM images across eight pigment classes (red: RNC, RNK, RIJ; green: GNUK, GNSK, GIG; blue: BNK, BIC), complemented by optical microscopy, X-ray diffraction (XRD), SEM-EDS, and particle-size analysis to establish physicochemical ground truth. While natural and industrial pigments often share the same crystalline phase and elemental composition (e.g., cinnabar/vermilion, atacamite/verdigris), we hypothesized that micro-morphological descriptors (cleavage, edge sharpness, agglomeration, particle roundness, size distribution) provide discriminative features not captured by conventional compositional assays.
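The balanced, per-class 80/20 partitioning used downstream can be sketched in plain Python. The file names and the 40-images-per-class count below are illustrative placeholders, not the real data (the actual dataset holds 2,654 SEM images):

```python
import random

# Eight pigment classes from the study: red (RNC, RNK, RIJ),
# green (GNUK, GNSK, GIG), and blue (BNK, BIC).
CLASSES = ["RNC", "RNK", "RIJ", "GNUK", "GNSK", "GIG", "BNK", "BIC"]

def stratified_split(samples, test_frac=0.2, seed=0):
    """Split (path, label) pairs per class, so the 80/20 ratio holds in every class."""
    rng = random.Random(seed)
    by_class = {}
    for path, label in samples:
        by_class.setdefault(label, []).append(path)
    train, test = [], []
    for label, paths in by_class.items():
        rng.shuffle(paths)
        n_test = round(len(paths) * test_frac)
        test += [(p, label) for p in paths[:n_test]]
        train += [(p, label) for p in paths[n_test:]]
    return train, test

# Hypothetical placeholder file names, 40 per class for illustration only.
samples = [(f"{c}/{i:04d}.tif", c) for c in CLASSES for i in range(40)]
train, test = stratified_split(samples)
print(len(train), len(test))  # 256 64
```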

We implemented four canonical convolutional neural networks (CNNs)—AlexNet, GoogLeNet, ResNet-50, VGG-16—and a Vision Transformer (ViT) within a unified pipeline (ImageNet transfer learning; standardized preprocessing; TorchVision geometric augmentation; 80/20 train–test split). Performance was evaluated using accuracy, precision, recall, F1-score, confusion matrices, ROC curves, and precision–recall curves. All CNNs surpassed 96% accuracy on the multiclass task; VGG-16 reached ~99% with minimal variance, while ViT achieved perfect test-set classification on this dataset. Misclassification, when present, was concentrated in the visually similar green classes (GNUK vs. GNSK) and diminished with deeper architectures.

To address interpretability, we applied Grad-CAM and guided backpropagation / guided Grad-CAM. The heat maps demonstrated that models progressively focused on particle-shape cues (cleavage planes, edge acuteness, agglomerate boundaries) rather than non-informative image regions, thereby aligning model attention with expert criteria. We also document a defocusing artifact in ViT heat maps due to patch aggregation, highlighting a current limitation in transformer-based explainability despite superior classification metrics.
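The Grad-CAM computation itself is compact: gradients of the target-class score are averaged per channel to weight the last convolutional feature maps, followed by a ReLU. Below is a self-contained sketch over a toy CNN; the toy architecture and input size are assumptions standing in for the trained models in the study:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyCNN(nn.Module):
    """Toy stand-in for the study's CNNs; returns logits and the last feature maps."""
    def __init__(self, n_classes=8):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.head = nn.Linear(32, n_classes)

    def forward(self, x):
        fmap = self.features(x)                    # (B, 32, H/2, W/2)
        logits = self.head(fmap.mean(dim=(2, 3)))  # global average pooling
        return logits, fmap

def grad_cam(model, x, target_class):
    logits, fmap = model(x)
    fmap.retain_grad()                    # keep gradients on the feature maps
    logits[0, target_class].backward()
    weights = fmap.grad.mean(dim=(2, 3), keepdim=True)  # per-channel importance
    cam = F.relu((weights * fmap).sum(dim=1))           # weighted sum + ReLU
    cam = cam / (cam.max() + 1e-8)                      # normalize to [0, 1]
    return cam.detach()

model = TinyCNN()
cam = grad_cam(model, torch.randn(1, 3, 64, 64), target_class=0)
print(cam.shape)  # torch.Size([1, 32, 32])
```

In practice the coarse map is upsampled to the input resolution and overlaid on the SEM micrograph; guided Grad-CAM multiplies it element-wise with the guided-backpropagation saliency map.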

Scientifically, this work (1) applied deep learning to classify pigment manufacturing processes from SEM micrographs; (2) evaluated four CNN models (AlexNet, GoogLeNet, ResNet-50, and VGG-16), which achieved high accuracy and successfully distinguished natural from industrial pigments that could not be differentiated by XRD or SEM-EDS; and (3) demonstrated that the Vision Transformer (ViT) achieved the best performance, reaching perfect test-set accuracy. Interpretability techniques such as Grad-CAM and guided backpropagation visualized how the models focused on particle shape and microstructure. The results demonstrate that AI-based image analysis can provide objective, efficient, and explainable tools for pigment identification and cultural heritage conservation.