All study procedures were in accordance with the 1964 Declaration of Helsinki and subsequent amendments. The study was approved by the Ethics Committee of the Graduate School and Faculty of Medicine of Kyoto University (Approval No. R2272). As this study was performed retrospectively, the requirement for informed consent was waived.
From January 2007 to December 2015, 802 patients with pathologically confirmed lung adenocarcinoma were identified from our surgical database. Of these, 463 patients were excluded because of induction chemotherapy (n = 23), multiple lung cancer nodules (n = 69), lack of a plain thin-section CT scan (n = 367), a tumor diameter greater than 5 cm (n = 2), or the presence of lymph node metastases (n = 2). The remaining 339 patients were included in the analyses (Fig. 1).
Two experienced pathologists examined hematoxylin and eosin-stained tissue sections with a Nikon Eclipse 80i optical microscope (Nikon Corporation, Tokyo, Japan) according to the WHO definition of STAS. The edge of the main tumor was defined as the smooth surface easily recognizable at low magnification. STAS was defined as tumor cell clusters floating in air spaces at least one alveolus away from the edge of the main tumor.
CT scans were performed using a 64-detector row scanner (Aquilion 64, Canon Medical Systems, Otawara, Japan) or a 320-detector row scanner (Aquilion ONE, Canon Medical Systems). Images were reconstructed with a soft-tissue kernel (FC11, 13) and 1 mm slice thickness for radiomic analysis, and with a lung kernel (FC51) and 0.5 mm slice thickness for evaluation of the C/T ratio, using a filtered back-projection algorithm. Table E1 lists detailed scan parameters.
Radiological assessment of C/T ratio and nodule type
The largest diameter of the whole tumor, the largest diameter of the consolidation (solid part), and the nodule type (solid, part-solid, or ground-glass nodule) were determined by an experienced radiation oncologist (KT, with ten years of experience in radiotherapy for lung cancer and in the interpretation of images related to lung cancer). A board-certified radiologist (RS, with 14 years of experience interpreting lung images) independently confirmed the results, and disagreements were resolved by discussion until consensus was reached. All cases were anonymized, and both readers were blinded to the presence or absence of STAS and to the clinical findings. The largest tumor diameter was measured on the axial, coronal, or sagittal plane of the CT scan in the lung window (window level, -600 HU; window width, 1500 HU). The largest consolidation diameter was measured on the same plane as the largest tumor diameter.
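As an illustration of how the two measured diameters combine, the following minimal sketch computes a C/T ratio, assuming it is defined as the largest consolidation diameter divided by the largest tumor diameter on the same plane. The function names `ct_ratio` and `nodule_type` are hypothetical, and the ratio-to-type mapping is shown only for orientation; in the study the nodule type was assigned by the readers.

```python
def ct_ratio(consolidation_mm: float, tumor_mm: float) -> float:
    """Consolidation-to-tumor ratio from two diameters measured on the same plane.

    Assumption: C/T ratio = consolidation diameter / whole-tumor diameter.
    """
    if tumor_mm <= 0:
        raise ValueError("tumor diameter must be positive")
    if not 0 <= consolidation_mm <= tumor_mm:
        raise ValueError("consolidation diameter must lie in [0, tumor diameter]")
    return consolidation_mm / tumor_mm


def nodule_type(ratio: float) -> str:
    """Illustrative mapping only; readers assigned the type visually in the study."""
    if ratio == 0:
        return "ground-glass"
    if ratio == 1:
        return "solid"
    return "part-solid"


# Hypothetical measurements in millimetres.
ratio = ct_ratio(consolidation_mm=12.0, tumor_mm=24.0)
print(ratio, nodule_type(ratio))  # 0.5 part-solid
```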
Tumor segmentation and feature extraction
The peritumoral ROI was defined as a ring-shaped region extending 5 mm inward and 5 mm outward from the tumor surface, excluding surrounding soft tissue such as the chest wall or the mediastinum. A radiation oncologist (KT) segmented the peritumoral ROIs using 3D Slicer (version 4.10.2), a free, open-source, cross-platform software package for medical, biomedical, and related imaging research (https://www.slicer.org/). Details of the segmentation procedure are given in Figure E1. To assess the reproducibility of the radiomic features, a radiologist (RS) also performed the segmentation in randomly selected patients. Dice coefficients were calculated to compare the lesion segmentations and to assess interobserver variability.
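The ring definition and the Dice comparison can be sketched on a toy example. This is a simplified 2D illustration with 1 mm pixels, not the 3D Slicer procedure used in the study: the tumor is a synthetic disk, the surface is taken as the mask pixels with an outside 4-neighbour, and the exclusion of chest wall and mediastinum is not modelled. All function names are hypothetical.

```python
import math


def disk(cx, cy, r):
    """Toy 2D 'tumor': integer pixels within radius r of (cx, cy), 1 mm spacing."""
    return {(x, y)
            for x in range(cx - r, cx + r + 1)
            for y in range(cy - r, cy + r + 1)
            if (x - cx) ** 2 + (y - cy) ** 2 <= r * r}


def surface(mask):
    """Mask pixels that have at least one 4-neighbour outside the mask."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return {(x, y) for (x, y) in mask
            if any((x + dx, y + dy) not in mask for dx, dy in steps)}


def peritumoral_ring(mask, margin_mm=5.0, spacing_mm=1.0):
    """Pixels within margin_mm of the tumor surface, on either side of it."""
    reach = math.ceil(margin_mm / spacing_mm)
    ring = set()
    for (sx, sy) in surface(mask):
        for dx in range(-reach, reach + 1):
            for dy in range(-reach, reach + 1):
                if math.hypot(dx, dy) * spacing_mm <= margin_mm:
                    ring.add((sx + dx, sy + dy))
    return ring


def dice(a, b):
    """Dice coefficient 2|A∩B| / (|A| + |B|) between two pixel sets."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Two readers outline the same lesion; the second contour is shifted by 1 mm.
ring1 = peritumoral_ring(disk(20, 20, 10))
ring2 = peritumoral_ring(disk(21, 20, 10))
print(round(dice(ring1, ring2), 3))
```

A Dice coefficient of 1 indicates identical ROIs, and values fall toward 0 as the two segmentations diverge.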
Radiomic features were extracted from the peritumoral ROIs using PyRadiomics (version 3.0), supported by the Image Biomarker Standardization Initiative (IBSI) [19]. All slices were resampled to 1 × 1 mm² in the horizontal and vertical directions before feature extraction. The features comprised 14 shape features, 18 first-order features, 22 gray-level co-occurrence matrix (GLCM) features, 14 gray-level dependence matrix (GLDM) features, 16 gray-level size zone matrix (GLSZM) features, 16 gray-level run length matrix (GLRLM) features, and 5 neighboring gray tone difference matrix (NGTDM) features. In addition to the original image, six feature classes (first-order, GLCM, GLDM, GLSZM, GLRLM, and NGTDM) were also computed on images processed with Laplacian of Gaussian (LoG) filters and coiflet wavelet filters. In total, 1288 features were extracted from each ROI. A complete list of radiomic features is provided in Table E2.
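The reported total can be checked with a short calculation. The sketch below assumes that the 14 shape features are computed once on the original image and that the other six classes are computed once per image type (original plus each filtered image). The number of filtered images is not stated in the text; it is inferred here from the total.

```python
# Per-class feature counts as reported in the text.
per_class = {
    "shape": 14, "first-order": 18, "GLCM": 22, "GLDM": 14,
    "GLSZM": 16, "GLRLM": 16, "NGTDM": 5,
}
total_reported = 1288

shape_only = per_class["shape"]                      # assumed: original image only
filterable = sum(per_class.values()) - shape_only    # six classes repeated per image type

image_types, remainder = divmod(total_reported - shape_only, filterable)
print(filterable, image_types, remainder)  # 91 14 0
```

Under these assumptions the total corresponds to 14 image types, that is, the original image plus 13 LoG- or wavelet-filtered images (14 + 91 × 14 = 1288).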
Development of a model
The patient cohort was randomly divided into training and test cohorts at a 3:2 ratio, stratified by two factors (the presence of STAS and the nodule type). We developed two models for the prediction of STAS in the present study (Fig. 2). One was a machine learning model based on peritumoral radiomic features (peritumoral radiomics model). The other was a logistic regression model based on the tumor C/T ratio (C/T ratio model).
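A stratified 3:2 split of this kind can be sketched as follows. This is a minimal standard-library illustration, not the authors' code: the function `stratified_split`, the record fields, and the synthetic cohort are all hypothetical, and each stratum is simply shuffled and cut at 60%.

```python
import random
from collections import defaultdict


def stratified_split(records, strata_keys, train_fraction=0.6, seed=0):
    """Shuffle within each stratum and send train_fraction of it to training."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[k] for k in strata_keys)].append(rec)
    train, test = [], []
    for key in sorted(groups):          # fixed stratum order for reproducibility
        members = groups[key]
        rng.shuffle(members)
        cut = round(len(members) * train_fraction)
        train.extend(members[:cut])
        test.extend(members[cut:])
    return train, test


# Synthetic cohort of 100 patients (illustrative values only).
types = ["solid", "part-solid", "ground-glass"]
cohort = [{"id": i, "stas": i % 4 == 0, "nodule": types[i % 3]} for i in range(100)]

train_set, test_set = stratified_split(cohort, strata_keys=("stas", "nodule"))
print(len(train_set), len(test_set))  # 60 40
```

Splitting inside each stratum keeps the proportions of STAS-positive cases and of each nodule type similar in the two cohorts.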
Before developing the peritumoral radiomics model, we selected non-redundant and reproducible features. Intraclass correlation coefficients (ICCs) were calculated from ROIs segmented independently by two researchers to quantify the interobserver reproducibility of the radiomic features. Features with an ICC > 0.75 were considered reproducible [20]. To remove redundant features, the absolute values of the pairwise Spearman correlation coefficients were calculated in the training set, and features with |ρ| > 0.7 were treated as redundant.
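The two filtering steps can be sketched as follows. This is a minimal standard-library illustration under stated assumptions: the ICC values are supplied as precomputed placeholders, redundancy is resolved greedily by keeping the first feature of each correlated pair (the text does not say which one is retained), and the names `select_features`, `f1` to `f4` are hypothetical.

```python
import math


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rho = Pearson correlation of the ranks (undefined for constant input)."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def select_features(features, icc, icc_min=0.75, rho_max=0.7):
    """Keep reproducible features, then greedily drop those correlated with a kept one."""
    reproducible = [name for name in features if icc[name] > icc_min]
    kept = []
    for name in reproducible:
        if all(abs(spearman(features[name], features[k])) <= rho_max for k in kept):
            kept.append(name)
    return kept


# Toy training-set values and placeholder ICCs (illustrative only).
features = {
    "f1": [1, 2, 3, 4, 5, 6],
    "f2": [2, 4, 6, 8, 10, 12],   # monotone in f1, so rho = 1
    "f3": [3, 1, 6, 2, 5, 4],
    "f4": [9, 7, 8, 1, 3, 2],
}
icc = {"f1": 0.92, "f2": 0.95, "f3": 0.81, "f4": 0.60}
print(select_features(features, icc))  # ['f1', 'f3']
```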
Finally, the peritumoral radiomics model was developed using the least absolute shrinkage and selection operator (LASSO) classification algorithm (scikit-learn version 0.22.1 in Python). The regularization parameters for LASSO were tuned using fivefold cross-validation on the training dataset. The regularization parameters that yielded the highest AUC in the fivefold cross-validation were used to create the final model.
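The tuning procedure can be sketched as an L1-penalized logistic regression whose penalty strength is chosen by fivefold cross-validated AUC. The study used scikit-learn; to stay self-contained, this illustration instead fits the model with a simple proximal gradient loop in pure Python. The synthetic data, the penalty grid, the step size, and all function names are assumptions made for the example.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z)) if z >= 0 else math.exp(z) / (1.0 + math.exp(z))


def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink toward zero and clip at zero."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def fit_l1_logistic(X, y, lam, lr=0.5, n_iter=200):
    """Proximal gradient descent on mean log-loss + lam * ||w||_1 (intercept unpenalized)."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        resid = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, row))) - yi
                 for row, yi in zip(X, y)]
        grad_w = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)]
        b -= lr * sum(resid) / n
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, grad_w)]
    return w, b


def auc(scores, labels):
    """AUC as the fraction of positive-negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((sp > sn) + 0.5 * (sp == sn) for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))


def cv_auc(X, y, lam, k=5):
    """Mean held-out AUC over k folds; folds are filled round-robin within each class."""
    folds = [[] for _ in range(k)]
    for label in (0, 1):
        members = [i for i, yi in enumerate(y) if yi == label]
        for position, i in enumerate(members):
            folds[position % k].append(i)
    aucs = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        w, b = fit_l1_logistic([X[i] for i in train], [y[i] for i in train], lam)
        scores = [b + sum(wj * xj for wj, xj in zip(w, X[i])) for i in held_out]
        aucs.append(auc(scores, [y[i] for i in held_out]))
    return sum(aucs) / k


# Synthetic training data: feature 0 carries signal, features 1 and 2 are noise.
rng = random.Random(0)
y = [i % 2 for i in range(60)]
X = [[yi + rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)] for yi in y]

lambdas = [0.3, 0.1, 0.03, 0.01]                      # hypothetical penalty grid
best_lam = max(lambdas, key=lambda lam: cv_auc(X, y, lam))
w, b = fit_l1_logistic(X, y, best_lam)                # refit on the full training set
print(best_lam, [round(wj, 3) for wj in w])
```

The soft-thresholding step sets small coefficients exactly to zero, which is what allows the penalty to act as a feature selector.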
Predictive performance evaluation
The predictive performance of the models was compared using the area under the receiver operating characteristic curve (AUC). The 95% confidence interval (CI) for the AUC was calculated by bootstrapping with 2000 iterations. For the comparison, the P value was calculated by the DeLong method using the pROC package (version 1.16.1) in R. The threshold value dividing the cohort into high-risk and low-risk STAS groups was selected to maximize the Youden index for each model.
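The AUC, its bootstrap confidence interval, and the Youden-based threshold can be sketched as follows. This is a minimal standard-library illustration, not the pROC implementation: it assumes a percentile bootstrap, classifies a case as high risk when its score is at or above the threshold, and uses made-up scores. The DeLong test is not reproduced here.

```python
import random


def auc(scores, labels):
    """AUC as the fraction of positive-negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((sp > sn) + 0.5 * (sp == sn) for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(scores, labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI; resamples lacking one of the classes are redrawn."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def youden_threshold(scores, labels):
    """Threshold t maximizing J = sensitivity + specificity - 1 for the rule score >= t."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_t, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= t)
        tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < t)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Made-up model outputs and STAS labels (1 = STAS present).
scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 0, 1]

print(auc(scores, labels))                   # 0.8
print(bootstrap_auc_ci(scores, labels))
print(youden_threshold(scores, labels)[0])   # 0.35
```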
Follow-up and survival periods were calculated from the day of surgery. The cumulative incidence of recurrence (CIR) was calculated using a competing risk analysis, with death without recurrence treated as a competing event. Differences in CIR between groups were tested with Gray’s test using the cmprsk package (version 2.2-9) in R. The significance level was set at P