Downloaded from: https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_breast_cancer.html Which is same as: https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic) 5. Number of instances: 569 6. Number of attributes: 32 (ID, diagnosis, 30 real-valued input features) 7. Attribute information 1) ID number 2) Diagnosis (M = malignant, B = benign) 3-32) Ten real-valued features are computed for each cell nucleus: a) radius (mean of distances from center to points on the perimeter) b) texture (standard deviation of gray-scale values) c) perimeter d) area e) smoothness (local variation in radius lengths) f) compactness (perimeter^2 / area - 1.0) g) concavity (severity of concave portions of the contour) h) concave points (number of concave portions of the contour) i) symmetry j) fractal dimension ("coastline approximation" - 1) Several of the papers listed above contain detailed descriptions of how these features are computed. The mean, standard error, and "worst" or largest (mean of the three largest values) of these features were computed for each image, resulting in 30 features. For instance, field 3 is Mean Radius, field 13 is Radius SE, field 23 is Worst Radius. All feature values are recoded with four significant digits. # 569 × 32 # 1 id # 2 diagnosis <--- Response Variable. M=malignant, B=benign # 3 mean_radius (mean of distances from center to points on the perimeter) # 4 mean_texture # 5 mean_perimeter # 6 mean_area # 7 mean_smoothness # 8 mean_compactness # 9 mean_concavity # 10 mean_concave_points # 11 mean_symmetry # 12 mean_fractal dimension # 13 radius_error # 14 texture_error (standard deviation of gray-scale values) # 15 perimeter_error # 16 area_error # 17 smoothness_error (local variation in radius lengths) # 18 compactness_error (perimeter^2 / area - 1.0) # 19 concavity_error (severity of concave portions of the contour) # 20 concave_points_error (number of concave portions of the contour) # 21 symmetry_error # 22 fractal_dimension_error ("coastline approximation" - 1) # 23 worst_radius # 24 worst_texture # 25 worst_perimeter # 26 worst_area # 27 worst_smoothness # 28 worst_compactness # 29 worst_concavity # 30 worst_concave_points # 31 worst_symmetry # 32 worst_fractal_dimension 8. Missing attribute values: none 9. Class distribution: 357 benign, 212 malignant ---------- library(tidyverse) DF <- read_csv("https://nmimoto.github.io/datasets/bcancer.csv") DF