Building a neural network with TensorFlow

TensorFlow is an open-source machine learning framework. Its core is written in C++ and it exposes a Python API, including Keras, for building and training machine learning models.

The formal neuron

The neural network

Imports

import tensorflow as tf
from tensorflow import keras

Data

import pandas as pd

DATA_URL = "https://github.com/joekakone/datasets/raw/master/datasets/Prostate_Cancer.csv"

data = pd.read_csv(DATA_URL)
data.head()
y = data["diagnosis_result"]
X = data.drop(["id", "diagnosis_result"], axis=1)

n_samples, n_features = X.shape
print("Total samples:", n_samples)
print("Total n_features:", n_features)
Total samples: 100
Total n_features: 8

Label encoding

from sklearn.preprocessing import LabelEncoder
encoder = LabelEncoder()
encoder.fit(y)
y = encoder.transform(y)
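
To make the encoding step concrete, here is a minimal sketch that reproduces the same kind of mapping with NumPy. The toy labels "M" and "B" are assumptions for illustration, not values read from the dataset. Like LabelEncoder, np.unique sorts the distinct classes and replaces each label with the index of its class.

```python
import numpy as np

# Toy labels, assumed for illustration only (not read from the dataset).
labels = np.array(["M", "B", "B", "M", "B"])

# np.unique sorts the distinct classes; return_inverse gives, for each label,
# the index of its class in that sorted array.
classes, codes = np.unique(labels, return_inverse=True)

print(classes)  # the sorted classes
print(codes)    # the integer code of each label
```

Because the classes are sorted, the class that comes first alphabetically gets code 0. This is worth checking before interpreting the sigmoid output as the probability of a given diagnosis.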

Splitting the data

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, 
                                                    stratify=y, 
                                                    test_size=0.2, 
                                                    random_state=42)

Feature standardization

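from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
# Fit on the training set only, so no statistics leak from the test set
scaler.fit(X_train)
# Apply the same training-set mean and standard deviation to both splits
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)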
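
As a rough illustration of what this step computes, the sketch below standardizes toy data by hand with NumPy. The random arrays and the names X_train_toy and X_test_toy are assumptions made for the example; they are not the tutorial's data. The point is that the mean and standard deviation come from the training rows and are then reused unchanged on the test rows.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data, assumed for illustration: 80 training rows, 20 test rows, 8 features.
X_train_toy = rng.normal(loc=50.0, scale=10.0, size=(80, 8))
X_test_toy = rng.normal(loc=50.0, scale=10.0, size=(20, 8))

# Per-feature statistics are estimated on the training rows only.
mean = X_train_toy.mean(axis=0)
std = X_train_toy.std(axis=0)  # population standard deviation (ddof=0)

# The same statistics are applied to both splits.
X_train_toy_scaled = (X_train_toy - mean) / std
X_test_toy_scaled = (X_test_toy - mean) / std

print(X_train_toy_scaled.mean(axis=0).round(6))  # close to 0 for every feature
print(X_train_toy_scaled.std(axis=0).round(6))   # close to 1 for every feature
```

The scaled test set will generally not have exactly zero mean and unit variance, which is expected: it is measured against the training distribution.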

Model design

# A single Dense unit with a sigmoid activation; Sequential takes a list of layers
model = keras.Sequential([keras.layers.Dense(units=1, 
                                             activation="sigmoid", 
                                             input_shape=(n_features,))])
# summary() prints the table itself and returns None, so no print() is needed
model.summary()
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 dense (Dense)               (None, 1)                 9         

=================================================================
Total params: 9
Trainable params: 9
Non-trainable params: 0
_________________________________________________________________
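
The summary above reports 9 parameters: one weight per input feature (8) plus one bias. The following NumPy sketch shows what a single sigmoid unit of this kind computes. The random weights and the input vector are assumptions for illustration; they are not the values Keras initializes or learns.

```python
import numpy as np

def sigmoid(z):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

n_inputs = 8  # number of features reported earlier in the tutorial
rng = np.random.default_rng(0)

# Hypothetical parameters, for illustration only.
weights = rng.normal(size=n_inputs)
bias = 0.0

# One hypothetical (already standardized) input sample.
x = rng.normal(size=n_inputs)

# Weighted sum of the inputs plus the bias, passed through the sigmoid.
z = np.dot(weights, x) + bias
output = sigmoid(z)

n_params = weights.size + 1  # 8 weights + 1 bias
print(n_params)
print(output)  # a value strictly between 0 and 1
```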

Training configuration

model.compile(loss="binary_crossentropy", 
              optimizer=keras.optimizers.SGD(learning_rate=0.1), 
              metrics=["accuracy"])
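
To give an idea of what the loss and optimizer chosen here do, the sketch below computes a binary cross-entropy and applies one gradient descent step by hand on a two-sample toy problem. Everything in it is an assumption for illustration: the data, the zero initialization, the clipping constant eps, and the use of the full batch (Keras works on mini-batches and has its own initialization). Only the learning rate of 0.1 is taken from the text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy; predictions are clipped to avoid log(0)."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(y_pred)
                          + (1.0 - y_true) * np.log(1.0 - y_pred)))

# Toy problem, assumed for illustration: one feature, two samples.
X_toy = np.array([[1.0], [-1.0]])
y_toy = np.array([1.0, 0.0])

w = np.zeros(1)  # hypothetical initial weight
b = 0.0          # hypothetical initial bias
learning_rate = 0.1

# Forward pass and loss before the update.
p = sigmoid(X_toy @ w + b)
loss_before = binary_crossentropy(y_toy, p)

# Gradient of the mean loss for a sigmoid unit: X^T (p - y) / n.
grad_w = X_toy.T @ (p - y_toy) / len(y_toy)
grad_b = float(np.mean(p - y_toy))

# Gradient descent update: move against the gradient.
w = w - learning_rate * grad_w
b = b - learning_rate * grad_b

loss_after = binary_crossentropy(y_toy, sigmoid(X_toy @ w + b))
print(loss_before, loss_after)  # the loss decreases after the step
```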

Training

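# validation_data is documented as a tuple (x_val, y_val);
# here the test split doubles as the validation set
history = model.fit(X_train_scaled, y_train, 
                    validation_data=(X_test_scaled, y_test), 
                    epochs=20)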

Reporting

After training, it is important to plot the metric curves.

import matplotlib.pyplot as plt

acc = history.history['accuracy']
loss = history.history['loss']
val_acc = history.history['val_accuracy']
val_loss = history.history['val_loss']
epochs = range(1, len(acc) + 1)

plt.figure(figsize=(8, 4))
plt.plot(epochs, acc, 'bo', label='Training Accuracy')
plt.plot(epochs, val_acc, 'b', label='Validation Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.title('Training and validation accuracy')
plt.legend(loc='best')
plt.show()

plt.figure(figsize=(8, 4))
plt.plot(epochs, loss, 'bo', label='Training Loss')
plt.plot(epochs, val_loss, 'b', label='Validation Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Training and validation loss')
plt.legend(loc='best')
plt.show()

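These curves show whether the model generalizes. If the validation loss stays close to the training loss, it does; if the validation loss rises while the training loss keeps falling, the model is overfitting.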
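
The same reading can be done numerically. The helper below is a hypothetical sketch, not part of Keras: summarize_history and the toy values are assumptions for illustration. It expects a dictionary shaped like history.history, with "loss" and "val_loss" lists, and reports the epoch with the lowest validation loss together with the final gap between validation and training loss.

```python
def summarize_history(history_dict):
    """Return (best_epoch, final_gap).

    best_epoch is the 1-based epoch with the lowest validation loss;
    final_gap is validation loss minus training loss at the last epoch.
    """
    loss = history_dict["loss"]
    val_loss = history_dict["val_loss"]
    best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
    final_gap = val_loss[-1] - loss[-1]
    return best_epoch, final_gap

# Made-up values for illustration, not results from the tutorial's model.
toy_history = {
    "loss": [0.70, 0.55, 0.45, 0.38, 0.33],
    "val_loss": [0.72, 0.60, 0.56, 0.58, 0.63],
}

best_epoch, final_gap = summarize_history(toy_history)
print(best_epoch, round(final_gap, 2))
```

In this toy example the validation loss bottoms out at epoch 3 and then climbs while the training loss keeps falling, which is the overfitting pattern described above. With a real run, the call would be summarize_history(history.history).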