import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

# Load the dataset
data = pd.read_csv("credit_card_data.csv")

# Preprocessing the data
scaler = StandardScaler()
data["Amount"] = scaler.fit_transform(data["Amount"].values.reshape(-1, 1))

# Splitting the data into training and testing sets
X = data.drop("Class", axis=1).values
y = data["Class"].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Building the neural network model
model = Sequential()
model.add(Dense(128, input_dim=X_train.shape[1], activation="relu"))
model.add(Dropout(0.5))
model.add(Dense(64, activation="relu"))
model.add(Dropout(0.5))
model.add(Dense(1, activation="sigmoid"))

# Compiling the model
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

# Training the model
model.fit(X_train, y_train, epochs=10, batch_size=32, validation_data=(X_test, y_test))

# Evaluating the model
_, accuracy = model.evaluate(X_test, y_test)
print("Accuracy:", accuracy)

In this example, the code assumes that you have a CSV file named "credit_card_data.csv" containing your credit card transaction data, with the "Class" column denoting whether the transaction is fraudulent or not (1 for fraud, 0 for non-fraud). Make sure to replace the file path with the actual path to your dataset.
The code begins by loading the dataset and performing basic preprocessing, such as scaling the "Amount" feature using StandardScaler. Then, the data is split into training and testing sets using train_test_split from scikit-learn.
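Note that the code above fits the scaler on the full "Amount" column before splitting, so statistics from the test rows influence the scaling. A common refinement is to compute the mean and standard deviation from the training rows only and reuse them for the test rows. The following is a minimal sketch of that arithmetic in plain Python; the helper names and the sample amounts are hypothetical and are not part of the original code.

```python
import statistics

def fit_standardizer(values):
    """Return (mean, std) computed from the given values only."""
    mean = statistics.fmean(values)
    # Population standard deviation (ddof=0), as StandardScaler uses.
    # Fall back to 1.0 for a constant column to avoid dividing by zero.
    std = statistics.pstdev(values) or 1.0
    return mean, std

def standardize(values, mean, std):
    """Apply z = (x - mean) / std with previously fitted statistics."""
    return [(v - mean) / std for v in values]

# Hypothetical transaction amounts, not taken from any real dataset.
train_amounts = [10.0, 20.0, 30.0, 40.0]
test_amounts = [25.0, 100.0]

# Fit on the training amounts only, then reuse the same statistics.
mean, std = fit_standardizer(train_amounts)
train_scaled = standardize(train_amounts, mean, std)
test_scaled = standardize(test_amounts, mean, std)

print(mean, std)
print(test_scaled)
```

With scikit-learn, the same idea would be to split first, call fit_transform on the training column, and call transform on the test column.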
Next, a sequential neural network model is built using the Keras API. The model consists of two hidden dense layers (128 and 64 units) with ReLU activation, each followed by a dropout layer with rate 0.5 for regularization. The final dense layer has a single unit with a sigmoid activation function, which outputs a probability between 0 and 1 indicating the likelihood of fraud.
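To get a feel for the size of this network, the trainable parameters of the Dense(128), Dense(64), and Dense(1) layers can be counted by hand: each dense layer has one weight per input-unit pair plus one bias per unit, and dropout layers add no parameters. The sketch below does that arithmetic; the feature count of 30 is an assumption for illustration only, since the real value depends on the columns in your CSV.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

def model_params(n_features):
    """Total trainable parameters for the 128 -> 64 -> 1 network above."""
    hidden1 = dense_params(n_features, 128)
    hidden2 = dense_params(128, 64)
    output = dense_params(64, 1)
    return hidden1 + hidden2 + output

# Hypothetical number of input columns; replace with X_train.shape[1].
n_features = 30
total = model_params(n_features)
print(total)  # 3968 + 8256 + 65 = 12289
```

Calling model.summary() on the Keras model should report per-layer counts that can be compared against this calculation.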
The model is compiled with binary cross-entropy loss and the Adam optimizer. Then, it is trained on the training data using the fit method. The number of epochs (10 here) and the batch size (32 here) can be adjusted according to your dataset and computational resources.
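When adjusting these two settings, it can help to know how many weight updates they imply: each epoch runs one update per batch, including a final partial batch. The sketch below works through that arithmetic; the training-set size of 8000 rows is a hypothetical figure, not a property of any particular dataset.

```python
import math

def updates_per_epoch(n_train, batch_size):
    """Number of batches in one pass over the training data."""
    return math.ceil(n_train / batch_size)

# Hypothetical training-set size; use len(X_train) for your data.
n_train = 8000
batch_size = 32
epochs = 10

steps = updates_per_epoch(n_train, batch_size)
total_updates = steps * epochs
print(steps, total_updates)  # 250 2500
```

Doubling the batch size halves the number of updates per epoch, which usually shortens each epoch but may call for more epochs to reach the same loss.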
Finally, the model is evaluated on the testing data with model.evaluate, which returns the loss followed by the accuracy; only the accuracy is printed.
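Because fraud is typically a small fraction of all transactions, accuracy alone can look excellent even when no fraud is caught. The sketch below computes precision and recall alongside accuracy from 0/1 labels in plain Python. The function names and the sample labels are hypothetical, chosen only to illustrate the effect.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for 0/1 labels."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def summarize(y_true, y_pred):
    """Return accuracy, precision, and recall for the fraud class (1)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        # By convention here, report 0.0 when the denominator is zero.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Hypothetical labels: 2 frauds in 1000 transactions, and a model
# that predicts "non-fraud" for every transaction.
y_true = [1, 1] + [0] * 998
y_pred = [0] * 1000

report = summarize(y_true, y_pred)
print(report)
```

In this made-up case the accuracy is 0.998 while the recall is 0.0, meaning every fraud is missed. With the Keras model, 0/1 predictions could be obtained by thresholding the output of model.predict(X_test), for example at 0.5.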
Please note that this is a simplified example. For a real-world credit card fraud detection system, you may need more sophisticated data preprocessing, feature engineering, and hyperparameter tuning. Additionally, you might consider more advanced techniques, such as anomaly detection algorithms, or incorporate other features and models to improve performance.