In this project, I developed machine learning models to predict whether a reported crime would result in an arrest, using historical crime data from the city of Chicago. The dataset contained over 220,000 crime records, each with fields such as crime type, location, date, and whether an arrest was made.
I performed extensive data preprocessing, including handling class imbalance through oversampling, encoding categorical variables, and scaling features. Exploratory Data Analysis (EDA) revealed patterns in arrest outcomes, such as a higher arrest likelihood in domestic incidents and in specific community areas.
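The oversampling step mentioned above can be illustrated with a minimal sketch. It assumes plain random oversampling (duplicating minority-class rows until the classes are balanced); the project may have used a different variant. The function name `random_oversample` and the toy rows are hypothetical and not taken from the project code.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class
    has as many rows as the largest class (hypothetical helper)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Draw (with replacement) as many extra rows as the class is missing.
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy training split with made-up values: (crime type, domestic flag).
train_rows = [
    ("THEFT", 0), ("THEFT", 0), ("BURGLARY", 0),
    ("THEFT", 0), ("BATTERY", 1), ("NARCOTICS", 0),
]
train_labels = [0, 0, 0, 0, 1, 1]  # 1 = arrest made, 0 = no arrest

balanced_rows, balanced_labels = random_oversample(train_rows, train_labels)
print(Counter(balanced_labels))
```

A common practice is to oversample only the training split, after the train/test split has been made, so that duplicated rows cannot leak into the evaluation data.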
Three supervised learning models (Logistic Regression, Random Forest, and a Neural Network implemented as an MLP classifier) were trained and evaluated using accuracy, precision, recall, F1-score, and ROC-AUC. The Neural Network achieved the best overall performance of the three.
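To make the evaluation metrics concrete, the sketch below computes them from scratch on a small set of made-up labels and scores. It is an illustration of the definitions, not the project's evaluation code; the helper names `classification_metrics` and `roc_auc`, the 0.5 decision threshold, and the toy data are all assumptions. In practice, a library such as scikit-learn provides equivalent functions.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for binary labels (1 = arrest)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a random positive example is
    scored higher than a random negative one (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up test labels and predicted arrest probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Because the classes are imbalanced, accuracy alone can look good for a model that rarely predicts an arrest, which is why precision, recall, F1-score, and ROC-AUC are reported alongside it.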
The project demonstrates how predictive analytics can support smarter policing strategies, resource allocation, and public safety planning by identifying patterns that influence arrest outcomes.