Data Analytics
Automated Out: Forecasting Which Careers Face the Greatest AI Disruption
A machine learning research project that predicts AI automation risk across job roles and industries using classification models, feature engineering, and model evaluation visualizations.
Year: 2025
Industry: Workforce / AI / Labor Market Analytics
Client: Academic Research Project (Wilmington College)
Project Duration: 8 weeks

Problem:
Understanding which jobs are most exposed to AI automation is a growing challenge, but much of the discussion remains broad or opinion-based. I wanted to build a data-driven system that classifies automation risk at the job-title level and surfaces clearer patterns across industries.
Another issue was that workforce-related datasets come from different sources in different formats, which makes analysis difficult without careful preprocessing and alignment.
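To illustrate the alignment problem, here is a minimal sketch of merging two sources on a normalized job-title key. The column names and sample rows are hypothetical, not the project's actual schema:

```python
import pandas as pd

# Two hypothetical sources with inconsistent job-title formatting.
risk = pd.DataFrame({
    "Job Title": ["Data Entry Clerk ", "truck driver", "Nurse"],
    "risk_label": ["High", "High", "Low"],
})
market = pd.DataFrame({
    "job_title": ["data entry clerk", "Truck Driver", "nurse"],
    "industry": ["Admin", "Transport", "Healthcare"],
})

def normalize(title: str) -> str:
    """Lowercase, trim, and collapse whitespace so titles align across sources."""
    return " ".join(title.lower().split())

risk["job_title"] = risk["Job Title"].map(normalize)
market["job_title"] = market["job_title"].map(normalize)

# Inner join on the normalized key; rows that fail to match are dropped,
# so it is worth auditing match rates before modeling.
merged = risk.merge(market, on="job_title", how="inner")
print(merged[["job_title", "risk_label", "industry"]])
```

In practice the normalization step usually needs more than lowercasing (abbreviations, plural forms, seniority prefixes), but the merge-on-a-cleaned-key pattern is the same.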

Solution:
I built a classification pipeline on merged, cleaned datasets covering AI risk, job-market indicators, and automation trends. I compared Decision Tree, Random Forest, and XGBoost models, and applied SMOTE to the training data to handle class imbalance.
XGBoost performed best, reaching a macro F1-score of about 0.78. I used visual analysis (ROC curves, feature importances, a confusion matrix, and industry mapping visuals) to make the results easier to interpret.
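The model-comparison loop can be sketched as below. This uses synthetic, deliberately imbalanced data as a stand-in for the real dataset, and only the scikit-learn models (XGBoost would slot into the same loop via `xgboost.XGBClassifier`, omitted here to keep the sketch self-contained):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in: 3 risk classes with a skewed class distribution.
X, y = make_classification(
    n_samples=1000, n_features=12, n_informative=6,
    n_classes=3, weights=[0.7, 0.2, 0.1], random_state=42,
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, stratify=y, test_size=0.25, random_state=42,
)

models = {
    "decision_tree": DecisionTreeClassifier(random_state=42),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=42),
}
scores = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    # Macro F1 averages per-class F1 with equal weight, so small
    # minority classes are not drowned out by the majority class.
    scores[name] = f1_score(y_te, model.predict(X_te), average="macro")
print(scores)
```

Macro averaging is the key choice here: with imbalanced risk categories, plain accuracy or micro-averaged F1 would mostly reflect performance on the dominant class.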


Challenge:
The biggest challenge was class imbalance: some job risk categories had very few examples. SMOTE helped, but I had to be careful not to overstate results, since oversampling can create patterns that do not fully match real-world behavior.
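To show the core idea behind the oversampling step, here is a minimal NumPy sketch of what SMOTE does: synthesize new minority-class points by interpolating between a minority sample and one of its nearest minority-class neighbors. This is a simplification of the actual algorithm (the project used a library implementation), and it should only ever be applied to training folds so synthetic points never leak into evaluation:

```python
import numpy as np

def smote_sketch(X_min: np.ndarray, n_new: int, k: int = 3, seed: int = 0) -> np.ndarray:
    """Generate n_new synthetic minority samples by linear interpolation
    between a random minority sample and a random one of its k nearest
    minority neighbors (Euclidean distance). Simplified SMOTE."""
    rng = np.random.default_rng(seed)
    # Pairwise distances within the minority class only.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)            # a point is not its own neighbor
    nn = np.argsort(d, axis=1)[:, :k]      # indices of each point's k nearest neighbors
    base = rng.integers(0, len(X_min), n_new)
    neigh = nn[base, rng.integers(0, k, n_new)]
    gap = rng.random((n_new, 1))           # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[neigh] - X_min[base])

# Minority class with 5 samples; synthesize 10 more along neighbor segments.
X_min = np.array([[0.0, 0.0], [1, 0], [0, 1], [1, 1], [0.5, 0.5]])
X_new = smote_sketch(X_min, n_new=10)
print(X_new.shape)  # (10, 2)
```

Because every synthetic point lies on a segment between two real minority points, SMOTE densifies the minority region rather than inventing points elsewhere, which is exactly why it can also manufacture overly clean patterns the evaluation must not see.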
Another challenge was making the model useful beyond accuracy scores. I focused on which features mattered most and where the model performed better or worse across classes, so the results would make sense to both technical and non-technical audiences.
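The two interpretation tools mentioned above can be sketched together: a per-class report to show where the model struggles, and impurity-based feature importances to show what drove its splits. The data here is synthetic and the fit is evaluated on its own training set purely for brevity:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

# Synthetic imbalanced 3-class problem standing in for the risk labels.
X, y = make_classification(n_samples=600, n_features=8, n_informative=4,
                           n_classes=3, weights=[0.6, 0.3, 0.1], random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# Per-class precision/recall/F1 shows *where* the model performs
# better or worse, not just one overall accuracy number.
print(classification_report(y, model.predict(X), digits=2))

# Impurity-based importances (they sum to 1): which features drove the splits.
ranked = sorted(enumerate(model.feature_importances_),
                key=lambda t: t[1], reverse=True)
for idx, imp in ranked[:3]:
    print(f"feature_{idx}: {imp:.3f}")
```

For a non-technical audience, the ranked-importance list usually communicates more than the report table, since it answers "what does the model look at?" directly.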
Summary:
This project shows my ability to take a real-world question, build a full analytics workflow, and turn model outputs into meaningful insights. It combines data cleaning, feature engineering, machine learning, evaluation, and data storytelling in one end-to-end project.
It also reflects how I think about practical impact, not just technical performance, especially when analytics touches careers, policy, and workforce planning.
