Data Analytics

Automated Out: Forecasting Which Careers Face the Greatest AI Disruption

A machine learning research project that predicts AI automation risk across job roles and industries using classification models, feature engineering, and model evaluation visualizations.

Year :

2025

Industry :

Workforce / AI / Labor Market Analytics

Client :

Academic Research Project (Wilmington College)

Project Duration :

8 weeks

Problem :

Which jobs are most exposed to AI automation is a pressing question, but much of the discussion remains broad or opinion-based. I wanted to build a data-driven system that could classify automation risk at the job-title level and reveal clearer patterns across industries.

Another issue was that workforce-related datasets often come from different sources and formats, which makes analysis harder without strong preprocessing and alignment.
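To illustrate the kind of alignment this involves, here is a minimal sketch that normalizes job titles before joining two sources. The field names (`job_title`, `risk_label`, `industry`) and the sample rows are hypothetical, chosen only for the example; they are not the project's actual schema or data.

```python
import re

def normalize_title(title: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", title.lower())
    return " ".join(cleaned.split())

def merge_by_title(risk_rows, market_rows):
    """Inner-join two lists of dicts on the normalized job title."""
    market_by_key = {normalize_title(r["job_title"]): r for r in market_rows}
    merged = []
    for row in risk_rows:
        key = normalize_title(row["job_title"])
        match = market_by_key.get(key)
        if match is not None:  # keep only titles present in both sources
            merged.append({
                "job_title": key,
                "risk_label": row["risk_label"],
                "industry": match["industry"],
            })
    return merged

# Hypothetical rows from two differently formatted sources.
risk_rows = [
    {"job_title": "Data-Entry Clerk", "risk_label": "high"},
    {"job_title": "Registered Nurse ", "risk_label": "low"},
]
market_rows = [
    {"job_title": "data entry clerk", "industry": "Administration"},
    {"job_title": "REGISTERED NURSE", "industry": "Healthcare"},
    {"job_title": "Welder", "industry": "Manufacturing"},
]

merged = merge_by_title(risk_rows, market_rows)
for row in merged:
    print(row)
```

Titles that appear in only one source (such as "Welder" here) are dropped by the inner join, so in practice it is worth counting the unmatched rows before trusting the merged table.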

Solution :

I built a classification-based machine learning pipeline using merged and cleaned datasets related to AI risk, job market insights, and automation trends. I compared Decision Tree, Random Forest, and XGBoost models, and used SMOTE to handle class imbalance in the training data.
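The core idea behind SMOTE can be sketched in a few lines: each synthetic example is placed on the line segment between a minority-class sample and one of its nearest minority-class neighbors. The version below is a simplified illustration in plain Python, not the library implementation a real pipeline would normally use, and the toy feature vectors are assumptions made for the example. Note that only the training split is oversampled; any held-out test data would stay untouched.

```python
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbors (SMOTE's core idea).
    Requires at least two distinct minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbor = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
    return synthetic

# Hypothetical training split: 6 majority-class rows, 3 minority-class rows.
train_majority = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3],
                  [0.3, 0.2], [0.2, 0.4], [0.4, 0.1]]
train_minority = [[0.8, 0.9], [0.9, 0.7], [1.0, 1.0]]

# Oversample the minority class until both classes are the same size.
new_points = smote_like(train_minority,
                        n_new=len(train_majority) - len(train_minority))
balanced_minority = train_minority + new_points

print(len(train_majority), len(balanced_minority))
```

Because every synthetic point lies between two existing minority samples, oversampling adds no information beyond what those samples already contain, which is one reason to evaluate only on real, unsampled data.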

XGBoost performed best, achieving a macro F1-score of about 0.78. I used visual analysis (ROC curves, feature importance, a confusion matrix, and industry mapping visuals) to make the results easier to interpret.
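For readers unfamiliar with the metric, macro F1 is the unweighted mean of the per-class F1 scores, so a rare class counts as much as a common one. The sketch below computes it from scratch on a small made-up set of labels; the three class names and the predictions are assumptions for illustration and are unrelated to the project's actual results.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical imbalanced labels: 6 low, 3 medium, 1 high.
y_true = ["low"] * 6 + ["medium"] * 3 + ["high"]
y_pred = ["low"] * 6 + ["medium", "medium", "low"] + ["medium"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred)
print(f"accuracy={accuracy:.2f}  macro_f1={score:.2f}")
```

In this toy case accuracy is 0.80 while macro F1 is roughly 0.53, because the single "high" example is missed entirely. That gap is why macro F1 is a more honest headline number than accuracy when classes are imbalanced.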

Challenge :

One of the biggest challenges was class imbalance: some job risk categories had very few examples. I addressed this with SMOTE, but I also had to be careful not to overstate the results, because SMOTE generates synthetic minority-class examples by interpolating between existing ones, and those examples may not fully match real-world behavior.

Another challenge was making the model useful beyond accuracy scores. I focused on explaining what features mattered most and where the model performed better or worse across classes, so the project could be understood by both technical and non-technical audiences.
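One simple way to see where a classifier does better or worse across classes is to tabulate the confusion counts and read off per-class recall. The following is a minimal sketch using only the standard library; the labels and predictions are invented for the example and do not reproduce the project's confusion matrix.

```python
from collections import Counter

def per_class_recall(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    counts = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    support = Counter(y_true)              # true label -> number of examples
    return {label: counts[(label, label)] / n for label, n in support.items()}

# Hypothetical labels: 6 low-risk, 3 medium-risk, 1 high-risk job title.
y_true = ["low"] * 6 + ["medium"] * 3 + ["high"]
y_pred = ["low"] * 6 + ["medium", "medium", "low"] + ["medium"]

recall = per_class_recall(y_true, y_pred)
for label, value in recall.items():
    print(f"{label:>6}: recall={value:.2f}")
```

A breakdown like this makes it easy to tell a non-technical audience which risk categories the model can be trusted on and which ones it tends to miss.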

Summary :

This project shows my ability to take a real-world question, build a full analytics workflow, and turn model outputs into meaningful insights. It combines data cleaning, feature engineering, machine learning, evaluation, and data storytelling in one end-to-end project.

It also reflects how I think about practical impact, not just technical performance, especially when analytics is connected to careers, policy, and workforce planning.
