TECHNICAL SKILLS
• Languages: Python, R programming, SQL, Java, C
• Databases: MySQL, PostgreSQL, Microsoft SQL Server, Oracle Database
• Analytical: Classification, regression, predictive modeling, clustering, NLP, statistics, A/B testing
• Software: pandas, sklearn, nltk, TensorFlow, numpy, scipy, ggplot2, dplyr, Apache Spark, Tableau, Excel, Google Analytics
WORK EXPERIENCE
Data Scientist – Intern | Webster Bank | New Britain, CT Jun ’18 – Aug ’18
• Built a multi-class text classifier using Support Vector Machines to label customer feedback, resulting in 50% faster responses
• Designed a Tableau dashboard to display the expense per banking channel and transaction volumes through each channel
• Mined customer data to investigate predicted incomes and produced graphs of error percentages that invalidated the predictor
• Wrote complex SQL queries to extract over 10 million records from multiple databases for analysis
Tools used: Python, pandas, R programming, SQL, Tableau, Excel, sklearn, dplyr, ggplot2, classification, data visualization
Data Scientist – Graduate Researcher | Syracuse University | Syracuse, NY Aug ’17 – Present
Language Adoption in a Citizen Science Forum
• Designed a research project to analyze interaction logs and to model language adoption by a forum community
• Created a language model by cleaning text and applying TF-IDF to create a lexicon of project-specific terms
• Grouped users using k-means clustering to analyze the linguistic contribution of users with similar characteristics
• Identified that newcomers brought in new terminology; presented findings at research events including CSCW 2018
Detecting the Driving Factors Behind Successful Classification Tasks in Citizen Science
• Analyzed logs of click-data to model the behavior of users performing classification tasks in a citizen science project
• Constructed activity models for each user by analyzing their timestamps and correlating them to a classification
• Developed a logistic regression model to predict the outcome of a classification using the activity models as predictors
• Presented findings at research events and initiated a project to study the importance of each activity to the outcome
Tools used: Python, pandas, sklearn, R programming, ggplot2, dplyr, regression, classification, data visualization, clustering
ACADEMIC PROJECTS
Statistics in Information Science – Case Study on Vaccination Rates of Kindergartens in California
• Performed time series and changepoint analysis to show a significant change in vaccination rates beginning in 2013
• Conducted hypothesis testing to show a strong correlation between vaccination rates and religion-based exemptions
• Ran Bayesian and frequentist ANOVA tests to show that public schools had higher vaccination rates than private ones
Advanced Analytics – Predicting Mortgage Approval
• Built a machine learning system to predict mortgage approval decisions and study biases in the process
• Trained logistic regression and random forest models to predict approval with over 85% accuracy
• Used linear regression to impute missing data and visualized feature coefficients to prove a lack of bias
Applied Data Science – Net Promoter Analysis for the Hyatt Group
• Analyzed 17 million records of feedback to study the likelihood of a customer recommending the hotel
• Assessed trends to characterize detractors using graphs, association-rule mining and linear regression
• Simulated regression to show that improved personal services for business travelers would convert 80% of the detractors
Database Management – Building a Repository of Academic Publications
• Designed and built a SQL database to manage modification of and access to a repository of academic publications
• Created triggers to send email alerts to users when papers were added to their subscribed areas of interest
• Produced reports using complex queries to highlight changing trends in research areas and reader interest
HONORS
• Awarded Syracuse University’s Faculty Engagement Scholarship and a research role under Prof. Kevin Crowston, PhD
• Winner of Tata Consultancy Services’ coding contest – ranked among the top 1% of 200,000 participants nationwide