Passionate Data Scientist with a proven track record in predictive modeling and data analysis. Skilled in Python, machine learning, and data visualization, I thrive on transforming complex data into actionable insights. My recent projects include a disease prediction model and a player workload analysis platform for cricket. Eager to leverage my expertise in a dynamic team to drive innovation and deliver data-driven solutions.

Hasnain Siddiqui


Available to hire



Skills

Data Science
Data Visualization
AI Data Labelling

Experience Level

Data Science: Expert
Data Visualization: Expert
AI Data Labelling: Intermediate

Language

English: Advanced
Urdu: Fluent

Education

Bachelor of Engineering in Software at Mehran University of Engineering and Technology
November 21, 2021 - November 22, 2025

Qualifications

DataCamp Certified Statistical Analyst
May 1, 2022 - November 15, 2022
Udemy Certified Power BI Expert
December 1, 2022 - April 30, 2023
DataCamp Certified Machine Learning Expert
May 1, 2023 - August 30, 2023
DataCamp Certified Data Analyst
September 1, 2023 - November 30, 2023

Industry Experience

Software & Internet
    Stanford Open Policing Project
    I conducted a comprehensive data analysis of the Stanford Open Policing Project using Python libraries such as pandas, numpy, matplotlib, seaborn, and scipy.stats. Key aspects of the project included:
    Data Cleaning & Preparation: Removed null values, merged date and time columns for better analysis, and converted data types for accuracy.
    Exploratory Data Analysis (EDA): Visualized driver demographics, drug-related stops, and district-wise accident distributions through bar charts, pie charts, and line plots.
    Gender & Violation Analysis: Investigated the impact of gender on arrests, stop outcomes for speeding violations, and calculated violation counts across different districts.
    Time-Based Trends: Analyzed arrest rates by the hour and annual trends for drug-related stops and search-conducted rates.
    Violation Duration: Mapped and visualized the average stop duration for different violation types.
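    The cleaning and time-based steps described above (dropping rows with nulls, merging the date and time columns, then computing an arrest rate per hour) can be sketched as follows. This is only an illustration: the project itself used pandas, whereas this sketch sticks to the Python standard library so it runs without extra dependencies, and the sample rows and field names (`stop_date`, `stop_time`, `violation`, `is_arrested`) are hypothetical rather than taken from the project's data.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows; the field names are assumptions, not the
# project's actual schema.
stops = [
    {"stop_date": "2015-01-04", "stop_time": "08:15", "violation": "Speeding", "is_arrested": False},
    {"stop_date": "2015-01-04", "stop_time": "08:40", "violation": "Equipment", "is_arrested": True},
    {"stop_date": "2015-02-11", "stop_time": "23:05", "violation": "Speeding", "is_arrested": True},
    {"stop_date": "2015-02-12", "stop_time": None, "violation": "Speeding", "is_arrested": False},
    {"stop_date": "2015-03-02", "stop_time": "23:50", "violation": "Other", "is_arrested": True},
]


def combine_date_time(row):
    """Merge the separate date and time strings into one datetime."""
    return datetime.strptime(f"{row['stop_date']} {row['stop_time']}", "%Y-%m-%d %H:%M")


def hourly_arrest_rate(rows):
    """Return {hour: share of stops in that hour that ended in an arrest}."""
    totals = defaultdict(int)
    arrests = defaultdict(int)
    for row in rows:
        # Data cleaning: skip any row that has a missing value.
        if any(value is None for value in row.values()):
            continue
        hour = combine_date_time(row).hour
        totals[hour] += 1
        arrests[hour] += int(row["is_arrested"])
    return {hour: arrests[hour] / totals[hour] for hour in sorted(totals)}


rates = hourly_arrest_rate(stops)
for hour, rate in rates.items():
    print(f"{hour:02d}:00  arrest rate {rate:.2f}")
```

    With pandas, the same idea would typically be a null drop, a combined datetime column, and a group-by on the hour.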
    Breast Cancer Diagnosis Using Logistic Regression
    In this project, I developed a logistic regression model that classifies breast cancer as malignant or benign based on diagnostic data. Key steps:
    Data Preprocessing: Cleaned the dataset by dropping irrelevant columns and handling missing values. Categorical variables were encoded in binary form so the model could use them.
    Exploratory Data Analysis: Used Seaborn and Matplotlib to explore data distributions and relationships, including a correlation matrix heatmap to identify important features.
    Model Training: Split the dataset into training and testing sets, scaled the features with StandardScaler to improve convergence, and fitted a Logistic Regression model.
    Model Evaluation: Achieved an accuracy of 98.2%. Evaluated the model using a confusion matrix, a classification report, and ROC curve analysis, with an AUC score that indicated strong discriminative ability.
    Visualization & Insights: Generated heatmaps and ROC curves that supported the model evaluation and communicated the findings clearly.
    This project demonstrates my proficiency in data preprocessing, visualization, model building, and evaluation using Python libraries such as pandas, numpy, scikit-learn, seaborn, and matplotlib, and my ability to deliver actionable insights and accurate predictions in healthcare analytics.
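    The training steps described above (split the data, standardize the features, fit a logistic regression, score it on held-out samples) can be sketched as follows. This is a minimal illustration under stated assumptions: the project used scikit-learn's StandardScaler and logistic regression, while this sketch reimplements both with the standard library and plain gradient descent so it runs without dependencies. The toy samples are hypothetical and do not come from the project's dataset, so the sketch makes no attempt to reproduce the reported 98.2% accuracy.

```python
import math

# Hypothetical toy data, NOT the project's dataset: two numeric features per
# sample, with label 1 = malignant and 0 = benign.
X = [[14.2, 0.10], [20.1, 0.28], [12.5, 0.07], [19.4, 0.24],
     [13.0, 0.09], [21.3, 0.30], [11.8, 0.06], [18.7, 0.22]]
y = [0, 1, 0, 1, 0, 1, 0, 1]

# Hold out the last two samples for testing.
X_train, y_train = X[:6], y[:6]
X_test, y_test = X[6:], y[6:]


def fit_scaler(rows):
    """Learn per-column mean and standard deviation from the training rows."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    # Fall back to 1.0 for a constant column to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in col) / n) or 1.0
            for col, m in zip(columns, means)]
    return means, stds


def transform(rows, means, stds):
    """Standardize rows with statistics learned on the training set."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(row, weights, bias):
    """Probability that a standardized row belongs to class 1."""
    return sigmoid(sum(w * v for w, v in zip(weights, row)) + bias)


def train(rows, labels, lr=0.1, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    weights, bias, n = [0.0] * len(rows[0]), 0.0, len(rows)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for row, label in zip(rows, labels):
            error = predict_proba(row, weights, bias) - label
            for j, value in enumerate(row):
                grad_w[j] += error * value
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


means, stds = fit_scaler(X_train)
weights, bias = train(transform(X_train, means, stds), y_train)
predictions = [1 if predict_proba(row, weights, bias) >= 0.5 else 0
               for row in transform(X_test, means, stds)]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"test accuracy: {accuracy:.2f}")
```

    The scaler is fitted on the training rows only and then applied to the test rows, which keeps test-set statistics from leaking into training.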
