SHAN Portfolio

About me

As a Data Science and AI student, I combine a solid technical foundation with a passion for innovation in artificial intelligence. My background in game development and professional experience in web development have given me a diverse skill set for tackling complex challenges across domains. From data analysis and machine learning to building intelligent systems, I, Shan Oomes, focus on applying AI techniques to real-world problems while continually exploring new frontiers in automation, data-driven decision-making, and algorithmic efficiency.

Currently, I am responsible for onboarding new self-employed healthcare professionals at PIDZ, where attention to detail and process efficiency are key. Previously, as a junior programmer at Flexpulse, I developed both front-end and back-end features, working with Python, PHP, and SQL to extend the software's functionality.

I am passionate about AI, machine learning, and data analysis, and I’m eager to apply these tools to solve real-world problems. Whether it's working on a complex algorithm or collaborating on user-friendly software solutions, I am driven by the desire to continuously learn and contribute in impactful ways.

Projects

Some of the projects I have worked on.

Automated Plant Phenotyping with Computer Vision and Robotics

Machine Learning
Robotics
Computer Vision
Plant Phenotyping

For this project, I collaborated with the Netherlands Plant Eco-phenotyping Center (NPEC) through Breda University of Applied Sciences. The primary objective was to advance plant phenotyping by integrating computer vision and robotics: improving plant root analysis and automating precision inoculation.

To address plant phenotyping challenges, computer vision techniques were employed to isolate Petri dishes from background noise and perform semantic segmentation of plant components like seeds, shoots, and roots. With a refined and preprocessed dataset, I developed a machine learning model capable of accurately predicting masks for different plant structures. This allowed for instance segmentation, which identified individual plants in the images and facilitated detailed measurements, such as root length and the precise localization of root tips—a critical step in assessing plant growth and traits.
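
To give a concrete feel for the measurement step, here is a simplified post-processing sketch: given a binary root mask from the segmentation model, it labels individual roots, estimates their length by skeletonization, and localizes root tips as skeleton endpoints. The library choices and thresholds are illustrative assumptions, not the exact project code.

```python
import numpy as np
from scipy.ndimage import convolve
from skimage.measure import label, regionprops
from skimage.morphology import skeletonize

def analyze_root_mask(root_mask: np.ndarray) -> list[dict]:
    """root_mask: binary (H, W) array predicted by the segmentation model."""
    results = []
    for region in regionprops(label(root_mask)):
        # Skeletonize each root so its pixel count approximates root length.
        skeleton = skeletonize(region.image)
        length_px = int(skeleton.sum())

        # Root tips are skeleton endpoints: pixels with exactly one neighbour
        # (3x3 neighbourhood sum == 2, i.e. the pixel itself plus one neighbour).
        neighbours = convolve(skeleton.astype(int), np.ones((3, 3)), mode="constant")
        tips = np.argwhere(skeleton & (neighbours == 2))

        # Shift tip coordinates from the crop back into full-image space.
        r0, c0, _, _ = region.bbox
        results.append({
            "length_px": length_px,
            "tips": [(int(r) + r0, int(c) + c0) for r, c in tips],
        })
    return results
```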

In parallel with the computer vision work, the robotics aspect of the project focused on automating the delivery of inoculants to the identified root tips. I developed and simulated a precision liquid-handling robot that used a PID controller for accurate dispensing. The robot was integrated with the computer vision system, enabling it to locate and deliver liquid to targeted areas on the Petri dish. This demonstrated how vision-based analysis can be seamlessly combined with robotic automation, allowing precise interventions in plant phenotyping experiments.
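
The control logic can be illustrated with a minimal discrete PID controller; the gains below are placeholders rather than the tuned values used on the robot.

```python
class PID:
    """Minimal discrete PID controller."""

    def __init__(self, kp: float, ki: float, kd: float, dt: float):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint: float, measurement: float) -> float:
        error = setpoint - measurement
        self.integral += error * self.dt                  # corrects steady-state offset
        derivative = (error - self.prev_error) / self.dt  # damps overshoot
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# e.g. steer the pipette tip toward a target coordinate reported by the vision system
controller = PID(kp=1.2, ki=0.05, kd=0.3, dt=0.01)
command = controller.update(setpoint=42.0, measurement=40.5)
```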

Additionally, I incorporated reinforcement learning to enhance the robot’s navigation and precision. By designing appropriate reward functions and conducting hyperparameter tuning, I ensured that the robot could autonomously navigate to the correct root tips for liquid delivery. The integration of reinforcement learning further improved the system's efficiency, as the robot learned to optimize its path and actions based on real-time feedback.
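
A simplified sketch of the kind of shaped reward used for this navigation task follows; the actual reward terms and weights were tuned during the project, so the values here are illustrative assumptions.

```python
import numpy as np

def step_reward(tip_pos, pipette_pos, prev_distance: float, dispensed: bool):
    """Return (reward, new_distance); feed new_distance back as prev_distance."""
    distance = float(np.linalg.norm(np.asarray(tip_pos) - np.asarray(pipette_pos)))
    reward = prev_distance - distance   # positive when the pipette moves closer
    reward -= 0.01                      # small per-step penalty against dawdling
    if dispensed and distance < 1.0:    # large bonus for dispensing on the root tip
        reward += 10.0
    return reward, distance
```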

Key Findings:
  • The computer vision model achieved high accuracy in isolating plant structures, particularly roots, enabling more efficient plant trait analysis.
  • The liquid handling robot demonstrated precise inoculation capabilities when integrated with vision-based localization, enhancing automation in plant phenotyping.
  • Reinforcement learning significantly improved the robot’s precision in liquid delivery, ensuring accurate targeting of plant root tips.
  • The combination of machine learning, computer vision, and robotics proved to be an effective approach for automated plant phenotyping.

The project culminated in a comprehensive technical report, which documented the methodologies used for dataset acquisition and preprocessing, and detailed the computer vision and robotics pipelines. The report included a flowchart that depicted the overall workflow and performance metrics that evaluated the accuracy of root segmentation, robot precision, and the effectiveness of reinforcement learning in task execution.

Skills Gained:
  • Robotics – Developed and simulated a robot with a PID controller for precise inoculation tasks.
  • Computer Vision – Applied segmentation techniques to isolate plant structures, enabling accurate plant root phenotyping.
  • Reinforcement Learning – Crafted reward functions and optimized RL models for autonomous robot navigation and liquid delivery.
  • Machine Learning – Built predictive models for plant segmentation and collaborated on enhancing robotics through ML integration.

The outcomes of this project demonstrated the successful synergy between computer vision and robotics, advancing automated plant phenotyping. The technical report and findings offered a valuable contribution to NPEC's goal of improving plant analysis through AI technologies.

Emotion Classification with Machine Learning

Natural Language Processing (NLP)
Machine Learning
Emotion Classification

For this group project, my team and I collaborated with Banijay, in association with Breda University of Applied Sciences, to develop an emotion classification system utilizing natural language processing (NLP) and machine learning models. The objective was to analyze video content, detecting and classifying emotions to enhance the content's emotional impact and insights.

The data preprocessing involved cleaning the text with regular expressions and normalizing it through tokenization and stemming. Word embeddings were then used to represent words as vectors for input to the machine learning models. Additional feature extraction methods, such as TF-IDF and Part-of-Speech (POS) tagging, were applied, and a custom word embedding model trained on our project-specific corpus was incorporated to improve emotion classification accuracy.
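
A condensed sketch of this preprocessing pipeline, using common NLTK and scikit-learn tooling (illustrative, not the exact project code):

```python
# requires: nltk.download("punkt")
import re
from nltk.stem import PorterStemmer
from nltk.tokenize import word_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer

stemmer = PorterStemmer()

def clean(text: str) -> str:
    text = re.sub(r"[^a-z\s]", " ", text.lower())   # strip punctuation and digits
    return " ".join(stemmer.stem(tok) for tok in word_tokenize(text))

corpus = ["I absolutely loved this episode!", "That scene made me so angry..."]
tfidf = TfidfVectorizer()
X = tfidf.fit_transform(clean(doc) for doc in corpus)   # sparse TF-IDF features
```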

We experimented with multiple models for emotion classification. Initial baselines used Naïve Bayes and Logistic Regression. We then added XGBoost and sequence models such as Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks, each contributing to an improved understanding of emotional cues in text.
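
A minimal example of the baseline setup, with stand-in data in place of the project corpus:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = ["i love this", "this makes me furious", "what a wonderful day", "i hate waiting"]
labels = ["joy", "anger", "joy", "anger"]

for clf in (MultinomialNB(), LogisticRegression(max_iter=1000)):
    model = make_pipeline(TfidfVectorizer(), clf)   # TF-IDF features -> classifier
    model.fit(texts, labels)
    print(type(clf).__name__, model.predict(["so happy right now"]))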

A robust pipeline was developed to break down video content into fragments, extract text from these fragments, and predict emotions for each segment. To ensure optimal performance, we tested transformer models using Hugging Face, selecting RoBERTa as the core model. RoBERTa was fine-tuned on the dataset and achieved high accuracy in emotion classification.
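
As an illustration, per-fragment emotion prediction with a fine-tuned RoBERTa-family checkpoint via Hugging Face looks roughly like this; the checkpoint named below is a public stand-in, not our fine-tuned model.

```python
from transformers import pipeline

# Public emotion model used here only as a stand-in for the project checkpoint.
classifier = pipeline("text-classification",
                      model="j-hartmann/emotion-english-distilroberta-base")

fragments = ["I can't believe we actually won!", "Please just leave me alone."]
for fragment in fragments:
    print(classifier(fragment))   # e.g. [{'label': 'joy', 'score': 0.98}]
```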

Comprehensive model evaluation was performed using metrics such as accuracy, precision, recall, and F1-score. Through error analysis, we identified areas for improvement, balancing performance metrics to select the most effective model. The process and results were documented in a detailed technical report, showcasing the methodologies and findings.
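
The metrics named above can be computed in a single call with scikit-learn, for example:

```python
from sklearn.metrics import classification_report

y_true = ["joy", "anger", "sadness", "joy"]   # stand-in labels
y_pred = ["joy", "anger", "joy", "joy"]
print(classification_report(y_true, y_pred))  # per-class precision/recall/F1 + accuracy
```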

Key Findings:
  • RoBERTa outperformed other models in emotion classification, achieving significant accuracy improvements after fine-tuning on the project-specific dataset.
  • Data preprocessing and feature extraction were critical in improving model performance, with POS tagging and word embeddings contributing to enhanced emotion detection.
  • The pipeline's automated process for splitting video and extracting text enabled efficient emotion classification across various video content.
  • Combining traditional algorithms with advanced transformer models provided a deeper understanding and classification of emotions in media content.

Skills Gained:
  • Transformer Models – Implemented and fine-tuned transformer models, specifically RoBERTa, for NLP tasks.
  • Performance Metrics Analysis – Evaluated models using accuracy, precision, recall, and F1-score.
  • Feature Engineering – Applied techniques such as tokenization, TF-IDF, and POS tagging for improved model performance.
  • Natural Language Processing (NLP) – Developed emotion classification models using advanced NLP techniques.
  • Model Evaluation – Conducted comprehensive model performance assessments and error analyses.

The project provided Banijay with a robust tool for analyzing emotional content in their video assets, offering actionable insights to enhance viewer engagement through AI-driven emotion classification.

Shipping Cost Prediction for Move Intermodal

Machine Learning
Logistics

Move Intermodal, an intermodal logistics provider, seeks to leverage artificial intelligence to predict shipping costs and optimize decision-making. Using their historical shipment data, an AI model will be developed to estimate costs from factors such as route, cargo type, and transport mode. The model will provide reliable cost estimates, actionable insights into cost drivers, and potential savings. By integrating with Move's existing data systems, the project aims to improve operational efficiency, enhance strategic planning, and foster a data-driven culture within the organization.
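
A sketch of the kind of model this describes: a gradient-boosted regressor over categorical route, cargo, and mode features plus numeric distance. The column names and values below are assumptions for illustration, not Move's actual data schema.

```python
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

shipments = pd.DataFrame({
    "route": ["NL-IT", "NL-IT", "BE-DE", "BE-DE"],
    "cargo_type": ["container", "bulk", "container", "bulk"],
    "transport_mode": ["rail", "road", "rail", "barge"],
    "distance_km": [1100, 1100, 450, 450],
    "cost_eur": [1450.0, 1700.0, 820.0, 640.0],
})

features = shipments.drop(columns="cost_eur")
preprocess = make_column_transformer(
    (OneHotEncoder(handle_unknown="ignore"), ["route", "cargo_type", "transport_mode"]),
    remainder="passthrough",   # keep numeric distance_km as-is
)
model = make_pipeline(preprocess, GradientBoostingRegressor())
model.fit(features, shipments["cost_eur"])
print(model.predict(features.head(1)))   # estimated cost for the first shipment
```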

Contact

Do you have any questions or would you like to discuss a project? Feel free to contact me using the form below. I will get back to you as soon as possible.
