Jay Shah
Data Scientist with applied AI skills, dedicated to delivering impactful and scalable solutions.
About
With over 6 years of experience, I specialize in AI for renewable energy and Large Language Models (LLMs). I excel in predictive maintenance for energy systems and full-stack data science. I also contribute to DataKind's social impact projects, applying AI for societal benefit.
Work Experience
Avathon, Pleasanton, California
Data Scientist III
Avathon, Sunnyvale, California
Data Scientist II
Avathon (Acquired Ensemble Energy), Palo Alto, California
Data Scientist
Avathon (Acquired Ensemble Energy), Palo Alto, California
Data Science Intern
Texas A&M University, College Station, Texas
Graduate Research Assistant
Utilities and Energy Services, College Station, Texas
Student Analyst
DataKind, San Francisco, California
Data Ambassador
Education
Texas A&M University
Gujarat State University
Skills
Projects
Pravāha - Your Local Perplexity-Inspired Search Engine
Pravāha is an AI search assistant that combines local search engine capabilities with advanced Large Language Models (LLMs), inspired by Perplexity.ai.
NueroBuddy: A Personalized Chatbot
A personalized chatbot that provides mental health support and resources to users, leveraging advanced NLP models and AI-driven analytics. The project was developed with Mistral AI and Whisper models.
StreamLens: Revolutionizing Video Content Interaction with AI
An AI-driven project aimed at transforming video content interaction, leveraging advanced analytics and machine learning. Built for the RAG-A-THON challenge organized by LlamaIndex.
Gujarati Llama - Fine-tuned Version of LLaMA on Indic Language
Developed a fine-tuned version of the LLaMA model specifically for Gujarati and other Indic languages, enhancing language understanding and generation capabilities for low-resource languages.
PowerCurve Estimation for Wind Energy Farms
Collaborated with Texas A&M University to develop models for estimating power curves of wind energy farms, enhancing efficiency and predictive maintenance.
Exploratory Data Analysis of Mercedes Green Manufacturing Challenge
A project associated with Texas A&M University focusing on analyzing the green manufacturing processes of Mercedes, aiming at improving safety and efficiency.
Portfolio Analysis on New York Cab Data
Performed comprehensive data analysis on New York cab data to uncover insights and patterns. This work was associated with Texas A&M University.
Predicting Drowsiness Related Lane Departures
A project aimed at predicting lane departures caused by drowsiness using novel feature generation techniques and convolutional neural networks, in collaboration with Texas A&M University. Achieved robust results with a confidence interval of 0.75-0.86 using the bootstrap significance test.
Customer Relationship Prediction for a Mobile Network Operator
Worked on predicting customer behavior (churn, appetency, up-selling) for Orange, comparing a wide range of classification techniques to find the highest AUC for each problem. The project focused on true positives and direct customer communication strategies.
Phase 1 Analysis of Multivariate Quality Control Data for an Industrial Forging Process
Conducted principal component analysis and applied Hotelling's T² and multivariate CUSUM (M-CUSUM) charts to multivariate data from an industrial forging process, achieving significant data reduction and cleansing. This work was associated with Texas A&M University, focusing on quality control data categorization.