George McIntire, Developer in Berkeley, CA, United States

George McIntire

Verified Expert in Engineering

Bio

George is a results-driven data scientist with a diverse skill set and broad experience. His expertise lies in data translation, backed by a versatile toolkit that includes Python, SQL, and machine learning. George excels at distilling complex findings into actionable, comprehensible insights that make data accessible and impactful for stakeholders.

Portfolio

Self-employed
Large Language Models (LLMs), Data Analysis, Amazon Web Services (AWS)...
Twitter
Python 3, BigQuery, Natural Language Processing (NLP), Social Network Analysis...
Callisto Media
Jupyter Notebook, Plotly, Pandas, Scikit-learn, Python, Data Mining, NumPy...

Experience

Availability

Part-time

Preferred Environment

Jupyter Notebook, Python 3, Google BigQuery, GitHub, Amazon SageMaker, Amazon Web Services (AWS), SQL

The most amazing...

...solution I've created is a text classification and quote extraction pipeline for a nonprofit focused on analyzing how the media quotes men as opposed to women.

Work Experience

AI & Data Science Consultant

2022 - PRESENT
Self-employed
  • Worked as the lead LLM developer for an app that uses a ChatGPT-powered LLM to generate custom infographics. I created a vector database for retrieval-augmented generation (RAG), and used Amazon DynamoDB to store user conversation history.
  • Consulted for a project that explores and tests data science methodologies for use in the legal profession. I fine-tuned and adapted ChatGPT to extract and analyze relevant information from case opinions to aid lawyers.
  • Leveraged unsupervised learning and sentence embeddings to analyze the survey results of healthcare workers.
  • Created an MVP of an AI recipe recommendation app. Used role-and-goal and few-shot prompting for prompt engineering with GPT-4, and built a RAG database with Qdrant populated with recipes and ingredient nutrients.
Technologies: Large Language Models (LLMs), Data Analysis, Amazon Web Services (AWS), Web Scraping, Social Network Analysis, Natural Language Processing (NLP), Amazon DynamoDB, Machine Learning, Python, Predictive Modeling, Artificial Intelligence (AI), Data Science, Deep Learning, Neural Networks, TensorFlow, Clustering, Data Cleansing, ChatGPT, OpenAI, OpenAI GPT-4 API, Pandas, Scikit-learn, Plotly, PyTorch, Data Mining, Data Scraping, NumPy, Data Analytics, Streamlit, Generative Pre-trained Transformers (GPT), LangChain, SQL, SpaCy, Supervised Learning, Legal, Labeling, Transformers, OpenAI GPT-3 API, Prompt Engineering, Retrieval-augmented Generation (RAG), Hugging Face, MLflow, OpenAI API, ChatGPT Prompts

Data Scientist / ML Engineer

2021 - 2021
Twitter
  • Conducted exploratory data analysis using BigQuery and named entity recognition on millions of tweets reported by users for perceived terms of service violations.
  • Used NetworkX and Neo4j to detect networks of users who coordinated their reports to maliciously target other users for banning.
  • Built an interactive dashboard of the results in Looker Studio, which Twitter's health data science team used to decide how to allocate resources for combating malicious behavior.
Technologies: Python 3, BigQuery, Natural Language Processing (NLP), Social Network Analysis, Data Analysis, Neo4j, Pandas, Scikit-learn, Plotly, Python, Data Mining, NumPy, PostgreSQL, Google BigQuery, Dashboards, Data Analytics, SQL, SpaCy, Supervised Learning

Data Visualization Analyst

2018 - 2020
Callisto Media
  • Conducted data research projects using exploratory data analysis, statistical analysis, and machine learning.
  • Partnered with the marketing team to build an interactive dashboard using Plotly's Dash tool, allowing them to visualize important KPIs for various campaigns easily.
  • Designed a word-similarity mechanism that outputs a score that evaluates how similar two Amazon key phrases are to one another, helping the company evaluate the Amazon book market to decide which types of books to publish.
  • Used Word2Vec to automate a process that matches Amazon search terms with their appropriate categories designated by the Callisto taxonomy—a vital project to the company, given its reliance on Amazon search data.
Technologies: Jupyter Notebook, Plotly, Pandas, Scikit-learn, Python, Data Mining, NumPy, Dash, Dashboards, Data Analytics, Data Visualization, Regression Modeling, SpaCy, Supervised Learning

DataJockey

https://github.com/GeorgeMcIntire/DataJockey
DataJockey is a passion project that combines my two professions: data science and DJing. This project applies my data science expertise to analyzing my song collection. With machine learning's increasing ability to process, synthesize, and even generate music, I became inspired to dive in and see if big data algorithms could help me better understand my musical oeuvre and optimize my routine DJ activities.

Protect Nil LLM

I was the principal large language model (LLM) developer for an application that uses a ChatGPT-powered model to craft personalized infographics. I gave the model memory capabilities by implementing retrieval-augmented generation (RAG) with a vector database and by storing user conversation history in DynamoDB. My primary toolkit for LLM development was LangChain, and I improved model performance through prompt engineering. I also led development of the full-stack pipeline: a Streamlit front end, Amazon RDS for storing user information, and Amazon EC2 for deployment. This project reflects my experience in both LLM development and full-stack application deployment.

Gender Representation and Opinion Detection in the Media

https://www.ischool.berkeley.edu/projects/2022/gender-representation-and-opinion-detection-media
We built a front-end dashboard to help the organization visualize gender representation in news articles across its various issue areas. This builds on previous work such as Informed Opinion’s Gender Gap Tracker and the Global Media Monitoring Project’s Who Makes the News Report. As part of this project, we productionized the models, built a data pipeline, performed usability testing, and documented and handed off our work to the organization.

My role was to train a subjectivity text classification model and mine patterns in the quotes extracted from the articles dataset.

Education

2020 - 2022

Master's Degree in Information Systems

UC Berkeley School of Information - Berkeley, CA, USA

2007 - 2011

Bachelor's Degree in Economics

Occidental College - Los Angeles, CA, USA

Libraries/APIs

Pandas, Scikit-learn, NumPy, PyTorch, TensorFlow, SpaCy, OpenAI API

Tools

ChatGPT, Plotly, GitHub, Amazon SageMaker, BigQuery

Languages

SQL, Python, Python 3, R

Frameworks

Streamlit

Storage

Databases, PostgreSQL, Amazon DynamoDB, Amazon S3 (AWS S3), Neo4j

Platforms

Jupyter Notebook, Amazon Web Services (AWS)

Other

Machine Learning, Natural Language Processing (NLP), Data Analysis, Web Scraping, Data Visualization, Artificial Intelligence (AI), Data Science, Prompt Engineering, Clustering, Data Cleansing, OpenAI, Data Mining, Data Scraping, Dashboards, Data Analytics, Supervised Learning, ChatGPT Prompts, English, Google BigQuery, Writing & Editing, Social Network Analysis, Causal Inference, Surveying, Large Language Models (LLMs), Predictive Modeling, Deep Learning, Neural Networks, Generative Pre-trained Transformers (GPT), OpenAI GPT-4 API, Chatbots, Dash, LangChain, Pinecone, Regression Modeling, Text Recognition, Labeling, Blogging, Technical Writing, Content Writing, Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNN), Transformers, OpenAI GPT-3 API, Retrieval-augmented Generation (RAG), Churn Analysis, Hugging Face, MLflow, Critical Thinking, Research, Information Systems, Economics, Amazon RDS, Legal, Text Classification
