
Tal Perry
Verified Expert in Engineering
Generative Pre-trained Transformers (GPT) Developer
Tal is a Google developer with expertise in machine learning and a former NLP researcher at Citi. He is the founder and CTO of LightTag, a profitable NLP SaaS platform. His experience spans ML, ops, and human-machine interfaces. The solutions he's put in production include language-based compliance monitoring systems, high-frequency trading systems trading hundreds of millions a day, and NLP-based alternative data offerings for competitive intelligence and financial analysis.
Preferred Environment
Amazon Web Services (AWS), SQL, PostgreSQL, Docker, Django, Redux, React, TensorFlow, TypeScript, JavaScript, Python
The most amazing...
...thing I've built is a patented system for analyzing trader behavior based on the behavioral finance literature.
Work Experience
Founder, CEO, CTO
LightTag
- Created a SaaS business for NLP annotation with customers including Viasat, Microsoft, and Pitchbook.
- Deployed a language-agnostic machine learning model that correctly generates 70% of entity annotations on the platform.
- Built a multi-tenant SaaS supporting thousands of tenants while maintaining strong guarantees on tenant data isolation and low infrastructure expenses.
- Invented and implemented a patent-pending interface for drag and drop relationship annotation supporting constituency and dependency grammar.
- Conducted customer interviews and implemented findings to increase conversions, retention rates, and customer delight.
- Designed and deployed deep NLP models that can adapt to customer data without incurring significant compute costs.
Data Scientist
Citi
- Applied behavioral finance theory to create a patented system for detecting bias in credit trader behavior.
- Used rule-based and deep learning NLP to create multilingual compliance and CRM solutions for sell-side credit and rates trading.
- Reduced labor costs and turnaround time for institutional loan origination by developing ML-based document classification, routing, and extraction systems.
CTO
Superfly
- Grew the engineering team from one person to a cohesive, productive team of 12.
- Reduced turnaround time on POCs from three weeks on average to less than 48 hours by making core data assets accessible to the business side.
- Led the technological and product shift of the company from a $0 revenue consumer-facing service to a multi-million dollar alternative data provider.
- Kept infrastructure costs acceptable while our data processing scale grew 1,000x.
- Increased return on data annotation costs by developing a "human-friendly" domain-specific language for semi-structured text analytics.
- Drove data acquisition throughput by deploying a terabyte-scale Elasticsearch cluster and designing a custom interface to find "needles in haystacks."
Research Engineer
Fluent Trade Technologies
- Deployed high-frequency algorithmic trading systems capable of trading hundreds of millions in notional volume a day.
- Implemented ML algorithms with single-digit millisecond latency to maintain an edge in HFT.
- Contributed to API design, usability testing, and QA as the company expanded into HFT PaaS offerings.
- Liaised between the research team, engineering, and senior management and helped frame objectives and challenges in an accessible form to each group.
Algorithmic Trader
Self Employed
- Designed, developed, and deployed a multi-equity long/short algorithmic trading system in C++.
- Implemented a multi-exchange and multi-threaded order management system.
- Developed backtesting infrastructure and data warehousing for equities data.
Experience
LightTag - Text Annotation SaaS
http://www.lighttag.io
I built LightTag because I needed it and turned it into a profitable business through a combination of ML and UX.
YLabel - Serverless, In-Browser Full Text Search and Annotation
https://github.com/LightTag/ylabel
Dense Continuous Sentences - NLP Variational Autoencoder Using Densenet
https://github.com/talolard/DenseContinuousSentances
Article - Convolutional Methods For Text
https://medium.com/@TalPerry/convolutional-methods-for-text-d5260fd5675f
Article - How To Label Data
https://www.lighttag.io/how-to-label-data/
Introductory Course To NLP
https://github.com/LightTag/NLPCourse
RLStocks - Real Time Portfolio Rebalancing with Transaction Costs Solved with Reinforcement Learning
https://github.com/talolard/rlstocks
After a foray into modern methods, I focused on a paper from the early '90s (Learning to Trade via Direct Reinforcement by Moody) that offers a much more domain focused approach to policy gradient algorithms.
Skills
Languages
Python, SQL, TypeScript 3, JavaScript, TypeScript, C++
Frameworks
Django, Redux
Libraries/APIs
PyTorch, TensorFlow, React, Pandas, Scikit-learn
Paradigms
Data Science
Platforms
MetaTrader, MetaTrader 4, Docker, Amazon Web Services (AWS)
Other
Data Analysis, Data Analytics, Natural Language Processing (NLP), Regular Expressions, Text Mining, Deep Learning, Machine Learning, GPT, Generative Pre-trained Transformers (GPT), Statistics, FIX Protocol, Trading Applications, Forex Trading
Storage
PostgreSQL, Redis, Elasticsearch
Industry Expertise
Trading Systems
Tools
Celery
Education
Bachelor of Science Degree in Mathematics
Tel Aviv University - Tel Aviv, Israel