Ilmira Terpugova
Verified Expert in Engineering
Machine Learning Developer
Innopolis, Tatarstan, Russia
Toptal member since May 7, 2019
Ilmira is a data scientist with a strong mathematical and programming background. She is hard-working, responsible, and conscientious about meeting deadlines. Ilmira produces valuable findings while also providing thorough, concise explanations and visualizations of the results.
Preferred Environment
Git, PyCharm, Linux, MacOS
The most amazing...
...algorithm I've coded is based on a term-weighting scheme and a centroid-based classifier, and it won a Kaggle competition for classifying genetic mutations.
Work Experience
Data Scientist
ChangeDynamix, Inc.
- Implemented unsupervised user clustering on unstructured network activity data collected inside a client network, including feature construction, preprocessing, and selection; clustering algorithm selection and evaluation; and visualization.
Data Scientist
Eurecat, Centre Tecnològic de Catalunya
- Implemented an ML classification pipeline for predicting complex diseases (type II diabetes) from genomic and environmental data in a distributed environment. The pipeline included imputation of missing values, feature selection (out of 755,000 features), and analysis of feature importance.
- Created a web server for running the pipeline automatically and presenting the results.
- Improved on MLlib's implementation of chi-squared feature selection.
Software Engineer
SoftPlus CJSC
- Designed and implemented new features for a control panel for Internet-Hosting LLC.
- Integrated payment systems, social networks, domain name registrars, and an SMS sending service.
Software Engineer
Science Research Institute of Measuring Technology — Radio Systems
- Contributed to C/C++ and Java software development for landing systems.
Experience
Classifying Clinically Actionable Genetic Mutations
https://www.kaggle.com/c/msk-redefining-cancer-treatment/leaderboard
Prediction of Complex Diseases Using Genomic and Environmental Data
I implemented a distributed pipeline for a classification task: predicting complex diseases such as type II diabetes, which is believed to depend on a combination of several genes together with lifestyle and environmental factors.
Right Whale Recognition
https://arxiv.org/abs/1604.05605
Protein Classification From Primary Structures in the Context of Database Biocuration
https://upcommons.upc.edu/bitstream/handle/2117/106701/124491.pdf
Education
Master's Degree in Artificial Intelligence
Universitat Politecnica de Catalunya (UPC), Universitat de Barcelona (UB), and Universitat Rovira i Virgili (URV) - Barcelona, Spain
Specialist's Degree in Applied Mathematics
South Ural State University - Chelyabinsk
Skills
Libraries/APIs
Pandas, jQuery, Scikit-learn, Keras, TensorFlow, MLlib
Tools
PyCharm, Git, Jupyter, TensorBoard, Seaborn, Plotly
Languages
Python, Scala, Groovy, Java, JavaScript, Java 7, C++
Frameworks
Apache Spark, Flask, Grails, Bootstrap, Selenium, Qt, Qt Quick
Platforms
MacOS, Linux, JavaFX
Storage
MySQL, HDFS
Other
Data Science, Data Analysis, Data Analytics, Statistics, Machine Learning, Neural Networks, Parquet, Visualization