Alexandra Soroka, Developer in Saint Petersburg, Russia

Alexandra Soroka

Verified Expert in Engineering

Machine Learning Developer

Location
Saint Petersburg, Russia
Toptal Member Since
May 22, 2019

Alexandra's always wanted to work with mathematics and language, so she sought out an education that combined programming and linguistics—and has been quite successful. She has about five years of NLP experience at Yandex, a major Russian search engine, and two years at smaller companies plus freelancing. Most of her projects involve entity recognition, but she’s also ranked search results, generated query expansions, and done text summarization.

Availability

Part-time

Preferred Environment

Git, Linux

The most amazing...

...thing I’ve built was a named-entity-recognition component for financial text in English and Dutch: I clarified the requirements, built the datasets, and developed the model.

Work Experience

Senior Software Engineer

2019 - PRESENT
Huawei
  • Helped design a web search system from scratch, including ranking and quality evaluation systems and dataset-building processes.
Technologies: Web Search

Data Scientist

2018 - 2019
Itexus
  • Created a working named-entity-recognition component for English and Dutch from scratch.
  • Clarified the requirements.
  • Built the dataset.
  • Developed the neural network and wrapped it in a library that applies it to various financial texts.
Technologies: Deep Learning, Gensim, Python, Keras

Chief Data Scientist

2018 - 2018
Econophysica
  • Worked on a short-term (two months) project where I extracted oil field attributes from geological reports.
Technologies: Python, Keras

Researcher | Software Developer

2012 - 2017
Yandex
  • Enhanced and developed a named-entity-recognition system for search queries in the linguistics department.
  • Optimized the search result ranking in the ranking, relevance, and linguistics department.
  • Maintained a query expansions generation system.
  • Took part in structuring information for a knowledge graph in the web ontologies department.
Technologies: Linux, MapReduce, Random Forests, Python, C++

Software Developer Intern

2011 - 2012
Yandex
  • Enhanced the performance of a named-entity-recognition system for search queries.
Technologies: MapReduce, Random Forests, Python, C++

A NER Component for Financial Text

I developed, from scratch, a component that recognizes finance-related named entities in text. I did everything from clarifying requirements with the client to building the model and the class that applies it to documents, in both English and Dutch.

A Text Summarization Project

This started as a freelance text summarization project. It involved automatically summarizing English classics of all lengths, from long EPUBs to short ones, while preserving valuable information. The core of the system was a version of TextRank, which made it extractive and unsupervised.
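
For readers unfamiliar with TextRank, here is a minimal sketch of extractive, unsupervised summarization in that style. It is not the project's actual code: the function names (`split_sentences`, `tokenize`, `similarity`, `textrank_summary`), the naive regex sentence splitter, and the parameter defaults are all assumptions made for illustration. Sentences are nodes in a graph, edges are weighted by word overlap, and a PageRank-style iteration scores each sentence.

```python
import math
import re


def split_sentences(text):
    # Naive splitter (an assumption): break on whitespace after ., ! or ?
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased set of unique words; a simplification of real tokenization.
    return set(re.findall(r"[a-z']+", sentence.lower()))


def similarity(a, b):
    # Shared words normalised by log sentence lengths (TextRank-style overlap).
    if len(a) < 2 or len(b) < 2:
        return 0.0
    return len(a & b) / (math.log(len(a)) + math.log(len(b)))


def textrank_summary(text, num_sentences=2, damping=0.85, iterations=50):
    sentences = split_sentences(text)
    n = len(sentences)
    if n <= num_sentences:
        return sentences

    tokens = [tokenize(s) for s in sentences]
    # Symmetric weight matrix with no self-loops.
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sums = [sum(row) for row in weights]

    # Weighted PageRank by fixed-count power iteration.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                weights[j][i] / out_sums[j] * scores[j]
                for j in range(n)
                if out_sums[j] > 0
            )
            for i in range(n)
        ]

    # Keep the top-scoring sentences, restored to their original order.
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:num_sentences]
    return [sentences[i] for i in sorted(top)]
```

Because the method only selects existing sentences and needs no labeled data, it scales from short texts to book-length input by changing `num_sentences`; a production system would likely swap in a proper sentence tokenizer and stop-word handling.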

Education

2007 - 2012

Specialist's Degree (Equivalent to a Master's Degree) in Computer Science

Russian State University for the Humanities - Moscow, Russia

Libraries/APIs

Keras, Natural Language Toolkit (NLTK), Beautiful Soup

Tools

Git, Gensim

Platforms

Linux

Languages

C++, Python 3, Python

Paradigms

MapReduce

Other

Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Machine Learning, Random Forests, Web Search, Deep Learning
