Chris Wagner, Web Scraping Developer in Columbus, United States

Member since January 21, 2015
Chris has been writing code for more than 20 years. He's spent the last 15 years on applications ranging from search and data aggregation (scraping) to online scheduling and email marketing. He prefers to work in a type-safe, functional language like Scala or Haskell, but he's no stranger to Ruby, JavaScript, Python, and PHP. Eagerly waiting for Bitcoin to change the world, Chris is particularly interested in crypto projects.
Chris is now available for hire

Portfolio

  • Go-UPC
    Scala, Java, JVM, PostgreSQL, API Design, NGINX, Google Ads, JavaScript, HTML...
  • Spendabit
    PostgreSQL, Scalatra, Scala, API Development, Web Scraping, Data Scraping...
  • Salon Lofts
    Amazon Web Services (AWS), AWS, Elasticsearch, MySQL, RSpec...

Experience

Location

Columbus, United States

Availability

Full-time

Preferred Environment

Git, Linux, Scala, Java, Ruby, Agile, Test-driven Development (TDD), Continuous Development (CD), IntelliJ, Python

The most amazing...

...thing I've developed recently is a search engine for things you can buy with bitcoin.

Employment

  • Lead Software Engineer

    2021 - PRESENT
    Go-UPC
    • Developed an API that matches UPC and EAN numbers (i.e., barcodes) to a product's name, image, description, and other product information.
    • Aggregated product information from over 15,000 unique websites by large-scale web crawling and a proprietary AI product-detection algorithm.
    • Engineered a mechanism for locating products on the fly when the system could not find a matching product in its local database.
    • Designed and co-developed a complex product distillation algorithm to select the most relevant and highest quality product information from various sources.
    Technologies: Scala, Java, JVM, PostgreSQL, API Design, NGINX, Google Ads, JavaScript, HTML, CSS, Test-driven Development (TDD), Git, API Development, Data Scraping, HTML5, SQL, HTTP, IntelliJ IDEA, User Interface (UI), User Experience (UX), Software Development, REST APIs, Web Scraping
  • CTO

    2014 - 2020
    Spendabit
    • Constructed a search engine with an index of millions of products from hundreds of independent data sources (online stores, mostly).
    • Developed a proprietary scraping engine that identifies and imports products from arbitrary eCommerce websites. Some call it "AI"—just point and fire.
    • Devised novel algorithms ("tricks") for improving search-result relevance beyond standard keyword matching methods.
    • Managed a test suite covering hundreds of use cases and edge cases across search, scraping, and basic app functionality.
    • Designed a responsive user interface with HTML5, CSS, Bootstrap, and a sprinkling of JavaScript.
    • Designed and implemented an API to enable third-party integrations with the search back end.
    Technologies: PostgreSQL, Scalatra, Scala, API Development, Web Scraping, Data Scraping, Search, Java, JVM, Search Engines, SEO, JavaScript, HTML, CSS, Test-driven Development (TDD), Bitcoin, Cryptocurrency, ScalaTest, Git, API Design, HTML5, SQL, HTTP, IntelliJ IDEA, SBT, User Interface (UI), User Experience (UX), Software Development, REST APIs
  • Senior Software Engineer

    2015 - 2019
    Salon Lofts
    • Supported and extended an online-scheduling system used by thousands of stylists across the United States.
    • Dramatically expanded the application's test suite to include hundreds of "full-stack" test cases (tests running against a real web browser, interacting with the full application more or less as a real user would) and thousands of unit tests.
    • Developed a "waiting list" system that attempts to autofill schedule openings by interacting with clients via SMS when an opening arises (using a bit of "natural language processing" to read client text messages, etc.).
    • Built the interface and API integration to permit stylists to require a credit/debit card when booking an appointment.
    • Implemented a feature to allow stylists to enforce cancellation policies (requiring clients to provide a credit card when booking).
    Technologies: Amazon Web Services (AWS), Elasticsearch, MySQL, RSpec, Ruby on Rails (RoR), Ruby, Darcs, Git, RubyMine, Buildkite, Travis CI, Natural Language Processing (NLP), Scheduling, Credit Card Processing, Stripe, JavaScript, HTML, CSS, Test-driven Development (TDD), Amazon S3 (AWS S3), Software Development
  • Lead Developer

    2014 - 2015
    FreshAddress, LLC
    • Supported a second-hand Rails application with limited documentation.
    • Extended the Rails app to add new features ranging from enhanced search functionality to user interface improvements.
    • Reverse-engineered many app components for lack of solid documentation.
    Technologies: Sphinx Search Engine, MySQL, Ruby on Rails (RoR), JavaScript, HTML, CSS, Git, Software Development
  • Chief Programmer and Technologist

    2009 - 2014
    DownsizeDC.org & Zero Aggression Project
    • Designed and implemented the "Educate the Powerful" System, a tool enabling constituents of the U.S. Congress to quickly and efficiently contact their representatives via automation of congressional web forms.
    • Developed a polling and petitioning application for ZeroAggressionProject.org using a Scala and Scalatra microframework.
    • Maintained several websites across several servers, including individual and shared test suites, to keep them ticking smoothly.
    • Developed custom mailing-list management software (as a replacement for the rickety phpList open-source tool) used to send regular mailings to a 30,000-subscriber mailing list, coded in Scala and leveraging the JavaMail stack.
    • Provided counsel on all of the organization's technical matters.
    • Delivered technical support to the organization's user base and the team.
    Technologies: Linux, Apache Tomcat, MySQL, PostgreSQL, PHP, Java, Scala, JavaScript, HTML, CSS, SQL, IntelliJ IDEA, Software Development, Mailchimp
  • Senior Developer

    2008 - 2009
    Pubget
    • Held one of two development roles in getting the Pubget bio-science search engine off the ground.
    • Led the development of a powerful screen-scraping engine, enabling Pubget to aggregate articles from thousands of scientific journals and provide access to the latest journal articles days or weeks ahead of the then industry-standard PubMed.
    • Leveraged a powerful Solr search server, distributed across many physical machines, to provide split-second search across the content of millions of scientific articles.
    • Experienced a startup company going from the "out-of-the-garage" phase to becoming a "real" company with an office, proper staff, and the things that go along with that.
    • Spearheaded the company's automated testing efforts (unit tests, etc.).
    Technologies: Apache Solr, MySQL, Ruby on Rails (RoR), JavaScript, HTML, CSS, Data Scraping, Software Development, MongoDB, Django, Web Scraping
  • Lead Developer

    2005 - 2007
    SCRIP-SAFE
    • Developed a Java/J2EE-based web application to enable educational institutions to securely exchange confidential documents (namely, student transcripts).
    • Leveraged many agile (compared to other Java technologies, anyway) and open-source technologies, including Hibernate, Spring, and Lucene.
    • Worked with non-technical people within the organization and at institutions to understand the requirements and needs of those in the industry.
    Technologies: Linux, Apache Lucene, Spring, Hibernate, Apache Struts, Java, JavaScript, HTML, CSS, Software Development

Experience

  • Spendabit
    https://spendabit.co

    Spendabit is a search engine for things you can buy with bitcoin. It searches across and aggregates products from thousands of merchants across hundreds of data sources (eCommerce websites, CSV data feeds, etc.).

    Spendabit is a Scala-based application leveraging PostgreSQL on the back end for general data storage and search (using the PostgreSQL Full Text Search extensions) and several Java and Scala libraries (such as Slick for database access). It leverages state-of-the-art "scraping" (web crawling) technology to import, and keep up-to-date, the majority of its products.

  • DownsizeDC.org
    https://downsizedc.org/

    DownsizeDC.org is an educational venue as well as a platform for (primarily US-based) people to advocate the case for liberty on many issues.

    The most significant component is the "Educate the Powerful" system that presents a straightforward user interface atop a powerful back end that automates the web-based contact forms of members of the US Congress, using a large set of rules and heuristics. It handles several hundred contact forms without the need for website-specific rules or coding and even supports captchas and other similar roadblocks.

    (Unfortunately, the "Educate the Powerful" system is no longer live.)

  • Zero Aggression Project

    The polling and petitioning application at Zeroaggressionproject.org aims to both recruit people to the position of the zero aggression principle (non-aggression principle) and offer a platform for recruitment and advocacy to existing adherents to the principle.

    The application presents potential respondents with a series of panels positing facts, questions, hypothetical scenarios, and general arguments regarding the particular campaign (for example, the drug prohibition or immigration campaign). Responses are recorded in a database and re-presented to users in a visual fashion.

    The back end is written in Scala, leveraging the Scalatra micro-framework, with PostgreSQL as the data store. The front end makes use of Bootstrap and jQuery.

  • BitcoinChipin.com

    BitcoinChipin.com is a pet project but presents a clean, elegant user interface. It is a simple tool that provides users with a graphical widget, embeddable on their own website or blog, which they can use to encourage people to donate, via bitcoin, to a particular cause (e.g., "Help us feed the homeless").

    BitcoinChipin.com is mostly PHP-based, with MySQL for the database and Bootstrap and jQuery for UI components. It uses the blockchain.info API for interacting with the bitcoin network (retrieving balances).

Skills

  • Languages

    PHP, HTML5, Scala, Python, Java, Ruby, SQL, HTML, JavaScript, CSS
  • Paradigms

    Agile Software Development, Test-driven Development (TDD), Unit Testing, Continuous Development (CD), Automated Testing
  • Platforms

    Linux, JVM, Amazon Web Services (AWS), Buildkite
  • Other

    HTTP, Agile Software Testing, Cryptocurrency, Web Scraping, Data Scraping, Software Development, Bitcoin, User Interface (UI), User Experience (UX), API Design, APIs, Search, Search Engines, Darcs, Scheduling, Credit Card Processing
  • Frameworks

    Bootstrap, Scalatra, Ruby on Rails (RoR), Apache Struts, Hibernate, Spring, Play Framework, Flask, Django
  • Libraries/APIs

    API Development, REST APIs, Apache Lucene, jQuery, Stripe
  • Tools

    SBT, ScalaTest, IntelliJ IDEA, Git, Apache Tomcat, RubyMine, RSpec, Mailchimp, Apache Solr, NGINX, Travis CI
  • Storage

    MySQL, PostgreSQL, Amazon S3 (AWS S3), Elasticsearch, MongoDB, Sphinx Search Engine

Education

  • Bachelor of Engineering Degree in Computer Science
    2001 - 2005
    Ohio University - Athens, Ohio
