Dan Lecocq

Seattle, WA, United States
Member since May 6, 2014
Dan is an engineer and cowboy coder with a background in big data and distributed systems. He has extensive experience with profiling, optimization, asynchronous network I/O, and getting huge amounts of work pushed through a pipeline reliably and efficiently.
Portfolio
  • Moz
    Python, C++, Ruby, Java, Elasticsearch, HBase, qless, NSQ, gevent
  • IBM Research
    Python, WebSockets, C++
Experience
  • Distributed Programming, 7 years
  • Python, 5 years
  • Concurrent Programming, 4 years
  • C++, 3 years
  • Test-driven Development (TDD), 2 years
  • HBase, 1 year
Availability
Part-time
Preferred Environment
Linux, Git, Python, C++, Ruby, JavaScript
The most amazing...
...thing I've coded is a system that crawls and indexes hundreds of millions of tweeted URLs, each within 10 minutes of being tweeted.
Employment
  • Senior Software Engineer
    Moz
    2011 - PRESENT
    • Rewrote a service that recursively crawls customer sites, then analyzes and reports SEO issues.
    • Wrote a queueing system (qless) that has been widely adopted both internally and externally for production systems.
    • Designed and implemented a service for crawling and indexing pages discovered through important RSS feeds.
    • Helped to implement an algorithm to remove navigation, headers, and footers from web content for the purposes of indexing (eventually published).
    • Wrote a number of web crawlers for different purposes, contributing many well-used open source projects to the state of the art of web crawling along the way.
    • Crawled and processed tens of billions of pages across all my various crawlers.
    • Worked to support our next generation of backlinks indexing infrastructure.
    Technologies: Python, C++, Ruby, Java, Elasticsearch, HBase, qless, NSQ, gevent
  • Graduate Researcher
    IBM Research
    2010 - 2010
    • Worked on a collaboration between KAUST's supercomputing department and IBM Research.
    • Augmented a computational steering library to work with WebSockets.
    • Included work with Lawrence Berkeley National Lab to eventually support streaming visualization.
    • Targeted KAUST's supercomputing infrastructure, an IBM BlueGene/P.
    • Worked to enable researchers to examine, monitor, and update parameters of running simulations.
    Technologies: Python, WebSockets, C++
Experience
  • Shovel (Development)
    https://github.com/seomoz/shovel

    Simple command-line dispatch of Python functions. Users find themselves regularly wanting to invoke small, simple Python functions from the command line, so I wrote what has become one of Moz's most popular repos.

  • qless (Development)
    https://github.com/seomoz/qless

    A rich queueing system for Redis, used for production services both at Moz and elsewhere. It uses Redis's Lua script support to implement complex atomic operations for queueing. It consists of a Lua core (https://github.com/seomoz/qless-core) with Ruby (https://github.com/seomoz/qless) and Python (https://github.com/seomoz/qless-py) bindings.

  • simhash-py (Development)
    https://github.com/seomoz/simhash-py

    Fast simhash in Python. It supports maintaining and finding near-duplicates in a set of documents with extreme speed. It consists of our underlying library simhash-cpp (https://github.com/seomoz/simhash-cpp) and the surrounding Python bindings.

  • pyreBloom (Development)
    https://github.com/seomoz/pyreBloom

    Extremely fast bloom filter manipulations in a Redis instance. Redis itself handles all the persistence, while this library provides a highly efficient Python C extension.

  • dragnet (Development)
    https://github.com/seomoz/dragnet

    Web page content extraction. This is the implementation supporting some published work (http://dl.acm.org/citation.cfm?id=2487828) where we separate the main content of web page articles and blog posts from the other components (navigation, headers, footers, etc.).

Skills
  • Languages
    Python, C++, JavaScript, Ruby, Lua
  • Paradigms
    Concurrent Programming, Distributed Programming, Test-driven Development (TDD)
  • Platforms
    Linux
  • Storage
    Redis, AWS S3, HBase, MySQL
  • Other
    Elasticsearch, Open Source
  • Libraries/APIs
    jQuery
Education
  • Master's degree in Applied Mathematics and Computational Science
    King Abdullah University of Science and Technology - Thuwal, Saudi Arabia
    2009 - 2010
  • Bachelor's degree in Computer Science
    Colorado School of Mines - Golden, CO
    2004 - 2009