Noah Diewald, Developer in Columbus, OH, United States

Noah Diewald

Verified Expert in Engineering

Lean Developer

Columbus, OH, United States

Toptal member since October 9, 2024

Bio

Noah is a linguist and developer with a strong background in modeling linguistic theories using dependently typed programming languages and theorem provers such as Coq. As a full-stack web developer, he is proud of the research and educational tools he has developed, including successful collaborations with experts in ML that resulted in tools and published papers. Noah enjoys working with Elm and Haskell and has experience with JavaScript, Ruby, Erlang, and other popular languages.

Portfolio

Ohio State University, Department of Linguistics
Coq, Haskell, PostgREST, PostgreSQL, Elm, JavaScript, CouchDB
University of Wisconsin Language Sciences
JavaScript, Erlang, Ruby, Haskell, CouchDB, Lexicography, PostgreSQL...

Experience

  • Lexicography - 20 years
  • Linguistics - 20 years
  • Dependent Type Theory - 10 years
  • Computational Linguistics - 10 years
  • Formal Methods - 10 years
  • Elm - 8 years
  • Haskell - 5 years
  • Coq - 5 years

Availability

Part-time

Preferred Environment

Emacs, Linux, Coq, Haskell, Elm, PostgreSQL, CouchDB

The most amazing...

...experience I've had is gaining an insight into the organization of grammar through discrete mathematical analysis.

Work Experience

PhD Candidate

2014 - PRESENT
Ohio State University, Department of Linguistics
  • Developed a formal theory of the co-variation of form and meaning of words for my PhD thesis, utilizing the theorem prover Coq to validate analysis and models, resulting in superior analyses of polysemy and free variation in natural language.
  • Created the first local web app for linguistic fieldwork, with a front end in Elm and PouchDB/CouchDB for synchronization, allowing me to collect data in remote Amazonian villages and later sync it to a central server.
  • Developed software for rapid tagging of corpora for low-resource languages on a team with ML researchers, which relied on analogical relationships between words rather than segmentation into morphemes. Papers were well received at the ACL workshop.
  • Built software for lexical databases and corpora using PostgreSQL and various web technologies that resulted in three electronic and print dictionaries and served as the basis for research and education programs.
  • Taught undergraduate courses in linguistics for eight years, both in person and online, designing courses in Native American linguistics and general linguistics. I received consistently positive reviews, and students achieved good outcomes.
  • Applied for grants for linguistic research, which allowed for two years of linguistic fieldwork in the Ecuadorian Amazon, resulting in substantial data collection and a corpus for a language with little previous scientific description.
  • Presented at academic conferences, providing evidence on the importance of discourse in determining word meaning, the abstractions needed to understand polysemy, and the necessary formal mechanisms to model non-determinism in inflectional systems.
  • Headed a team of researchers and native speakers of an indigenous language in Puyo, Ecuador, overcoming cultural and linguistic differences to produce a large body of linguistic data.
  • Designed experiment-like linguistic diagnostics to investigate anaphoric properties and other discourse dependencies of word forms, providing more precise measures of anaphoricity than are commonly found in the linguistic literature.
Technologies: Coq, Haskell, PostgREST, PostgreSQL, Elm, JavaScript, CouchDB

Senior Systems Programmer

2010 - 2017
University of Wisconsin Language Sciences
  • Developed a web-based lexicography system for a Native American language using PostgreSQL and Ruby on Rails. The system resulted in the publication of print and electronic dictionaries for both children and adults.
  • Developed a web-based lexicography tool in Erlang and JavaScript with a CouchDB back end for documenting a Native American language, for which multiple organizations wanted independent databases that could selectively synchronize.
  • Typeset print dictionaries in LaTeX that were considered both easy to navigate and beautiful.
  • Collaborated with field researchers, Native American community members, and tribal officials effectively.
  • Imported data from linguistic applications such as SIL Toolbox, ELAN, and others into PostgreSQL and CouchDB.
Technologies: JavaScript, Erlang, Ruby, Haskell, CouchDB, Lexicography, PostgreSQL, Ruby on Rails 3

Experience

A Word-and-paradigm Workflow for Fieldwork Annotation

https://aclanthology.org/2022.computel-1.20/
There are many challenges in morphological fieldwork annotation. It heavily relies on segmentation and feature labeling (which have both practical and theoretical drawbacks), it’s time-intensive, and the annotator needs to be linguistically trained and may still annotate things inconsistently.

I collaborated on developing a workflow that relies on unsupervised and active learning grounded in word-and-paradigm morphology (WP). ML has the potential to greatly accelerate the annotation process and allow a human annotator to focus on problematic cases. At the same time, the WP approach makes for an annotation system that is word-based and relational, removing the need to make decisions about feature labeling and segmentation early in the process and allowing speakers of the language of interest to participate more actively since linguistic training is not necessary.

This is a proof of concept for the first step of the workflow. In a realistic fieldwork setting, annotators can process hundreds of forms per hour.

Wao Tededo Documentation Database

https://wao.tededo.org/
This site contains a database of information gathered during my field research on Wao Tededo. Wao Tededo, also spelled Wao Terero, is a language in the Ecuadorian Amazon. The database is currently a work in progress and cannot be viewed by the general public. Eventually, in the interest of open science and a commitment to making the data collected freely available to the Wao community, the majority of its contents will be viewable by the general public, following review by Wao speakers.

Morphological Examples

https://github.com/noahdiewald/morphology-examples
The files in this repository are mostly more fleshed-out examples from academic talks. They still need to be completed but provide details otherwise missing in overviews. In addition, the repository includes my current work for my thesis, "Wao Terero Lexical Suffixes," which contains my most current theoretical ideas and most advanced formal model.

Education

2014 - 2019

Master's Degree in Linguistics

Ohio State University - Columbus, Ohio, USA

Skills

Libraries/APIs

PostgREST

Tools

Emacs

Languages

Elm, JavaScript, Coq, Haskell, Erlang, Ruby, Python, Agda

Platforms

Linux

Storage

PostgreSQL, CouchDB, PouchDB

Frameworks

Ruby on Rails 3

Other

Linguistics, Field Research, University Teaching, Formal Methods, Semantics, HTTP, Morphology, Lexicography, Grant Proposals, Computational Linguistics, Dependent Type Theory, Natural Language Processing (NLP), Lean
