
Noah Diewald
Verified Expert in Engineering
Lean Developer
Columbus, OH, United States
Toptal member since October 9, 2024
Noah is a linguist and developer with a strong background in modeling linguistic theories using dependently typed programming languages and theorem provers such as Coq. As a full-stack web developer, he is proud of the research and educational tools he has developed, including successful collaborations with experts in ML that resulted in tools and published papers. Noah enjoys working with Elm and Haskell and has experience with JavaScript, Ruby, Erlang, and other popular languages.
Experience
- Lexicography - 20 years
- Linguistics - 20 years
- Dependent Type Theory - 10 years
- Computational Linguistics - 10 years
- Formal Methods - 10 years
- Elm - 8 years
- Haskell - 5 years
- Coq - 5 years
Preferred Environment
Emacs, Linux, Coq, Haskell, Elm, PostgreSQL, CouchDB
The most amazing...
...experience I've had is gaining an insight into the organization of grammar through discrete mathematical analysis.
Work Experience
PhD Candidate
Ohio State University, Department of Linguistics
- Developed a formal theory of the co-variation of form and meaning of words for my PhD thesis, utilizing the theorem prover Coq to validate analysis and models, resulting in superior analyses of polysemy and free variation in natural language.
- Created the first local web app for linguistic fieldwork, with a front end in Elm and PouchDB/CouchDB for synchronization, allowing me to collect data in remote Amazonian villages and later sync it to a central server.
- Developed software for rapid tagging of corpora for low-resource languages on a team with ML researchers, which relied on analogical relationships between words rather than segmentation into morphemes. Papers were well received at the ACL workshop.
- Built software for lexical databases and corpora using PostgreSQL and various web technologies that resulted in three electronic and print dictionaries and served as the basis for research and education programs.
- Taught undergraduate courses in linguistics for eight years, both in person and online, designing courses in Native American linguistics and general linguistics. I received consistently positive reviews, and students achieved good outcomes.
- Applied for grants for linguistic research, which funded two years of linguistic fieldwork in the Ecuadorian Amazon, resulting in substantial data collection and a corpus for a language with little previous scientific description.
- Presented at academic conferences, providing evidence on the importance of discourse in determining word meaning, the abstractions needed to understand polysemy, and the necessary formal mechanisms to model non-determinism in inflectional systems.
- Headed a team of researchers and native speakers of an indigenous language in Puyo, Ecuador, overcoming cultural and linguistic differences to produce a large body of linguistic data.
- Designed experiment-like linguistic diagnostics to investigate anaphoric properties and other discourse dependencies of word forms, providing more precise measures of anaphoricity than are commonly found in the linguistic literature.
Senior Systems Programmer
University of Wisconsin Language Sciences
- Developed a web-based lexicography system for a Native American language using PostgreSQL and Ruby on Rails. The system resulted in the publication of print and electronic dictionaries for children and adults.
- Developed a web-based lexicography tool in Erlang and JavaScript with a CouchDB back end for documenting a Native American language, for which multiple organizations wanted independent databases that could selectively synchronize.
- Typeset print dictionaries in LaTeX that were considered both easy to navigate and beautiful.
- Collaborated effectively with field researchers, Native American community members, and tribal officials.
- Imported data from linguistic applications such as SIL Toolbox, ELAN, and others into PostgreSQL and CouchDB.
Experience
A Word-and-paradigm Workflow for Fieldwork Annotation
https://aclanthology.org/2022.computel-1.20/
I collaborated on developing a workflow that relies on unsupervised and active learning grounded in word-and-paradigm morphology (WP). ML has the potential to greatly accelerate the annotation process and allow a human annotator to focus on problematic cases. At the same time, the WP approach makes for an annotation system that is word-based and relational, removing the need to make decisions about feature labeling and segmentation early in the process and allowing speakers of the language of interest to participate more actively since linguistic training is not necessary.
This is a proof of concept for the first step of the workflow. In a realistic fieldwork setting, annotators can process hundreds of forms per hour.
Wao Tededo Documentation Database
https://wao.tededo.org/
Morphological Examples
https://github.com/noahdiewald/morphology-examples
Education
Master's Degree in Linguistics
Ohio State University - Columbus, Ohio, USA
Skills
Libraries/APIs
PostgREST
Tools
Emacs
Languages
Elm, JavaScript, Coq, Haskell, Erlang, Ruby, Python, Agda
Platforms
Linux
Storage
PostgreSQL, CouchDB, PouchDB
Frameworks
Ruby on Rails 3
Other
Linguistics, Field Research, University Teaching, Formal Methods, Semantics, HTTP, Morphology, Lexicography, Grant Proposals, Computational Linguistics, Dependent Type Theory, Natural Language Processing (NLP), Lean