December, 2013

article thumbnail

Parsing English in 500 Lines of Python

Explosion

I wrote this blog post in 2013, describing an exciting advance in natural language understanding technology. Today, almost all high-performance parsers are using a variant of the algorithm described below (including spaCy). The original post is preserved below, with added commentary in light of recent research. A syntactic parser describes a sentence’s grammatical structure, to help another application reason about it.

Python 45
article thumbnail

Parsing English in 500 Lines of Python

Explosion

This post explains how transition-based dependency parsers work, and argues that this algorithm represents a break-through in natural language understanding. A concise sample implementation is provided, in 500 lines of Python, with no external dependencies. This post was written in 2013. In 2015 this type of parser is now increasingly dominant.

Python 40