Steven Abney

Associate Professor of Linguistics at the University of Michigan, with courtesy positions in Computer Science & Engineering and the School of Information.

My area of research is computational linguistics, which encompasses language technology (machine translation, speech recognition, information extraction), digital linguistics, the language part of artificial intelligence, and computational psycholinguistics.

My career has alternated between academic linguistics departments and computer science departments in industrial research labs. To my mind, language is an intrinsically computational system, and computational linguistics is linguistics. Languages are no less complex than subatomic particles, galaxies, or living cells, and they deserve to be studied with the kind of mathematical and computational sophistication that is taken for granted in physics, astronomy, or molecular biology.

The projects I am currently working on include:

  • Language digitization - creating a multilingual corpus of aligned and analyzed text, as a digital form of language documentation & description, and as a platform to study unsupervised learning of machine translation systems.

  • Dependency parsing - learning dependency parsers for languages with nonplanar dependency graphs.

I am also interested in the following topics:

  • semisupervised learning and spectral methods
  • information extraction, especially for biomedical text
  • partial parsing and deterministic parsing
  • grammatical inference
  • conversational agents
  • spoken language systems
  • automated phonetic transcription

Cass is a partial parser that used to be available from this page. It has been in maintenance-only mode for many years. Unfortunately, it no longer compiles under current versions of gcc, and fixing the problem appears to require a fairly substantial rewrite, so a fix is unlikely to happen soon. If you want the old tarfile, it is here.