Probabilistic Representations of Linguistic Meaning (PReLiM)

The workshop on Probabilistic Representations of Linguistic Meaning (PReLiM) was held in Prague on July 7-11, 2014, funded by the U.S. National Science Foundation's PIRE Program.

Workshop Description

This one-week workshop aims to gather leaders from the semantics, NLP, and cognitive science communities to consider how linguistic semantics and pragmatics might integrate with probabilistic knowledge and reasoning (about the world and about one's interlocutor).

Rationale

"Deep" natural-language understanding will eventually need more sophisticated semantic representations. What representations should the NLP community be using in 10 years? How will they figure into inference? How can we start to recover them from text or other linguistic resources?

Conversely, semanticists and pragmaticists need to model mental states and reasoning and how these relate to the linguistic form of speech acts. Is there an important role for probability distributions over semantic representations and within semantic representations?

Probabilistic representations are now standard across AI and cognitive science. Over the past 30 years, probabilistic models of language have grown beyond n-grams and collocations to incorporate more and more linguistic structure, such as lexical categories, syntactic features, selectional preferences, semantic frames, and reference. We propose that it is time to turn the same lens on semantic representations, and to integrate them with current thinking about probabilistic knowledge and reasoning.

Why Probability?

Traditionally, predicate logic has been used to encode the meaning of a reading of a sentence and to reason about its entailments. Can probability distributions enrich our understanding of the underpinnings of linguistic meaning?

Knowledge of syntax includes knowledge of the frequencies of different constructions, and this knowledge is used in both generation and comprehension. Probabilistic reasoning is arguably even more important in reasoning about the meaning of a sentence and the inferences that are intended to be drawn from it.

Probability has several roles to play:

  1. The intended meaning of a sentence may convey information about probability distributions. These describe either patterns in the world or the speaker's uncertainty about the world.

    • Possible worlds: What inferences about the world or the speaker's beliefs can one draw from modal or counterfactual statements? Can the traditional notions of "accessible" worlds and "minimal" changes be made more precise by using probability?
    • Concepts: Does a word evoke a probability distribution over prototypical entities or situations? How does prototypicality affect the interpretation of generics and indefinites? How does it compose, and how does it interact with truth conditions?
    • Vagueness: What contrast set is intended by "tall," "expensive," or "many"? Can probability help us to interpret the meaning and compositional behavior of graded predicates?

  2. Interpreting a sentence requires reasoning about what meaning was most plausibly intended (just as in statistical parsing). The space of possible meanings can be quite rich, and the reasoning can interact strongly with world knowledge.

    • Sloppiness: When and how should one accommodate presuppositions, or coerce arguments to new types? How should one construe the domain of a quantifier? What alternative worlds or situations are evoked by a modal, counterfactual, or adverb of quantification? Where is the boundary between vagueness of the intended meaning and uncertainty about the intended meaning?
    • Underspecification: For example, what is the precise relationship between the nouns in a noun-noun compound? What tripartite structure is intended by a generic? What temporal relations are implied by a sentence?

  3. The form of expression may acknowledge uncertainty in the hearer's current belief state or in the common ground between speaker and hearer.

    • Dynamic semantics: How does a speech act shift the hearer's distribution over states of the world?
    • Inferrables: When is it appropriate to mark definiteness or givenness?
    • Pragmatics: What inferences are being actively invited by the speaker, and how is this reflected in conventional form?
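As a concrete illustration of the dynamic-semantics question above, here is a minimal sketch in which the hearer's belief state is a distribution over a handful of possible worlds and an assertion is treated as simple Bayesian conditioning on its literal content. This is only one of many possible modeling choices, not a framework endorsed by the workshop. The toy worlds, the prior probabilities, and the helper names `update` and `probability` are all hypothetical.

```python
from fractions import Fraction

# Hypothetical toy worlds: (weather, state of the street), with made-up priors.
prior = {
    ("rain", "wet"): Fraction(3, 10),
    ("rain", "dry"): Fraction(0),
    ("no_rain", "wet"): Fraction(1, 10),
    ("no_rain", "dry"): Fraction(6, 10),
}

def probability(belief, proposition):
    """Total probability of the worlds in which the proposition holds."""
    return sum(p for world, p in belief.items() if proposition(world))

def update(belief, proposition):
    """Condition the belief state on a proposition (assumed literally true)."""
    mass = probability(belief, proposition)
    if mass == 0:
        raise ValueError("cannot condition on a zero-probability proposition")
    return {
        world: (p / mass if proposition(world) else Fraction(0))
        for world, p in belief.items()
    }

# Propositions are modeled as predicates on worlds.
street_is_wet = lambda world: world[1] == "wet"
raining = lambda world: world[0] == "rain"

# The speaker asserts "the street is wet"; the hearer conditions on it.
posterior = update(prior, street_is_wet)

print(probability(prior, raining))      # 3/10
print(probability(posterior, raining))  # 3/4
```

In this sketch the assertion raises the hearer's probability of rain from 3/10 to 3/4 even though rain was never mentioned. Richer accounts would also model the speaker's choice of utterance, presupposition, and the common ground, none of which plain conditioning captures.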

We will also discuss the residue of probabilistic methods. Probability may be an imperfect tool for modeling cognition. And even if mental states are essentially probability distributions, does language discuss mental states in these terms? Or is it some other, non-probabilistic "folk theory" of mind that is referenced by linguistic constructions such as evidentials, epistemic modals, and verbs of attitude and belief? Similarly, are folk theories of reasoning assumed by pragmatic conventions, causal language, or generic statements?

Emerging work in vector space semantic models also aims to address some of our questions by departing from conventional logical-form representations. However, our main focus here will be on probabilistic approaches such as distributions over possible worlds or situations.

Participants And Format

Our aim in the PReLiM workshop is to bring together leading thinkers from three communities:

  • Natural language processing. The NLP community of late has been working actively on recovering Montagovian, frame-based, or distributional representations of meaning, as well as using such representations in tasks like question answering and machine translation.
  • Linguistics. Linguists can clarify the range of semantic and pragmatic puzzles to be solved, and can challenge unwise methods. A few semanticists are already engaging with probabilistic methods.
  • Probabilistic methods in cognitive science and AI. The Bayesian modeling community has been increasingly concerned with probabilistic reasoning, probabilistic knowledge representation, and distributions over possible worlds.

This short PReLiM workshop will immediately precede the longer AMR workshop on the practical use of abstract meaning representations in machine translation. Thus, in addition to the above invitees, PReLiM will include the AMR participants, particularly students. PReLiM will give them a broader view of the long-term problems in meaning representation, and will set the stage for their shorter-term discussions on how to design the next version of AMR.

Format:

  • Background readings will be circulated in advance of the workshop.
  • The workshop itself will consist of a mix of talks, panel presentations on prearranged topics, and breakout discussions.
  • One goal of the discussions will be to try to agree on a high-level formal framework for considering issues like those above, one that is conducive to future exploration of specific phenomena and examples. Participants will be invited to bring proposals.
  • Another goal is to identify avenues for empirical progress -- e.g., collecting useful data, running experiments on humans, or constructing dialogue systems for restricted domains.
  • Discussion after the workshop can continue on a mailing list.

Team Members

Jason Eisner (organizer), Johns Hopkins University
Oren Etzioni, University of Washington, Allen Institute
Shalom Lappin, King's College London
Staffan Larsson, University of Gothenburg
Dan Lassiter, Stanford University
Percy Liang, Stanford University
David McAllester, Toyota Technical Institute
James Pustejovsky, Brandeis University
Kyle Rawlins, Johns Hopkins University
Benjamin Van Durme, Johns Hopkins University

Graduate Students

Nicholas Andrews, Johns Hopkins University
Drew Reisinger, Johns Hopkins University
Darcey Riley, Johns Hopkins University
Rachel Rudinger, Johns Hopkins University

Daily Schedule

Most of the senior participants gave talks. The abstracts and videos can be found here.

Monday

Tuesday

Wednesday

  • Morning discussion: Worlds and situations
    • Generics, quantifiers
    • Modals, conditionals and counterfactuals; “minimal change”
    • Chalktalk by Drew Reisinger on dialogue scenario
  • Morning talk: Dan Lassiter, Bayesian Pragmatics
  • Afternoon discussion: Pragmatics
    • Meta-reasoning (chalktalk by Dan Lassiter)
    • Presuppositions and implicatures
    • Game theory
  • Afternoon talk: David McAllester, The Problem of Reference

Thursday

Friday

  • Morning discussion: Remaining difficult issues, e.g.,
    • Imprecise language
    • Contradictory beliefs
    • Linguistic ambiguity about contrast sets
  • Morning talk: Martha Palmer, Designing Abstract Meaning Representations for Machine Translation
  • Afternoon discussion: Practical next steps towards semantic AI, e.g.,
    • Chalktalk by Rachel Rudinger on Stanford dependencies?
    • Chalktalk by Nick Andrews on web scraping scenario?
  • Afternoon talk: Benjamin Van Durme, Common Sense and Language