Seminars

Generation or transfer: this is still the question – Nianwen Xue (Brandeis University) @ Czech Republic
Jul 21 @ 2:00 pm – 3:00 pm

View Seminar Video
View Presentation Slides
Abstract
I will discuss what makes AMR abstract, comparing it with syntactic structures as annotated in treebanks. I will also discuss how similar ideas can be implemented in the alignment of parallel parse trees, drawing on our experience building a hierarchically aligned Chinese-English parallel treebank. Finally, I will speculate on the relative strengths and weaknesses of these two types of resources and the MT approaches they support.
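As a concrete illustration of this abstraction, the canonical example from Banarescu et al. (2013) represents "The boy wants to go" as:

    (w / want-01
       :ARG0 (b / boy)
       :ARG1 (g / go-01
                :ARG0 b))

Function words, tense, and word order are abstracted away, and the reused variable b captures the fact that the boy is the subject of both predicates.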

Some references:

L. Banarescu, C. Bonial, S. Cai, M. Georgescu, K. Griffitt, U. Hermjakob, K. Knight, P. Koehn, M. Palmer, and N. Schneider. 2013. Abstract Meaning Representation for Sembanking. Proceedings of the Linguistic Annotation Workshop. Sofia, Bulgaria.

Nianwen Xue, Ondrej Bojar, Jan Hajic, Martha Palmer, Zdenka Uresova and Xiuhong Zhang. 2014. Not an Interlingua, but Close: Comparison of English AMRs to Chinese and Czech. Proceedings of LREC-2014. Reykjavik, Iceland.

Dun Deng and Nianwen Xue. 2014 (to appear). Aligning Chinese-English Parallel Parse Trees: Is It Feasible? Proceedings of LAW VIII. Dublin, Ireland.

Dun Deng and Nianwen Xue. 2014 (to appear). Building a Hierarchically Aligned Parallel Chinese-English TreeBank. Proceedings of COLING-2014. Dublin, Ireland.

All Participant Lectures will be held in Room S1, 4th Floor.

Relating human perceptual data to speech representations through cognitive modeling – Naomi Feldman (University of Maryland) @ Czech Republic
Jul 22 @ 2:00 pm – 3:00 pm

View Seminar Video
View Presentation Slides
Abstract
The abstract of this talk will be announced shortly.

Biography
Naomi Feldman is an assistant professor in the Department of Linguistics and the Institute of Advanced Computer Studies at the University of Maryland, and a member of the computational linguistics and information processing lab. She works primarily on computational models of human language, using techniques from machine learning and statistics to formalize models of how people learn and process language. She received her Ph.D. in cognitive science at Brown University in 2011.

SEMANTICS Lecture (iVectors) – Lukas Burget (Brno University of Technology) @ Czech Republic
Jul 24 @ 2:00 pm – 3:00 pm

View Seminar Video
Abstract
Lukas will relate neural network (NN) approaches to generating i-vectors to the ASR team's current pursuit: eliminating unknown unknowns.
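For background (a summary of the standard setup, not necessarily the exact content of the lecture): the classical i-vector model represents the GMM mean supervector M of an utterance as

    M = m + T w

where m is the UBM mean supervector, T is a low-rank total variability matrix estimated from data, and w is the low-dimensional i-vector, obtained as the point estimate of a standard-normal latent variable. NN-based variants typically replace parts of this pipeline, for example using network posteriors in place of GMM posteriors when collecting the sufficient statistics.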

Kernels for Relational Learning from Text Pairs – Alessandro Moschitti (Qatar Computing Research Institute) @ Czech Republic
Jul 24 @ 6:00 pm – 7:00 pm

View Presentation Slides
Abstract
Linguistic relation learning is a pervasive research area in Natural Language Processing, ranging from syntactic relations captured by syntactic parsers to semantic relations, e.g., those modeled by Semantic Role Labeling, coreference resolution, and discourse structure approaches, or more directly by relation extraction systems applied to pairs of entities. Such methods typically target constituents spanning one or more sentences.

An even more challenging class concerns relational learning from pairs of entire (short) texts; capturing such relations requires jointly analyzing the relations among the constituents of both texts. Typical examples include textual entailment, paraphrasing, the correct vs. incorrect association of a question with its target answer passage, and the correct vs. incorrect pairing of a text with its translation.

Given the complexity of providing a theory that models such relations, researchers rely on machine learning methods. These models define feature vectors for training relational classifiers, typically based on several textual similarity measures computed over different representations of the two texts.

This talk will present a different approach to relational learning from text pairs, based on structural kernels: first, a structural/linguistic representation of each text is produced, e.g., using syntactic parsing or semantic role labeling. Then, semantic links between the constituents of the two texts are automatically derived, e.g., using string matching or lexical similarity. Finally, the resulting structures are processed by structural kernels, which automatically map them into feature spaces where learning algorithms can learn the target relation encoded by the data labels. The talk will show results using different representations for passage reranking in question answering systems.
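To make the idea of a structural kernel concrete, below is a minimal Python sketch of the classical Collins-Duffy subset-tree kernel, one well-known member of the kernel family discussed above (an illustration, not the speaker's implementation). It counts tree fragments common to two parses, with a decay factor lam discounting larger fragments:

    # Trees are tuples: (label, child, child, ...); a leaf is a bare string.
    def nodes(tree):
        """Yield every internal node of a tree."""
        if isinstance(tree, str):
            return
        yield tree
        for child in tree[1:]:
            yield from nodes(child)

    def production(node):
        """A node's production: its label plus the labels of its children."""
        return (node[0],) + tuple(c if isinstance(c, str) else c[0] for c in node[1:])

    def delta(n1, n2, lam=0.4):
        """Weighted count of common fragments rooted at n1 and n2."""
        if production(n1) != production(n2):
            return 0.0
        score = lam
        for c1, c2 in zip(n1[1:], n2[1:]):
            if not isinstance(c1, str):   # lexical children add nothing further
                score *= 1.0 + delta(c1, c2, lam)
        return score

    def tree_kernel(t1, t2, lam=0.4):
        """K(t1, t2): sum of delta over all node pairs from the two trees."""
        return sum(delta(n1, n2, lam) for n1 in nodes(t1) for n2 in nodes(t2))

    t1 = ("S", ("NP", ("D", "the"), ("N", "cat")), ("VP", ("V", "sleeps")))
    t2 = ("S", ("NP", ("D", "the"), ("N", "dog")), ("VP", ("V", "sleeps")))
    print(tree_kernel(t1, t2))   # shared structure yields a positive score

In the text-pair setting described above, such kernels are applied to linked structures built from both texts, so the learner can exploit relational features without manual feature engineering.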

Biography
Alessandro Moschitti is a Senior Research Scientist at the Qatar Computing Research Institute (QCRI) and a tenured professor in the Computer Science (CS) Department of the University of Trento, Italy. He obtained his PhD in CS from the University of Rome in 2003. He has been the only non-US faculty member to participate in the IBM Watson Jeopardy! challenge. He has significant expertise in both theoretical and applied machine learning for NLP, IR, and data mining, and has devised innovative kernels for advanced syntactic/semantic processing with support vector machines and other kernel-based learners. He is an author or co-author of more than 190 scientific articles spanning many areas of NLP, from Semantic Role Labeling to opinion mining. He has been an area chair for the semantics track at the ACL and IJCNLP conferences and for the machine learning track at ACL and ECML, and has served as PC chair of other important conferences and workshops in the ML and ACL communities. He is currently the General Chair of EMNLP 2014 and serves on the editorial boards of JAIR, JNLE, and JoDS. He has received three IBM Faculty Awards, one Google Faculty Award, and three best paper awards.

‘About’ attitudes – Kyle Rawlins (Johns Hopkins University) @ Czech Republic
Jul 25 @ 2:00 pm – 3:00 pm

View Seminar Video
View Presentation Slides
Abstract
A central problem in linguistic semantics is the grammar (lexical semantics, compositional semantics, syntactic behavior) of clause-embedding predicates such as 'know', 'tell', 'wonder', and 'think'. I present an investigation of this problem through the lens of their interaction with 'about'-phrases. I argue that the best account of this interaction treats these verbs as neo-Davidsonian eventuality predicates that characterize events and states with 'content' (following recent work by Kratzer, Hacquard, and others). Arguments and modifiers of attitude verbs function to characterize that content, leading to a clean separation of syntactic argument structure and event structure; 'about'-phrases in particular indirectly characterize content via a notion of aboutness adapted from work by David Lewis.
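As a rough schematic of the content-based analysis mentioned above (our gloss, not necessarily the exact formulation of the talk), 'John thinks that p' comes out as an existential claim over thinking states whose content the clause characterizes:

    ∃e [ think(e) ∧ Experiencer(e) = john ∧ CONT(e) = p ]

On this view, an 'about'-phrase does not saturate an argument slot; it further constrains CONT(e) via the aboutness relation.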

Toward more linguistically-informed translation models – Adam Lopez (Johns Hopkins University) @ Czech Republic
Jul 28 @ 2:00 pm – 3:00 pm

View Seminar Video
Abstract
Modern translation systems model translation as simple substitution and permutation of word tokens, sometimes informed by syntax. Formally, these models are probabilistic relations on regular or context-free sets, a poor fit for many of the world’s languages. Computational linguists have developed more expressive mathematical models of language that exhibit high empirical coverage of annotated language data, correctly predict a variety of important linguistic phenomena in many languages, explicitly model semantics, and can be processed with efficient algorithms. I will discuss some ways in which such models can be used in machine translation, focusing particularly on combinatory categorial grammar (CCG).
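For readers unfamiliar with CCG, here is a minimal derivation (our illustrative example): a transitive verb like 'saw' has category (S\NP)/NP, meaning it combines with an NP to its right and then with an NP to its left:

    John        saw          Mary
    NP       (S\NP)/NP        NP
             ------------------- >
                    S\NP
    ------------------------- <
               S

Because categories like (S\NP)/NP pair each syntactic combination step with a semantic function, every derivation also composes a meaning, which is part of what makes CCG attractive for translation modeling.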

Deep, Long and Wide Artificial Neural Networks in Processing of Speech – Hynek Hermansky (Johns Hopkins University) @ Czech Republic
Jul 29 @ 2:00 pm – 3:00 pm

View Seminar Video
View Presentation Slides
Abstract
Until recently, automatic speech recognition (ASR) proceeded in a single stream: from the speech signal, through a feature extraction module and a pattern classifier, into a search for the best word sequence. Features were mostly hand-crafted and represented relatively short (10-20 ms) instantaneous snapshots of the speech signal. The introduction of artificial neural nets (ANNs) into speech processing allowed for much more ambitious and more effective schemes. Today's speech features for ASR are derived from large amounts of speech data, often using complex deep neural net architectures. The talk argues for ANNs that are not only deep but also wide (i.e., processing information in multiple parallel streams) and long (i.e., extracting information from speech segments much longer than 10-20 ms). Support comes from the psychophysics and physiology of speech perception, as well as from speech data itself. The talk reviews the history of the gradual shift towards nonlinear multi-stream extraction of information from the spectral dynamics of speech and shows some advantages of this approach in ASR.

Broad-Coverage Semantic Dependency Parsing
Jul 30 all-day
Grammar Factorization by Tree Decomposition – Dan Gildea (University of Rochester) @ Czech Republic
Jul 30 @ 2:00 pm – 3:00 pm

View Seminar Video
View Presentation Slides
Abstract
We describe the application of the graph-theoretic property known as treewidth to the problem of finding efficient parsing algorithms. This method, similar to the junction tree algorithm used in graphical models for machine learning, allows automatic discovery of efficient algorithms such as the O(n^4) algorithm for bilexical grammars of Eisner and Satta (1999). We examine the complexity of applying this method to parsing algorithms for general Linear Context-Free Rewriting Systems (LCFRS). We show that any polynomial-time algorithm for this problem would imply an improved approximation algorithm for the well-studied treewidth problem on general graphs.
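To make the treewidth notion concrete, here is a small brute-force computation over elimination orders (illustrative only; it is exponential in the number of vertices, whereas the point of the abstract is precisely that computing treewidth efficiently is hard in general):

    from itertools import permutations

    def treewidth(vertices, edges):
        """Exact treewidth of a small graph: minimize, over elimination
        orders, the largest neighborhood seen when eliminating a vertex."""
        adj = {v: set() for v in vertices}
        for u, v in edges:
            adj[u].add(v)
            adj[v].add(u)
        best = len(vertices)
        for order in permutations(vertices):
            g = {v: set(ns) for v, ns in adj.items()}
            width = 0
            for v in order:
                nbrs = g.pop(v)
                width = max(width, len(nbrs))
                for u in nbrs:              # eliminate v: connect its neighbors
                    g[u].discard(v)
                    g[u] |= nbrs - {u}
            best = min(best, width)
        return best

    # A 4-cycle has treewidth 2; any tree has treewidth 1.
    print(treewidth("abcd", [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]))  # 2

In the parsing setting, each grammar rule induces a small graph over the variables indexing span endpoints, and the width of the best elimination order for that graph determines the exponent in the rule's parsing complexity.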

Synchronous Rewriting for Natural Language Processing – Giorgio Satta (University of Padua) @ Czech Republic
Sep 23 @ 12:00 pm – 9:00 pm

View Seminar Video
View Presentation Slides
2014 Frederick Jelinek Memorial Summer Workshop

Abstract
In synchronous rewriting, two or more rewriting processes, typically context-free, are carried out in a synchronous way. Synchronous rewriting systems are exploited in machine translation and in the syntax/semantics interface, as well as in parsing applications where one needs to model syntactic structures based on discontinuous phrases or on non-projective dependency trees.
In this presentation I overview some formalisms that use synchronous rewriting, including the linear context-free rewriting systems of Vijay-Shanker, Weir, and Joshi (1987), and discuss several computational problems that arise in the context of the above-mentioned applications.
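As a minimal illustration (our example, not drawn from the talk), here is a two-rule synchronous context-free grammar fragment that swaps verb-object order between source and target:

    S  -> < NP_1 VP_2 , NP_1 VP_2 >
    VP -> < V_1 NP_2  , NP_2 V_1  >

The subscripts link nonterminals across the two sides, so the two derivations proceed in lockstep: the grammar generates SVO sentences on one side and their SOV counterparts on the other. Linear context-free rewriting systems generalize this by allowing a nonterminal to span several discontinuous substrings.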