BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//128.220.36.25//NONSGML kigkonsult.se iCalcreator 2.26.9//
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-FROM-URL:https://www.clsp.jhu.edu
X-WR-TIMEZONE:America/New_York
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:STANDARD
DTSTART:20231105T020000
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
RDATE:20241103T020000
TZNAME:EST
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20240310T020000
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
RDATE:20250309T020000
TZNAME:EDT
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:ai1ec-20716@www.clsp.jhu.edu
DTSTAMP:20240328T185428Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract\nOver the last few years\, deep neural models have tak
en over the field of natural language processing (NLP)\, brandishing great
improvements on many of its sequence-level tasks. But the end-to-end natu
re of these models makes it hard to figure out whether the way they repres
ent individual words aligns with how language builds itself from the botto
m up\, or how lexical changes in register and domain can affect the untest
ed aspects of such representations.\nIn this talk\, I will present NYTWIT\
, a dataset created to challenge large language models at the lexical leve
l\, tasking them with identification of processes leading to the formation
of novel English words\, as well as with segmentation and recovery of the
specific subclass of novel blends. I will then present XRayEmb\, a method
which alleviates the hardships of processing these novelties by fitting a
character-level encoder to the existing models’ subword tokenizers\; and
conclude with a discussion of the drawbacks of current tokenizers’ vocabul
ary creation schemes.\nBiography\nYuval Pinter is a Senior Lecturer in the
Department of Computer Science at Ben-Gurion University of the Negev\, fo
cusing on natural language processing. Yuval got his PhD at the Georgia In
stitute of Technology School of Interactive Computing as a Bloomberg Data
Science PhD Fellow. Before that\, he worked as a Research Engineer at Yaho
o Labs and as a Computational Linguist at Ginger Software\, and obtained a
n MA in Linguistics and a BSc in CS and Mathematics\, both from Tel Aviv U
niversity. Yuval blogs (in Hebrew) about language matters on Dagesh Kal.
DTSTART;TZID=America/New_York:20210910T120000
DTEND;TZID=America/New_York:20210910T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD
SEQUENCE:0
SUMMARY:Yuval Pinter (Ben-Gurion University – Virtual Visit) “Challenging a
nd Adapting NLP Models to Lexical Phenomena”
URL:https://www.clsp.jhu.edu/events/yuval-pinter/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2021\,Pinter\,September
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-21487@www.clsp.jhu.edu
DTSTAMP:20240328T185428Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract\nEnormous amounts of ever-changing knowledge are avai
lable online in diverse textual styles and diverse formats. Recent advance
s in deep learning algorithms and large-scale datasets are spurring progre
ss in many Natural Language Processing (NLP) tasks\, including question an
swering. Nevertheless\, these models cannot scale up when task-annotated t
raining data are scarce. This talk presents my lab’s work toward building
general-purpose models in NLP and how to systematically evaluate them. Fir
st\, I present a general model for two known tasks of question answering i
 n English and multiple languages that is robust to small domain shifts.
 Then\, I show a meta-training approach that can solve a variety of NLP tas
 ks using only a few examples and introduce a benchmark to evaluate cr
oss-task generalization. Finally\, I discuss neuro-symbolic approaches to
address more complex tasks by eliciting knowledge from structured data and
language models.\n\nBiography\n\nHanna Hajishirzi is an Assistant Profess
or in the Paul G. Allen School of Computer Science & Engineering at the Un
iversity of Washington and a Senior Research Manager at the Allen Institut
e for AI. Her research spans different areas in NLP and AI\, focusing on d
eveloping general-purpose machine learning algorithms that can solve many
NLP tasks. Applications for these algorithms include question answering\,
representation learning\, green AI\, knowledge extraction\, and conversati
onal dialogue. Honors include the NSF CAREER Award\, Sloan Fellowship\, Al
len Distinguished Investigator Award\, Intel rising star award\, best pape
r and honorable mention awards\, and several industry research faculty awa
 rds. Hanna received her PhD from the University of Illinois and spent a ye
 ar as a postdoc at Disney Research and CMU.
DTSTART;TZID=America/New_York:20220225T120000
DTEND;TZID=America/New_York:20220225T131500
LOCATION:Ames Hall 234 - Presented Virtually Via Zoom https://wse.zoom.us/j
/96735183473
SEQUENCE:0
SUMMARY:Hanna Hajishirzi (University of Washington & Allen Institute for AI
) “Toward Robust\, Knowledge-Rich NLP”
URL:https://www.clsp.jhu.edu/events/hanna-hajishirzi-university-of-washingt
on-allen-institute-for-ai-toward-robust-knowledge-rich-nlp/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2022\,February\,Hajishirzi
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-22380@www.clsp.jhu.edu
DTSTAMP:20240328T185428Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract\nThe availability of large multilingual pre-trained la
nguage models has opened up exciting pathways for developing NLP technolog
ies for languages with scarce resources. In this talk I will advocate for
the need to go beyond the most common languages in multilingual evaluation
 \, and discuss the challenges of handling new\, unseen-during-training lan
 guages and varieties. I will also share some of my experiences working wit
 h indigenous and other endangered language communities and activists.\nBio
graphy\n\nAntonios Anastasopoulos is an Assistant Professor in Computer Sc
ience at George Mason University. In 2019\, Antonis received his PhD in Co
mputer Science from the University of Notre Dame and then worked as a post
doctoral researcher at the Language Technologies Institute at Carnegie Mel
lon University. His research interests revolve around computational lingui
stics and natural language processing with a focus on low-resource setting
 s\, endangered languages\, and cross-lingual learning.
DTSTART;TZID=America/New_York:20220930T120000
DTEND;TZID=America/New_York:20220930T131500
LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218
SEQUENCE:0
SUMMARY:Antonios Anastasopoulos (George Mason University) “NLP Beyond the T
op-100 Languages”
URL:https://www.clsp.jhu.edu/events/antonis-anastasopoulos-george-mason-uni
versity/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2022\,Anastasopoulos\,September
END:VEVENT
END:VCALENDAR