BEGIN:VCALENDAR VERSION:2.0 PRODID:-//128.220.36.25//NONSGML kigkonsult.se iCalcreator 2.26.9// CALSCALE:GREGORIAN METHOD:PUBLISH X-FROM-URL:https://www.clsp.jhu.edu X-WR-TIMEZONE:America/New_York BEGIN:VTIMEZONE TZID:America/New_York X-LIC-LOCATION:America/New_York BEGIN:STANDARD DTSTART:20231105T020000 TZOFFSETFROM:-0400 TZOFFSETTO:-0500 RDATE:20241103T020000 TZNAME:EST END:STANDARD BEGIN:DAYLIGHT DTSTART:20240310T020000 TZOFFSETFROM:-0500 TZOFFSETTO:-0400 RDATE:20250309T020000 TZNAME:EDT END:DAYLIGHT END:VTIMEZONE BEGIN:VEVENT UID:ai1ec-21489@www.clsp.jhu.edu DTSTAMP:20240328T162651Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract\nSince it is increasingly harder to opt out from interacting with AI technology\, people demand that AI is capable of maintaining contracts such that it supports agency and oversight of people who are required to use it or who are affected by it. To help those people create a mental model about how to interact with AI systems\, I extend the underlying models to self-explain: predict the label/answer and explain this prediction. In this talk\, I will present how to generate (1) free-text explanations given in plain English that immediately tell users the gist of the reasoning\, and (2) contrastive explanations that help users understand how they could change the text to get another label.\nBiography\nAna Marasović is a postdoctoral researcher at the Allen Institute for AI (AI2) and the Paul G. Allen School of Computer Science & Engineering at University of Washington. Her research interests broadly lie in the fields of natural language processing\, explainable AI\, and vision-and-language learning. Her projects are motivated by a unified goal: improve interaction and control of the NLP systems to help people make these systems do what they want with the confidence that they’re getting exactly what they need.
Prior to joining AI2\, Ana obtained her PhD from Heidelberg University.\nHow to pronounce my name: the first name is Ana like in Spanish\, i.e.\, with a long “a” like in “water”\; regarding the last name: “mara” as in actress Mara Wilson + “so” + “veetch”. DTSTART;TZID=America/New_York:20220228T120000 DTEND;TZID=America/New_York:20220228T131500 LOCATION:Ames Hall 234 - Presented Virtually Via Zoom https://wse.zoom.us/j/96735183473 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Ana Marasović (Allen Institute for AI & University of Washington) “Self-Explaining for Intuitive Interaction with AI” URL:https://www.clsp.jhu.edu/events/ana-marasovic-allen-institute-for-ai-university-of-washington-self-explaining-for-intuitive-interaction-with-ai/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n
\\nAbstract
\nSince it is increasingly harder to opt out from interacting with AI technology\, people demand that AI is capable of maintaining contracts such that it supports agency and oversight of people who are required to use it or who are affected by it. To help those people create a mental model about how to interact with AI systems\, I extend the underlying models to self-explain: predict the label/answer and explain this prediction. In this talk\, I will present how to generate (1) free-text explanations given in plain English that immediately tell users the gist of the reasoning\, and (2) contrastive explanations that help users understand how they could change the text to get another label.
\nBiography
\nAna Marasović is a postdoctoral researcher at the Allen Institute for AI (AI2) and the Paul G. Allen School of Computer Science & Engineering at University of Washington. Her research interests broadly lie in the fields of natural language processing\, explainable AI\, and vision-and-language learning. Her projects are motivated by a unified goal: improve interaction and control of the NLP systems to help people make these systems do what they want with the confidence that they’re getting exactly what they need. Prior to joining AI2\, Ana obtained her PhD from Heidelberg University.
\nHow to pronounce my name: the first name is Ana like in Spanish\, i.e.\, with a long “a” like in “water”\; regarding the last name: “mara” as in actress Mara Wilson + “so” + “veetch”.
\n</BODY> X-TAGS;LANGUAGE=en-US:2022\,February\,Marasovic END:VEVENT BEGIN:VEVENT UID:ai1ec-23304@www.clsp.jhu.edu DTSTAMP:20240328T162651Z CATEGORIES;LANGUAGE=en-US:Seminars CONTACT: DESCRIPTION:Abstract\nTransformers are essential to pretraining. As we approach 5 years of BERT\, the connection between attention as architecture and transfer learning remains key to this central thread in NLP. Other architectures such as CNNs and RNNs have been used to replicate pretraining results\, but these either fail to reach the same accuracy or require supplemental attention layers. This work revisits the seminal BERT result and considers pretraining without attention. We consider replacing self-attention layers with recently developed approaches for long-range sequence modeling and transformer architecture variants. Specifically\, inspired by recent papers like the structured state space sequence model (S4)\, we use simple routing layers based on state-space models (SSM) and a bidirectional model architecture based on multiplicative gating. We discuss the results of the proposed Bidirectional Gated SSM (BiGS) and present a range of analyses into its properties. Results show that architecture does seem to have a notable impact on downstream performance and a different inductive bias that is worth exploring further.\nBiography\nAlexander “Sasha” Rush is an Associate Professor at Cornell Tech. His work is at the intersection of natural language processing and generative modeling with applications in text generation\, efficient inference\, and controllability. He has written several popular open-source software projects supporting NLP research and data science\, and works part-time as a researcher at Hugging Face. He is the secretary of ICLR and developed software used to run virtual conferences during COVID. His work has received paper and demo awards at major NLP\, visualization\, and hardware conferences\, an NSF Career Award\, and a Sloan Fellowship.
He tweets and blogs\, mostly about coding and ML\, at @srush_nlp. DTSTART;TZID=America/New_York:20230203T120000 DTEND;TZID=America/New_York:20230203T131500 LOCATION:Hackerman Hall B17 @ 3400 N. Charles Street\, Baltimore\, MD 21218 SEQUENCE:0 SUMMARY:Sasha Rush (Cornell University) “Pretraining Without Attention” URL:https://www.clsp.jhu.edu/events/sasha-rush-cornell-university/ X-COST-TYPE:free X-ALT-DESC;FMTTYPE=text/html:\\n\\n\\nAbstract
\nTransformers are essential to pretraining. As we approach 5 years of BERT\, the connection between attention as architecture and transfer learning remains key to this central thread in NLP. Other architectures such as CNNs and RNNs have been used to replicate pretraining results\, but these either fail to reach the same accuracy or require supplemental attention layers. This work revisits the seminal BERT result and considers pretraining without attention. We consider replacing self-attention layers with recently developed approaches for long-range sequence modeling and transformer architecture variants. Specifically\, inspired by recent papers like the structured state space sequence model (S4)\, we use simple routing layers based on state-space models (SSM) and a bidirectional model architecture based on multiplicative gating. We discuss the results of the proposed Bidirectional Gated SSM (BiGS) and present a range of analyses into its properties. Results show that architecture does seem to have a notable impact on downstream performance and a different inductive bias that is worth exploring further.
\nBiography
\n