BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//128.220.36.25//NONSGML kigkonsult.se iCalcreator 2.26.9//
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-FROM-URL:https://www.clsp.jhu.edu
X-WR-TIMEZONE:America/New_York
BEGIN:VTIMEZONE
TZID:America/New_York
X-LIC-LOCATION:America/New_York
BEGIN:STANDARD
DTSTART:20231105T020000
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
RDATE:20241103T020000
TZNAME:EST
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20240310T020000
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
RDATE:20250309T020000
TZNAME:EDT
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:ai1ec-20115@www.clsp.jhu.edu
DTSTAMP:20240329T060623Z
CATEGORIES;LANGUAGE=en-US:Seminars
CONTACT:
DESCRIPTION:Abstract\nData science in small medical datasets usually mea
 ns doing precision guesswork on unreliable data provided by those with h
 igh expectations. The first part of this talk will focus on issues tha
 t data scientists and engineers have to address when working with thi
 s kind of data (e.g. unreliable labels\, the effect of confounding fac
 tors\, the necessity of clinical interpretability\, difficulties wi
 th fusing multiple data sets). The second part of the talk will includ
 e some real examples of this kind of data science in the field of neu
 rology (prediction of motor deficits in Parkinson’s disease based on a
 coustic analysis of speech\, diagnosis of Parkinson’s disease dysgrap
 hia utilising online handwriting\, exploring the Mozart effect in epi
 lepsy based on music information retrieval) and psychology (assessmen
 t of graphomotor disabilities in children with developmental dysgraph
 ia).\nBiography\nJiri Mekyska is the head of the BDALab (Brain Diseas
 es Analysis Laboratory) at the Brno University of Technology\, wher
 e he leads a multidisciplinary team of researchers (signal processin
 g engineers\, data scientists\, neurologists\, psychologists) with a sp
 ecial focus on the development of new digital endpoints and digital b
 iomarkers that enable better understanding\, diagnosis and monitorin
 g of neurodegenerative (e.g. Parkinson’s disease) and neurodevelopmen
 tal (e.g. dysgraphia) diseases.
DTSTART;TZID=America/New_York:20210329T120000
DTEND;TZID=America/New_York:20210329T131500
LOCATION:via Zoom
SEQUENCE:0
SUMMARY:Jiri Mekyska (Brno University of Technology) “Data Science i
 n Small Medical Data Sets: From Logistic Regression Towards Logistic Re
 gression”
URL:https://www.clsp.jhu.edu/events/jiri-mekyska-brno-university-of-technol
 ogy/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2021\,March\,Mekyska
END:VEVENT
BEGIN:VEVENT
UID:ai1ec-21615@www.clsp.jhu.edu
DTSTAMP:20240329T060623Z
CATEGORIES;LANGUAGE=en-US:Student Seminars
CONTACT:
DESCRIPTION:Abstract\nWe consider the problem of data collection for sem
 antically rich NLU tasks\, where detailed semantics of documents (or u
 tterances) are captured using a complex meaning representation. Previ
 ously\, data collection for such tasks came at the cost of either ext
 ensive annotator training (e.g. in FrameNet or PropBank) or a simplif
 ied meaning representation (e.g. in QA-SRL or Overnight). In this tal
 k\, we present two systems [1\, 2] that aim to support fast\, accurat
 e\, and expressive semantic annotations by pairing human workers wi
 th a trained model in the loop.\n\nThe first system\, called Guided K-
 best [1]\, is an annotation toolkit for conversational semantic parsi
 ng. Instead of typing annotations from scratch\, data specialists cho
 ose a correct parse from the K-best output of a few-shot prototyped m
 odel. As the K-best list can be large (e.g. K=100)\, we guide the ann
 otators’ exploration of the K-best list via explainable hierarchica
 l clustering. In addition\, we experiment with RoBERTa-based rerankin
 g of the K-best list to recalibrate the few-shot model towards Accura
 cy@K. The final system allows annotating data up to 35% faster than t
 he standard\, non-guided K-best and improves the few-shot model’s top
 -1 accuracy by up to 18%.\n\nThe second system\, called SchemaBlocks [
 2]\, is an annotation toolkit for schemas\, or structured description
 s of frequent real-world scenarios (e.g.\, cooking a meal). It repres
 ents schemas in the annotation UI as nested blocks. Using a novel Cau
 sal ARM model\, we further speed up the annotation process and guid
 e data specialists towards expressive and diverse schemas. As part o
 f this work\, we collect 232 schemas\, evaluating their internal cohe
 rence and their coverage on large-scale newswire corpora.
DTSTART;TZID=America/New_York:20220311T120000
DTEND;TZID=America/New_York:20220311T131500
LOCATION:Virtual Seminar
SEQUENCE:0
SUMMARY:Student Seminar – Anton Belyy “Systems for Human-AI Coopera
 tion on Collecting Semantic Annotations”
URL:https://www.clsp.jhu.edu/events/student-seminar-anton-belyy-systems-for
 -human-ai-cooperation-on-collecting-semantic-annotations/
X-COST-TYPE:free
X-TAGS;LANGUAGE=en-US:2022\,Belyy\,March
END:VEVENT
END:VCALENDAR