Studies in Health Technology and Informatics, 2021
The automation of medical documentation is a highly desirable process, especially as it could avert significant temporal and monetary expenses in healthcare. With the help of complex modelling and high computational capability, Automatic Speech Recognition (ASR) and deep learning have made several promising attempts to this end. However, a factor that significantly determines the efficiency of these systems is the volume of speech that is processed in each medical examination. In the course of this study, we found that over half of the speech, recorded during follow-up examinations of patients treated with Intra-Vitreal Injections, was not relevant for medical documentation. In this paper, we evaluate the application of Convolutional and Long Short-Term Memory (LSTM) neural networks for the development of a speech classification module aimed at identifying speech relevant for medical report generation. In this regard, various topology parameters are tested and the effect of the mode...
We present the Hamburg Dependency Treebank (HDT), which to our knowledge is the largest dependency treebank currently available. It consists of genuine dependency annotations, i.e. they have not been transformed from phrase structures. We explore characteristics of the treebank and compare it against others. To exemplify the benefit of large dependency treebanks, we evaluate different parsers on the HDT. In addition, a set of tools will be described which help working with and searching in the treebank.
Proceedings of the seventh annual ACM/IEEE international conference on Human-Robot Interaction - HRI '12, 2012
In this paper, we investigate the role of physical embodiment of a robot and its degrees of freedom in HRI. Both factors have been suggested to be relevant in definitions of embodiment, and so far we do not understand their effects on the way people interact with robots very well. Linguistic analyses of verbal interactions with robots differing with respect to physical embodiment and degrees of freedom provide a useful methodology to investigate factors conditioning human-robot interaction. Results show that both physical embodiment and degrees of freedom influence interaction, and that the effect of physical embodiment is located in the interpersonal domain, concerning the extent to which the robot is perceived as an interaction partner, whereas degrees of freedom influence the way users project the suitability of the robot for the current task.
A transformation-based approach to robust parsing is presented, which achieves a strictly monotonic improvement of its current best hypothesis by repeatedly applying local repair steps to a complex multi-level representation. The transformation process is guided by scores derived from weighted constraints. Besides being interruptible, the procedure exhibits a performance profile typical for anytime procedures and holds great promise for the implementation of time-adaptive behaviour.
It has been proposed that the design of robots might benefit from interactions that are similar to caregiver–child interactions, which are tailored to children’s respective capacities to a high degree. However, so far little is known about how people adapt their tutoring behaviour to robots and whether robots can evoke input that is similar to child-directed interaction. The paper presents detailed analyses of speakers’ linguistic behaviour and non-linguistic behaviour, such as action demonstration, in two comparable situations: In one experiment, parents described and explained to their nonverbal infants the use of certain everyday objects; in the other experiment, participants tutored a simulated robot on the same objects. The results, which show considerable differences between the two situations on almost all measures, are discussed in the light of the computer-as-social-actor paradigm and the register hypothesis. Keywords: child-directed speech (CDS); motherese; robotese; motion...
... OBJA CJ KON OBJA DET S SUBJ DET the eagle caught the rabbit and devoured it . SYNTAX ... Here is a rule that enforces this ordering: {X:SYN/\Y:SYN} : 'OBJA-OBJC-order' : 0.1 : X.label = OBJA & Y.label = OBJC -> X↓from < Y↓from; ...
In memoriam Peter J. Foth. I have had much help and support in the work described in this thesis. In particular, none of this work could have been done without all those who contributed to the DFG projects 'Parsing of spoken language with limited resources' (ME 1472/1-2), and 'Partial parsing and information extraction' (ME 1472/4-1) and helped to create the WCDG system itself.
Part-of-speech tagging has become a standard technique in automatic analysis of natural language. We discuss the results of an attempt to integrate POS tagging into a constraint dependency grammar for robust parsing. It can be shown that the incorporation of tag scores improves both the accuracy and the performance of the parser considerably. Moreover, the robustness against ungrammatical input extends to robustness against tagging errors.
Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the ACL - ACL '06, 2006
In this paper we investigate the benefit of stochastic predictor components for the parsing quality which can be obtained with a rule-based dependency grammar. By including a chunker, a supertagger, a PP attacher, and a fast probabilistic parser we were able to improve upon the baseline by 3.2%, bringing the overall labelled accuracy to 91.1% on the German NEGRA corpus. We attribute the successful integration to the ability of the underlying grammar model to combine uncertain evidence in a soft manner, thus avoiding the problem of error propagation.
Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the ACL - ACL '06, 2006
We investigate the utility of supertag information for guiding an existing dependency parser of German. Using weighted constraints to integrate the additionally available information, the decision process of the parser is influenced by changing its preferences, without excluding alternative structural interpretations from being considered. The paper reports on a series of experiments using varying models of supertags that significantly increase the parsing accuracy. In addition, an upper bound on the accuracy that can be achieved with perfect supertags is estimated.
Abstract: Based on constraint optimization techniques, an architecture for robust parsing of natural language utterances has been developed. The resulting system is able to combine possibly contradicting evidence from a variety of information sources, using a plausibility-based arbitration procedure to derive fairly rich structural representations, comprising aspects of syntax, semantics and other description levels of language. Results of a series of experiments are reported which demonstrate the high...
Proceedings of the COLING/ACL on Main Conference Poster Sessions, 2006
To study PP attachment disambiguation as a benchmark for empirical methods in natural language processing it has often been reduced to a binary decision problem (between verb or noun attachment) in a particular syntactic configuration. A parser, however, must solve the more general task of deciding between more than two alternatives in many different contexts. We combine the attachment predictions made by a simple model of lexical attraction with a full-fledged parser of German to determine the actual benefit of the subtask to parsing. We show that the combination of data-driven and rule-based components can reduce the number of all parsing errors by 14% and raise the attachment accuracy for dependency parsing of German to an unprecedented 92%.
Conference of the European Chapter of the Association for Computational Linguistics, 2000
Covering as many phenomena as possible is a traditional goal of parser development, but the broader a grammar is made, the blunter it may become, as rare constructions influence the behaviour on simple sentences that were already solved correctly. We observe the effects of intentionally removing support for specific constructions from a broad-coverage grammar of German. We show that accuracy
Proceedings of the 11th International Conference on Parsing Technologies - IWPT '09, 2009
We present an asymmetric approach to a run-time combination of two parsers where one component serves as a predictor to the other one. Predictions are integrated by means of weighted constraints and therefore are subject to preferential decisions. Previously, the same architecture has been successfully used with predictors providing partial or inferior information about the parsing problem. It has now been applied to a situation where the predictor produces exactly the same type of information at a fully competitive quality level. Results show that the combined system outperforms its individual components, even though their performance in isolation is already fairly high.
We present a parser for German that achieves a competitive accuracy on unrestricted input while maintaining a coverage of 100%. By writing well-formedness rules as declarative, defeasible constraints that integrate different sources of linguistic knowledge, very high robustness is achieved against all sorts of extragrammatical constructions. ⋆ This is an extended version of a paper published in the proceedings of the 7. Konferenz zur Verarbeitung natürlicher Sprache, Vienna 2004 [1]
Proceedings of the ACL 2004 on Interactive poster and demonstration sessions, 2004
The manual design of grammars for accurate natural language analysis is an iterative process; while modelling decisions usually determine parser behaviour, evidence from analysing more or different input can suggest unforeseen regularities, which leads to a reformulation of rules, or even to a different model of previously analysed phenomena. We describe an implementation of Weighted Constraint Dependency Grammar that supports the grammar writer by providing display, automatic analysis, and diagnosis of dependency analyses and allows the direct exploration of alternative analyses and their status under the current grammar.
Proceedings of the 18th conference on Computational linguistics, 2000
... restarts, repairs, hesitations and other grammatical errors, individual errors should not make further analysis ... of analysis can be defined to model syntactic as well as semantic structures. ... our heuristics is not strong enough, e.g., the German sentence with a topicalized direct object ...
Papers by Kilian Foth