Papers by Jean-Pierre Corriveau
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

Computational Linguistics, Volume 22, Number 4, December 1996, 1996
The use of examples as the basis for machine translation systems has gained considerable acceptance since the original proposal of Nagao in 1984. In this short book, Jones first reviews the fundamental principles of example-based machine translation (EBMT) in order to then introduce the specific mechanisms of his model. The key characteristic of this model is its purely stochastic processing, which is based on the algorithm put forth by Skousen (1989) in his Analogical Modeling strategy for language comprehension. Overall, the book is well-written and offers a good introduction to some of the very interesting problems of machine translation. But my reading left me somewhat unsatisfied. In order to explain this, let me first summarize the chapters. The goals and method of the work are clearly stated in the nine pages that form the introduction: Jones intends to demonstrate the possibility of uniformly using examples rather than rules for machine translation. I emphasize that, in this context, "the term

Geographic Partitioning Techniques for the Anonymization of Health Care Data
Hospitals and health care organizations collect large amounts of detailed health care data that is in high demand by researchers. Thus, the possessors of such data are in need of methods that allow for this data to be released without compromising the confidentiality of the individuals to whom it pertains. As the geographic aspect of this data is becoming increasingly relevant for the research being conducted, it is important for an anonymization process to pay due attention to the geographic attributes of such data. In this paper, a novel system for health care data anonymization is presented. At the core of the system is the aggregation of an initial regionalization guided by the use of a Voronoi diagram. We conduct a comparison with another geographic-based system of anonymization, GeoLeader. We show that our system is capable of producing results of a comparable quality with a much faster running time.
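The Voronoi-guided regionalization described above can be illustrated with a minimal sketch: records are grouped by their nearest seed point (which defines a Voronoi cell), and undersized regions are merged into the region of the nearest other seed until every region is large enough to release. The function names, the merge policy, and the size threshold `k` are illustrative assumptions, not the paper's actual system.

```python
# Toy sketch of Voronoi-style regionalization for anonymization.
# Assumption: 2-D points and seed locations; names are illustrative.
import math

def nearest_seed(point, seeds):
    """Index of the seed closest to `point` (i.e., its Voronoi cell)."""
    return min(range(len(seeds)), key=lambda i: math.dist(point, seeds[i]))

def voronoi_partition(points, seeds):
    """Group points by their nearest seed."""
    regions = {i: [] for i in range(len(seeds))}
    for p in points:
        regions[nearest_seed(p, seeds)].append(p)
    return regions

def aggregate_small_regions(regions, seeds, k):
    """Merge any region with fewer than k records into the region
    of the nearest other seed, until all remaining regions hold >= k."""
    while True:
        small = [i for i, pts in regions.items() if 0 < len(pts) < k]
        if not small:
            return regions
        i = small[0]
        others = [j for j in regions if j != i and regions[j]]
        j = min(others, key=lambda o: math.dist(seeds[i], seeds[o]))
        regions[j].extend(regions.pop(i))
```

A usage sketch: with seeds at (0, 0) and (10, 10), nearby points fall into the corresponding cells, and a cell with fewer than `k` records is absorbed by its nearest neighbour.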
A great many psycholinguistic findings reveal that context highlights or obscures certain aspects in the meaning of a word (viz., word sense modulation). Computational models of the lexicon, however, are mostly concerned with the ways context selects a meaning for a word (word sense selection). In this paper, we propose a model that combines sense selection with sense modulation. Word senses in this proposal consist of a sense-concept and a sense-view. Furthermore, we outline an exemplar-based approach in which sense-views are developed gradually and incrementally. A prototype implementation of this model for sentential context is also briefly discussed.

Security and Privacy in Communication Networks, 2017
New threats to networks are constantly arising. This justifies protecting network assets and mitigating the risk associated with attacks. In a distributed environment, researchers aim, in particular, at eliminating faulty network entities. More specifically, much research has been conducted on locating a single static black hole, which is defined as a network site whose existence is known a priori and that disposes of any incoming data without leaving any trace of this occurrence. However, the prevalence of faulty nodes requires an algorithm able to a) identify faulty nodes that can be repaired without human intervention and b) locate black holes, which are taken to be faulty nodes whose repair does require human intervention. In this paper, we consider a specific attack model that involves multiple faulty nodes that can be repaired by mobile software agents, as well as a virus v that can infect a previously repaired faulty node and turn it into a black hole. We refer to the task of repairing multiple faulty nodes and pointing out the location of the black hole as the Faulty Node Repair and Dynamically Spawned Black Hole Search. We first analyze the attack model we put forth. We then explain a) how to identify whether a node is either 1) a safe node or 2) a repairable faulty node or 3) the black hole that has been infected by virus v during the search/repair process and, b) how to perform the correct relevant actions. These two steps constitute a complex task, which, we explain, significantly differs from the traditional Black Hole Search. We continue by proposing an algorithm to solve this problem in an asynchronous ring network with only one whiteboard (which resides in a node called the homebase). We prove the correctness of our solution and analyze its complexity by both theoretical analysis and experimental evaluation.
We conclude that, using our proposed algorithm, b + 4 agents can repair all faulty nodes and locate the black hole infected by a virus v within finite time, when the black hole appears in the network before the last faulty node is repaired. Our algorithm works even when b, the number of faulty nodes, is unknown a priori.
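For intuition only, here is a toy sketch of the classic single-whiteboard idea that this work generalizes: an agent records its intended destination on the whiteboard before entering an unexplored node, so that if it vanishes into the black hole, the last recorded entry pinpoints the hole's location. This is a deliberately simplified illustration, not the paper's b + 4 agent algorithm; the node numbering, the assumption of exactly one black hole, and the function name are all assumptions.

```python
# Minimal "cautious probing" sketch on a ring with one whiteboard at the
# homebase (node 0, assumed safe). Assumes exactly one black hole exists.
def locate_black_hole(ring_size, black_hole):
    """Return the black hole's position found by sequential cautious probing."""
    whiteboard = {"probing": None}
    for node in range(1, ring_size):     # homebase (node 0) is known safe
        whiteboard["probing"] = node     # announce intent before entering
        if node == black_hole:
            break                        # agent vanishes without a trace
        # agent survived: node is safe; return to homebase and continue
    return whiteboard["probing"]
```

The paper's setting is harder than this sketch: faulty nodes are repairable, and the black hole can be spawned dynamically by the virus v during the search, which is why the protocol and agent count differ from the traditional Black Hole Search.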

This paper is concerned with semantic flexibility. As suggested by many studies, human concepts demonstrate enormous flexibility in various contexts. In this paper, we propose a model for constructing the conceptual representation on the basis of context. Our solution is based on the separation of the representational and classificatory functions of concepts, and we discuss its implications regarding conceptual development. We later sketch an implementation of the model for a simple form of context, sentential context. 1 Introduction A context significantly affects its concepts. Perhaps the best-known contextual effect is conceptual flexibility, where the interpretation of a single concept varies in different contexts. For instance, consider the word book in the following examples: (1) The book broke the window. (2) I read the book. (3) Many books were burnt in the fire. (4) The book is waiting for printing. (5) The prisoner smuggled out his book page by page. Each use of book i...
Lecture Notes in Computer Science, 1995
Quantification in natural language is an important phenomenon as it relates to scoping, reference resolution, and, more importantly, to inference. In this paper we argue that the reasoning involved in quantifier scoping and reference resolution is highly dependent on the linguistic context as well as on time and memory constraints. Time and memory constraints are not only physical realities that an intelligent agent must cope with, but, as we shall argue, they play an important role in the inferencing process that underlies the task of language ...

Journal of the American Medical Informatics Association, 2009
Background: Explicit patient consent requirements in privacy laws can have a negative impact on health research, leading to selection bias and reduced recruitment. Often legislative requirements to obtain consent are waived if the information collected or disclosed is de-identified. Objective: The authors developed and empirically evaluated a new globally optimal de-identification algorithm that satisfies the k-anonymity criterion and that is suitable for health datasets. Design: The authors compared OLA (Optimal Lattice Anonymization) empirically to three existing k-anonymity algorithms, Datafly, Samarati, and Incognito, on six public, hospital, and registry datasets for different values of k and suppression limits. Measurement: Three information loss metrics were used for the comparison: precision, discernability metric, and non-uniform entropy. Each algorithm's performance speed was also evaluated. Results: The Datafly and Samarati algorithms had higher information loss than OLA and Incognito; OLA was consistently faster than Incognito in finding the globally optimal de-identification solution. Conclusions: For the de-identification of health datasets, OLA is an improvement on existing k-anonymity algorithms in terms of information loss and performance.
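As background, the k-anonymity criterion that OLA, Datafly, Samarati, and Incognito all enforce can be checked in a few lines: a table is k-anonymous with respect to a set of quasi-identifiers if every combination of quasi-identifier values occurs in at least k records. The field names below are made up for illustration and are not from the paper's datasets.

```python
# Hedged sketch: checking the k-anonymity criterion on a list of records.
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """True iff every quasi-identifier value combination occurs >= k times."""
    combos = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in combos.values())
```

The algorithms compared in the paper differ not in this criterion but in how they search the space of generalizations (e.g., zip code 10532 generalized to 10***) to satisfy it with minimal information loss.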
Proceedings of the National Conference on Artificial Intelligence, Jul 27, 1997
Quantification in natural language is an important phenomenon that seems to touch on some pragmatic and inferential aspects of language understanding. In this paper we focus on quantifier scope ambiguity and suggest a cognitively plausible model that resolves a number of problems that have traditionally been addressed in isolation. Our claim here is that the problem of quantifier scope ambiguity cannot be adequately addressed at the syntactic and semantic levels, but is an inferencing problem that must be addressed at the ...

Computational Linguistics, Volume 22, Number 1, March 1996, 1996
In the foreword to this recent book by Larry Bookman, Wendy Lehnert writes, "this volume is without question a milestone in language processing scholarship." To put it simply, I agree. In 236 pages, Bookman develops at length the two-tier model of semantic memory that he first introduced in his doctoral dissertation on text comprehension. The relational (symbolic) tier captures a set of dependency relationships between concepts. The analog semantic-feature (ASF) tier represents (at the subsymbolic level) common or shared knowledge about the concepts in the relational tier, expressed as a set of statistical associations. This hybrid approach to NLU, named LeMICON (Learning Memory Integrated Context), stands out first and foremost because it claims to address the bottleneck of hand-coding the knowledge required for interpretation. More specifically, LeMICON offers both automatic acquisition of knowledge from on-line text corpora and integration of a text's interpretation into the knowledge base. Before commenting on this breakthrough, let me summarize the work. Much like the model of Lange (1993) and my own model (Corriveau 1995), LeMICON tackles most facets of text comprehension, in contrast with PDP models, such as that of Miikkulainen (1993), which are typically more specialized. Bookman first skillfully presents, in a 21-page introduction, the essential characteristics of his work. LeMICON ignores syntax, but considers several psycholinguistic and neurolinguistic problems that are typically ignored by others. For example, it models the loss of information in working memory, and the time-course of inferences as well as their construction process. Throughout this discussion, the spreading-activation process, so typical of local and hybrid connectionist architectures such as LeMICON, is thought of as constructing time trajectories through the space of ASFs.
That is, "energy" propagates through ASFs to form, over time, stable chains corresponding to inferences. The
Proceedings of the Twenty-First Annual Conference of the Cognitive Science Society, 2020
Proceedings Third IEEE International Symposium on Object-Oriented Real-Time Distributed Computing (ISORC 2000) (Cat. No. PR00607)
One of the most crucial and complicated phases of real-time system development lies in the transition from system behavior (generally specified using scenario models) to the behavior of interacting components (typically captured by means of communicating hierarchical finite state machines). It is commonly accepted that a systematic approach is required for this transition. We overview such an approach, which we

Bibliothèque nationale du Canada, Acquisitions and Bibliographic Services, 395 Wellington Street, Ottawa ON K1A 0N4, Canada. The author has granted a non-exclusive licence allowing the National Library of Canada to reproduce, loan, distribute or sell copies of this thesis in microform, paper or electronic formats. The author retains ownership of the copyright in this thesis. Neither the thesis nor substantial extracts from it may be printed or otherwise reproduced without the author's permission. TO THE MEMORY OF SABA ABRAHAM SABA, who inspired pride, humility, and uncompromising goodness.
Proceedings of the Twenty-Eighth Hawaii International Conference on System Sciences, vol.3
Existing computational approaches to cognition generally adopt a static strategy. Conversely, in this paper, it is argued that broadening the notion of software reliability for cognitive architectures ultimately leads us to acknowledging the diachronic nature of cognition. In turn, this observation suggests a new set of requirements for a cognitive architecture, including strong tractability and avoidance of epistemological commitments. These requirements can be satisfied by a time-constrained model of memory, which takes the form of a massively parallel network of objects exchanging simple signals. All knowledge is expressed in a separate knowledge base organized into an object-oriented hierarchy.

2017 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computed, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/CBDCom/IOP/SCI), 2017
During the shuffle stage of the MapReduce framework, a large volume of data may be relocated to the same destination at the same time. This, in turn, may lead to the network hotspot problem. On the other hand, it is always more effective to achieve better data locality by moving the computation closer to the data than the other way around. However, doing this may result in the partitioning skew problem, which is characterized by unbalanced computational loads between the destinations. Consequently, shuffling algorithms should consider all the following criteria: data locality, partitioning skew, and network hotspot. In order to do so, we introduce MCSA, a Multi-Criteria shuffling algorithm for the MapReduce scheduling stage that rests on three cost functions to accurately reflect the trade-offs between these different criteria. Extensive simulations were conducted and their results show that the MCSA-based scheduler consistently outperforms other schedulers based on these criter...
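A hedged sketch of the general idea (not the paper's actual MCSA cost functions): each candidate destination is scored by a weighted sum of three costs (remote bytes that must move, load already assigned, and concurrent in-flight transfers toward that destination), and the shuffler picks the cheapest destination. The weights, dictionary shapes, and function names are assumptions for illustration.

```python
# Illustrative multi-criteria destination scoring for a shuffle decision.
# Lower cost is better; weights trade off the three criteria.
def destination_cost(dest, partition_bytes, local_bytes, load, inflight,
                     w_locality=1.0, w_skew=1.0, w_hotspot=1.0):
    """Weighted cost of sending a partition to `dest`."""
    remote = partition_bytes - local_bytes.get(dest, 0)  # data that must move
    return (w_locality * remote                 # data-locality criterion
            + w_skew * load.get(dest, 0)        # partitioning-skew criterion
            + w_hotspot * inflight.get(dest, 0))  # network-hotspot criterion

def pick_destination(dests, partition_bytes, local_bytes, load, inflight):
    """Choose the destination minimizing the combined cost."""
    return min(dests, key=lambda d: destination_cost(
        d, partition_bytes, local_bytes, load, inflight))
```

For example, a destination that already holds most of a partition's bytes wins on locality, but a heavy existing load or many concurrent transfers can tip the choice to another node, which is the trade-off the three criteria capture.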

system as more than a mere existence proof: he offers it as a psycholinguistic theory of how children actually acquire their first language, and he frequently cites observations about child language as tending to confirm his theory. A book that can make even a prima facie plausible claim to have achieved these things must be an important one; hence the length of this review. Just how far Berwick's attempt has succeeded is a question not to be answered quickly. The book is dense, and Berwick is not always as skilled as he might be at helping the reader to disentangle the central skeleton of his exposition from peripheral technical details. An adequate assessment of Berwick's work will need extended consideration by the scholarly community, preferably with further elucidation by Berwick himself. To set the ball rolling, let me mention some points that worried me on a first reading of the book, though I do so without any assumption that they will ultimately prove fatal to Berwi...

Computational Linguistics, 1992
In this superbly written essay, François Rastier, a distinguished French linguist with several books on interpretative semantics, questions the foundations of linguistics and cognitive science in order to investigate the role of semantics with respect to the other disciplines of this 'new' interdisciplinary science of cognition. The book is organized into three sections: the first, of 100 pages, considers the history and epistemology of cognitive science; the second, of particular relevance to computational linguists, studies in 60 pages the relationship between semantics and AI; and the last, of 70 pages, investigates the interactions between semantics on the one hand, and psychology and the neurosciences on the other. Before developing these different studies, Rastier clearly states his positions in a 10-page introduction. In his opinion, "linguistics is a descriptive, partially predictive science [and] empirical rationalism is the philosophical approach best suited to the theoretical activity of the linguist" (p. 12), for it can account for the multiplicity of determinations proper to linguistic objects such as texts. Only the dogmatic rationalist, guilty of unwarranted theoretical reductionism, searches for methodological universals "that he invents and reifies, admiring himself for their discovery" (p. 12). For Rastier, diversity, not unity, is taken to be the fundamental problem of linguistics. In particular, context, both linguistic and nonlinguistic, is taken to be an integral, unpredictable component of comprehension, accounting for the multiplicity of interpretations. One may reduce context to a Montague-like index, but recognizing the existence of contextual variables says little about their instantiation. Consequently, for Rastier, "linguistic performance consists in adapting oneself to a situation whose parameters escape the computational paradigm" (p. 13). Linguistics is viewed as a subdiscipline of semiotics, a social science, concerned with actual tongues, concrete linguistic communication, and cultures: three factors systematically downplayed, if not ignored, by cognitive science. This introduction sets the tone for the rest of the book and presents its two recurring themes: a systematic attack on universalism (its leaders and its philosophical underpinnings) and a strong argument in favor of the existence and autonomy of a semiotic level, which includes semantics, the world of the Saussurian signifié, distinct from the conceptual level. Section 1 starts with an investigation into the nature, history, and assumptions of cognitive science. According to Rastier, only the functionalist postulate, which assumes