Katsumi Takahashi

Papers by Katsumi Takahashi

Efficient Secure Three-Party Sorting with Applications to Data Analysis and Heavy Hitters

Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security

We present a three-party sorting protocol secure against passive and active adversaries in the honest-majority setting. The protocol can easily be combined with other secure protocols that work on shared data, and thus enables different data analysis tasks, such as private set intersection of shared data, deduplication, and the identification of heavy hitters. The new protocol computes a stable sort. It is based on radix sort and is asymptotically better than previous secure sorting protocols. It improves on previous radix sort protocols by not having to shuffle the entire length of the items after each comparison step. We implemented our sorting protocol with different optimizations and achieved concretely fast performance. For example, sorting one million items with 32-bit keys and 32-bit values takes less than 2 seconds with semi-honest security and about 3.5 seconds with malicious security. Finding the heavy hitters among hundreds of thousands of 256-bit values takes only a few seconds, compared to close to an hour in previous work. CCS Concepts: Security and privacy → Cryptography.
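As a point of reference for the radix-sort idea mentioned in the abstract, here is a minimal plaintext sketch of a stable, bit-by-bit radix sort. It is not the secure three-party protocol: there is no secret sharing or shuffling, and the function name `radix_sort_stable` and the sample data are made up for illustration.

```python
def radix_sort_stable(items, key_bits=32):
    """Sort (key, value) pairs by non-negative integer key, one bit per pass.

    Plaintext illustration only: each pass is a stable partition on one key
    bit, starting from the least significant bit.
    """
    for bit in range(key_bits):
        zeros = [it for it in items if not (it[0] >> bit) & 1]
        ones = [it for it in items if (it[0] >> bit) & 1]
        # Concatenating keeps the earlier relative order inside each group,
        # which is what makes the overall sort stable.
        items = zeros + ones
    return items


pairs = [(3, "a"), (1, "b"), (3, "c"), (0, "d"), (1, "e")]
sorted_pairs = radix_sort_stable(pairs, key_bits=2)
print(sorted_pairs)
```

Items with equal keys (for example the two entries with key 3) keep their input order, which is the stability property the abstract refers to.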

Investigation on Anxieties while Using the Internet to Study about “Anshin”

Journal of Information Processing, 2011

"Anshin" is an emotion in Japanese that is difficult to translate because it is vague, varies from person to person, and is subjective. It means something like "a feeling of contentment". The demand for Internet use with "Anshin" is high. We believe that the emotion and the demand could be universal. To study "Anshin," we conducted group interviews as our first step. We obtained 95 cases of "Anshin" and 157 cases of anxiety from 28 people. From the results, we found that studying anxiety is valuable. Anxiety is a kind of opposite concept to "Anshin," and controlling it leads to a kind of "Anshin." To discuss this, we constructed a model of the process of anxiety generation and selected candidates for the related elements. After investigating the obtained cases, we produced a questionnaire on Internet anxieties in preparation for evaluating them.

Experimental Trials on Privacy-preserving Data Analysis Using 2-party Secure Circuit Evaluation(2) - The Behavior Analysis of University Students

In recent years, there has been a need both for widespread use of private or confidential information and for its protection. One technology that meets this need is secure function evaluation. In this paper, we evaluate our secure function evaluation system in the field of education. We collect and analyze information on students' daily lives using a privacy-protecting questionnaire system and show a correlation between lifestyle behaviour and academic performance.

Processing Load Prediction for Parallel FP-growth

Load balancing is a dominant factor in achieving scalable parallel frequent pattern mining. In this paper, we examine some methods to predict the processing load of the parallel FP-growth algorithm. We propose a heuristic based on item processing order and a load prediction function based on path depth and other statistics that can be collected before the mining process is executed. We also propose sampling to predict statistics such as the number of iterations. Finally, we implement those methods to improve the initial distribution of processing units, i.e., conditional pattern bases, as well as the load balancing during the execution of those conditional pattern bases. The performance evaluation shows that sampling-based load prediction and the item ordering heuristic perform well for the initial distribution.

Naviz: User Behavior Visualization of Dynamic Page

Navigational behavior of website visitors can be extracted from web access log files with data mining techniques such as sequential pattern mining. Visualization of the discovered patterns is very helpful for understanding how visitors navigate over the various pages on the site. Several web log visualization tools have been developed. However, those tools are far from satisfactory. They do ...

Naviz: Website Navigational Behavior Visualizer

Lecture Notes in Computer Science, 2002

Navigational behavior of website visitors can be extracted from web access log files with data mining techniques such as sequential pattern mining. Visualization of the discovered patterns is very helpful for understanding how visitors navigate over the various pages on the site. Several web log visualization tools have been developed. However, those tools are far from satisfactory. They do not effectively provide a global view of visitor access together with individual traversal paths. Here we introduce Naviz, an interactive web log visualization system designed to overcome those drawbacks. It combines a two-dimensional graph of visitor access traversals that considers appropriate web traversal properties, i.e., hierarchization by traversal traffic and grouping of related pages, with facilities for filtering traversal paths by specifying visited pages and path attributes, such as number of hops, support, and confidence. The tool also provides support for modern dynamic web pages. We apply the tool to visualize the results of a data mining study on web log data of Mobile Townpage, a directory service of phone numbers in Japan for i-Mode mobile internet users. The results indicate that our system can easily handle thousands of discovered patterns to find interesting navigational behavior such as success paths, exit paths, and lost paths.

Editor's Message to Special Issue of Usable Security

Journal of Information Processing, 2019

Secret sharing system, secret sharing apparatus, secret sharing method, secret sorting method and secret sharing program

k-Anonymous Microdata Release via Post Randomisation Method

Lecture Notes in Computer Science, 2015

The release of anonymized microdata is an important topic in the fields of statistical disclosure control (SDC) and privacy preserving data publishing (PPDP), and yet it remains insufficiently solved. In these research fields, k-anonymity has been widely studied as an anonymity notion, mainly for deterministic anonymization algorithms, and some probabilistic relaxations have been developed. However, they are not sufficient due to their limitations, i.e., being weaker than the original k-anonymity or requiring strong parametric assumptions. First, we propose Pk-anonymity, a new probabilistic k-anonymity, and prove that Pk-anonymity is a mathematical extension of k-anonymity rather than a relaxation. Furthermore, Pk-anonymity requires no parametric assumptions. This property is significant because it enables us to compare the privacy levels of probabilistic microdata release algorithms with those of deterministic ones. Second, we apply Pk-anonymity to the post randomization method (PRAM), which is an SDC algorithm based on randomization. PRAM is proven to satisfy Pk-anonymity in a controlled way, i.e., one can set PRAM's parameter so that Pk-anonymity is satisfied. PRAM is also known to satisfy ε-differential privacy, a recent, popular, and strong privacy notion. Our results therefore significantly enhance PRAM, since they imply that it satisfies both important notions: k-anonymity and ε-differential privacy.
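To make the link between PRAM and ε-differential privacy concrete, the following minimal sketch builds a simple PRAM transition matrix and computes the ε it implies. The uniform off-diagonal matrix, the parameter names `num_categories` and `retain_prob`, and the helper names are assumptions chosen for illustration; they are not taken from the paper, and the sketch does not compute Pk-anonymity.

```python
import math
import random


def pram_matrix(num_categories, retain_prob):
    """Keep a value with retain_prob, else move uniformly to another category."""
    flip = (1.0 - retain_prob) / (num_categories - 1)
    return [[retain_prob if i == j else flip for j in range(num_categories)]
            for i in range(num_categories)]


def pram_epsilon(matrix):
    """Smallest eps with P[out | a] <= exp(eps) * P[out | b] for all a, b, out.

    Assumes every matrix entry is strictly positive (retain_prob < 1).
    """
    worst = 1.0
    for out in range(len(matrix)):
        column = [row[out] for row in matrix]
        worst = max(worst, max(column) / min(column))
    return math.log(worst)


def apply_pram(values, matrix, rng):
    """Randomize each categorical value independently using its matrix row."""
    categories = range(len(matrix))
    return [rng.choices(categories, weights=matrix[v])[0] for v in values]


matrix = pram_matrix(num_categories=4, retain_prob=0.7)
epsilon = pram_epsilon(matrix)  # ln(0.7 / 0.1) = ln 7 for this matrix
released = apply_pram([0, 1, 2, 3] * 5, matrix, random.Random(0))
print(round(epsilon, 4), released)
```

Lowering `retain_prob` toward `1 / num_categories` drives ε toward 0, at the cost of noisier released data.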

Kokono Search: A Location Based Search Engine

World Wide Web Conference Series, 2001

We have developed a location-based search system for web documents on the Internet. This...

Privacy Preserving Computations without Public Key Cryptographic Operation

Lecture Notes in Computer Science, 2008

We develop the privacy preserving computation protocol presented by Naor, Pinkas and Sumner at the ACM Conference on Electronic Commerce in 1999 into a more efficient one. Their protocol is based on Yao's two-party secure function evaluation and can be used to implement any combinatorial circuit on clients' input data without disclosing them. In this paper, we propose three protocols as variants of the Naor-Pinkas-Sumner protocol, each in a different framework. The first protocol uses almost the same framework as theirs but, unlike their protocol, requires no public key cryptographic operations from clients. The second protocol additionally eliminates the oblivious transfer from the two-party operation in their protocol and in the first protocol by adding a new entity named "mediator" to Yao's two-party setting. In the new three-party setting, it is assumed that no party colludes with any other party, so that the secrecy of the clients' data is retained. The last protocol removes the mediator from the second protocol in return for some additional burden on the clients. Since the oblivious transfer used in the Naor-Pinkas-Sumner protocol and in the first protocol is the dominant step in each, the second and last protocols are expected to be much faster than the others.

Secret Sharing with Share-Conversion: Achieving Small Share-Size and Extendibility to Multiparty Computation

IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, 2015

Practically Efficient Multi-party Sorting Protocols from Comparison Sort Algorithms

Lecture Notes in Computer Science, 2013

Sorting is one of the most important primitives in various systems, for example, database systems, since it is often the dominant operation in the running time of an entire system. Therefore, there is a long list of work on improving its efficiency. This is also true in the context of secure multi-party computation (MPC), and several MPC sorting protocols have been proposed. However, all existing MPC sorting protocols are based on less efficient sorting algorithms, and the resultant protocols are also inefficient. This is because the only known method converts data-oblivious algorithms into corresponding MPC protocols, whereas most efficient sorting algorithms, such as quicksort and merge sort, are not data-oblivious. We propose a simple and general approach for converting non-data-oblivious comparison sort algorithms, which include the above algorithms, into corresponding MPC protocols. We then use our approach to construct an MPC sorting protocol from the well-known efficient sorting algorithm quicksort. The resultant protocol is practically efficient: in experiments, it significantly improved the running time compared to existing protocols.

Secret Sharing Schemes with Conversion Protocol to Achieve Short Share-Size and Extendibility to Multiparty Computation

Lecture Notes in Computer Science, 2013

Secret sharing schemes (SSSs) have been extensively studied since they are important not only for secure data storage but also as the fundamental building block for many cryptographic protocols such as multiparty computation (MPC). Although both code efficiency and applicability to MPC are important for SSSs, it is difficult to satisfy both. There have been many studies of MPC on Shamir's and replicated SSSs, but their share size is large; computationally secure SSSs and ramp schemes have a short share size, but there have been few studies of MPC on them. We propose a new computational SSS, and show how to convert shares of our SSS and of a ramp SSS to those of a multiparty-friendly SSS such as Shamir's or a replicated SSS. This enables one to secret-share data compactly and extend the secret-shared data to MPC if needed.
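For readers unfamiliar with the share-size issue, a minimal sketch of Shamir's scheme is given here: every share is a full field element, so each party stores as much as the secret itself. The prime, the function names `split` and `reconstruct`, and the seeded generator are illustrative assumptions; this shows neither the paper's computational SSS nor its conversion protocol, and a real deployment would need a cryptographically secure random source.

```python
import random

PRIME = 2**61 - 1  # a Mersenne prime; field size chosen for illustration only


def split(secret, threshold, num_shares, rng):
    """Return num_shares points on a random degree-(threshold-1) polynomial."""
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def reconstruct(shares):
    """Lagrange-interpolate the polynomial at x = 0 to recover the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


# random.Random is NOT cryptographically secure; it is used only so the
# example is reproducible.
shares = split(123456789, threshold=3, num_shares=5, rng=random.Random(1))
print(reconstruct(shares[:3]), reconstruct(shares[2:]))
```

Any three of the five shares recover the secret. The modular inverse via `pow(den, -1, PRIME)` requires Python 3.8 or later.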

Applicability of existing anonymization methods to large location history data in urban travel

2012 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2012

Service providers want to know user attributes and recorded information in order to improve user satisfaction or the efficiency of their services by offering services tailored to users' preferences. However, if they collect, classify, analyze, use, or disclose personal information in the wrong way, they may exceed what the user explicitly or implicitly accepted when providing that information. Many anonymization methods have been proposed to solve this problem. Among them, we focus on the k-anonymization technique to realize a 'forest from the trees' view of the data. In the papers that propose these methods, only qualitative analyses or examples are given to demonstrate the usefulness of the anonymized data they output. Because it is generally said that anonymization becomes easier as the data grows larger, those methods have not been applied to real, very large data sets. In this paper, we transform the travel records of 722,000 people traveling by train in the Tokyo area with our proposed anonymization methods, analyze the degree to which each of the results is useful, and conclude that the results are useless even when the anonymity level is set low.

Tag-Based Secure Set-Intersection Protocol and Its Application to Privacy-Enhancing Biometrics

2010 13th International Conference on Network-Based Information Systems, 2010

The secure set-intersection protocol is a cryptographic protocol that retrieves the intersection of two or more datasets without revealing any additional information apart from the intersection data. In this paper, we formalize the secure matching tag, which is frequently used in existing secure set-intersection protocols, and propose applying the tag to privacy-enhancing biometrics. Due to the index property of the tag, the proposed protocol can efficiently match the biometric data of a certain user from among a large amount of encrypted biometric data of users without entering the user ID.
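The index property of a matching tag can be pictured with a deliberately simplified sketch: a deterministic keyed tag lets equal items be found by dictionary lookup without comparing the items themselves. Using HMAC-SHA256 as the tag is an assumption made here for illustration and is not the paper's construction; the names `matching_tag` and `intersect_by_tag` are hypothetical, and the sketch ignores biometric noise and all protocol-level security requirements.

```python
import hashlib
import hmac


def matching_tag(key: bytes, item: bytes) -> str:
    """Deterministic keyed tag: equal items under the same key give equal tags."""
    return hmac.new(key, item, hashlib.sha256).hexdigest()


def intersect_by_tag(key, set_a, set_b):
    """Find common items by looking up tags in an index built from set_a."""
    index = {matching_tag(key, a): a for a in set_a}
    tags_b = (matching_tag(key, b) for b in set_b)
    return sorted(index[t] for t in tags_b if t in index)


shared_key = b"demo-key"  # illustrative key; real keys must be random and secret
common = intersect_by_tag(shared_key, [b"alice", b"bob"], [b"bob", b"carol"])
print(common)
```

Because the tag works as an index, each lookup costs one dictionary access rather than a scan over all stored records.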

User behavior analysis of location aware search engine

Proceedings Third International Conference on Mobile Data Management MDM 2002

The rapid growth of internet access from mobile users puts much importance on location-specific information on the web. A unique web service called Mobile Info Search (MIS) from NTT Laboratories gathers this information and provides location-aware search facilities. We performed association rule mining and sequence pattern mining on the access log accumulated at the MIS site in order to gain insight into the behavior of mobile users regarding spatial information on the web. The detailed web log mining process and the rules we derived are reported in this paper.
