Papers by Tamara Dimitrova

We develop graphlet analysis for multiplex networks and discuss how this analysis can be extended to multilayer and multilevel networks as well as to graphs with node and/or link categorical attributes. The analysis has been adapted for two typical examples of multiplexes: economic trade data represented as a 957-plex network and 75 social networks each represented as a 12-plex network. We show that wedges (open triads) occur more often in economic trade networks than in social networks, indicating the tendency of a country to produce/trade a product within local triad structures that are not closed. Moreover, our analysis provides evidence that countries with small diversity tend to form correlated triangles. Wedges also appear in the social networks; however, the dominant graphlets in social networks are triangles (closed triads). If a multiplex structure indicates a strong tie, the graphlet analysis provides further evidence for the concepts of strong/weak ties and structur...
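As a rough illustration of the per-node counting this kind of analysis rests on, the sketch below computes wedge (open triad) and triangle (closed triad) counts for each node of a single undirected layer with networkx; the toy edge list is illustrative only, not the trade or friendship data used in the paper.

```python
# Minimal sketch: wedge (open triad) and triangle (closed triad) counts per node
# for one undirected layer, using networkx. The edge list is a toy example.
import networkx as nx

def wedge_and_triangle_counts(G):
    """Return {node: (wedges_centered_at_node, triangles_containing_node)}."""
    tri = nx.triangles(G)                        # triangles each node belongs to
    counts = {}
    for v in G:
        d = G.degree(v)
        centered_triads = d * (d - 1) // 2       # all 2-paths centered at v
        counts[v] = (centered_triads - tri[v], tri[v])
    return counts

layer = nx.Graph([(0, 1), (1, 2), (0, 2), (2, 3)])
print(wedge_and_triangle_counts(layer))          # node 2: two wedges, one triangle
```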

IEEE Access, 2018
We review models for analyzing multivariate data of mixed (heterogeneous) domains such as binary, categorical, ordinal, counts, continuous, and/or skewed continuous, and methods for modeling various graphs including multiplex, multilevel, and multilayer networks. Data are modeled with Markov random fields, which encode the Markov property between nodes: two nodes are not connected with an edge if and only if the random variables associated with these nodes are conditionally independent, given the other variables. Inferring dependence structure through graphical models (both directed and undirected) is essential for discovering multivariate interactions among high-dimensional data, which could potentially be associated with several diseases. Networks are modeled with exponential random graph models, which encode the Markov property between edges: two edges are conditionally dependent, given the rest of the network, if they have a common vertex. Studying and understanding multilayer and/or multilevel representations of various phenomena, including social and natural phenomena, could lead to predictive models of these phenomena. Modeling data of heterogeneous domains and multilevel and/or multilayer networks poses challenges, which are reviewed. Addressing these challenges within a unified framework stresses open problems and points out new directions for research.
Index Terms: Graphical models, heterogeneous domains, exponential random graph models, multilevel networks, multiplex networks, multilayer networks.
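To make the "missing edge means conditional independence" reading of a Markov random field concrete, here is a minimal sketch for the Gaussian case only: a sparse precision matrix is estimated with scikit-learn's graphical lasso on synthetic data, and the edge set is read off its non-zero entries. The Gaussian case is just one of the mixed-domain models the review covers.

```python
# Sketch: in a Gaussian graphical model, zeros in the precision matrix mark
# pairs of variables that are conditionally independent given the rest,
# i.e. missing edges. Data here are synthetic.
import numpy as np
from sklearn.covariance import GraphicalLassoCV

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 5))
X[:, 1] += 0.8 * X[:, 0]             # induce a dependency between variables 0 and 1

model = GraphicalLassoCV().fit(X)
precision = model.precision_
edges = [(i, j) for i in range(5) for j in range(i + 1, 5)
         if abs(precision[i, j]) > 1e-6]
print(edges)                          # pairs kept as conditionally dependent
```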

Graphlet analysis is part of network theory that does not depend on the choice of the network null model and can provide a comprehensive description of the local network structure. Here, we propose a novel method for graphlet-based analysis of directed networks by computing first the signature vector for every vertex in the network and then the graphlet correlation matrix of the network. This analysis has been applied to brain effective connectivity networks by considering both the direction and the sign (inhibitory or excitatory) of the underlying directed (effective) connectivity. In particular, the signature vectors for brain regions and the graphlet correlation matrices of the brain effective network are computed for 40 healthy subjects and common dependencies are revealed. We found that the signature vectors (node, wedge, and triangle degrees) are dominant for the excitatory effective brain networks. Moreover, by considering only those correlations (or anti-correlations) in the correlation matrix that are significant (>0.7 or <−0.7) and are present in more than 60% of the subjects, we found that excitatory effective brain networks show stronger causal (measured with Granger causality) patterns (G-causes and G-effects) than inhibitory effective brain networks.
The complexity of systems is frequently the result of non-trivial local connectivity and interaction of their constituent parts. A number of network structural characteristics have recently been the subject of particularly intense research, including degree distributions [1], community structure [2,3], and various measures of vertex centrality [4,5], to mention only a few. Vertices may have attributes associated with them; for example, properties of proteins in protein-protein interaction networks [6], users' social network profiles [7], or authors' publication histories in co-authorship networks [8]. Two approaches that focus on the local connectivity of subgraphs within a network are motifs and graphlets. Motifs are defined as subgraphs that repeat frequently in a network, i.e., they repeat at a frequency higher than in random graphs [9,10], and they depend on the choice of the network's null model. In contrast, graphlets are induced subgraphs of a network that appear at any frequency and hence are independent of a null model. They have been introduced recently [11] and have found numerous applications as building blocks of network analysis in various disciplines ranging from social science [12,13] to biology [14,15]. In social science, graphlet analysis (known as subgraph census) is widely adopted in sociometric studies [12]. Much of the work in this vein has focused on analyzing triadic tendencies as important structural features of social networks (e.g., transitivity or triadic closure) as well as analyzing triadic configurations as the basis for various social network theories (e.g., social balance, strength of weak ties, stability of ties, or trust [16]). In biology, graphlets have been used to infer protein structure [17], to compare biological networks [14,15], and to characterize the relationship between disease and network structure [18]. Many real-world networks are directed, but until now no graphlet-based method has been proposed that can provide information about the local structure of directed networks. Here, we offer a graphlet-based approach for analysis of the local structure of a directed network.
In the method proposed in this manuscript, we compute for each vertex a vector of structural features, called the signature vector, based on the number of graphlets associated with the vertex, and for the network its graphlet correlation matrix, measuring graphlet dependencies which reveal unknown organizational principles of the network. We applied the technique to brain effective networks of 40 healthy subjects, and we found that many of the subjects share similar patterns in their network's local structure. In brain networks a node is associated with different types of elements, depending on the level of interest in the brain, and an edge represents the connection or interaction between two elements [19]. If the brain is studied on
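The following is a heavily reduced sketch of the graphlet correlation matrix idea: each node gets a short signature (here only degree, wedge count, and triangle count, rather than the full set of directed-graphlet orbits), and the matrix collects the Spearman correlations of these counts across nodes. The random graph stands in for an effective connectivity network.

```python
# Reduced sketch of a graphlet correlation matrix: per-node signatures
# (degree, wedges, triangles) and their Spearman correlations across nodes.
import networkx as nx
import numpy as np
from scipy.stats import spearmanr

G = nx.erdos_renyi_graph(60, 0.1, seed=1)      # stand-in for an effective network
tri = nx.triangles(G)

signatures = np.array([
    [G.degree(v),
     G.degree(v) * (G.degree(v) - 1) // 2 - tri[v],   # wedges centered at v
     tri[v]]
    for v in G
])
corr, _ = spearmanr(signatures)                # 3x3 "graphlet correlation matrix"
print(np.round(corr, 2))
```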
We show that three basic actor characteristics, namely normalized reciprocity, three cycles, and triplets, can be expressed using a unified framework that is based on computing a similarity index between two sets associated with the actor: the set of her/his friends and the set of those considering her/him a friend. These metrics are extended to multiplex networks and then computed for two friendship networks generated by collecting data from two groups of undergraduate students. We found that in offline communication strong and weak ties are (almost) equally present, while in online communication weak ties are dominant. Moreover, weak ties are much less reciprocal than strong ties. However, across different layers of the multiplex network reciprocities are preserved, while triads (measured with normalized three cycles and triplets) are not significant.
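A minimal sketch of the per-actor set comparison: the set of people an actor names as friends (out-neighbours) is compared with the set of people who name the actor (in-neighbours). Jaccard similarity is used below as an illustrative choice of similarity index; the exact index and the friendship data in the paper may differ.

```python
# Sketch: per-actor reciprocity as a set-similarity index between the actor's
# out-neighbours (friends named) and in-neighbours (those naming the actor).
import networkx as nx

def reciprocity_index(D, v):
    out_set = set(D.successors(v))     # friends the actor names
    in_set = set(D.predecessors(v))    # those who name the actor
    union = out_set | in_set
    return len(out_set & in_set) / len(union) if union else 0.0

D = nx.DiGraph([(0, 1), (1, 0), (0, 2), (3, 0)])   # toy friendship nominations
print({v: round(reciprocity_index(D, v), 2) for v in D})
```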

Cloud-Radio Access Network (C-RAN) is characterized by a hierarchical structure in which the baseband processing functionalities of remote radio heads (RRHs) are implemented by means of cloud computing at a Central Unit (CU). A key limitation of C-RANs is given by the capacity constraints of the fronthaul links connecting RRHs to the CU. In this letter, the impact of this architectural constraint is investigated for the fundamental functions of random access and active User Equipment (UE) identification in the presence of a potentially massive number of UEs. In particular, the standard C-RAN approach based on quantize-and-forward and centralized detection is compared to a scheme based on an alternative CU-RRH functional split that enables local detection. Both techniques leverage Bayesian sparse detection. Numerical results illustrate the relative merits of the two schemes as a function of the system parameters.
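As a toy illustration of the active-UE identification task, the sketch below recovers a sparse activity vector from pilot observations at a single receiver. LASSO is used here purely as a simple stand-in for the Bayesian sparse detection the letter relies on, and all dimensions and names are illustrative assumptions.

```python
# Toy sketch of sparse active-UE detection: a few of many UEs transmit known
# pilots, and the receiver recovers the sparse activity vector from noisy
# observations. Not the letter's detector; a simple LASSO stand-in.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n_ue, n_pilot, n_active = 200, 60, 5
A = rng.standard_normal((n_pilot, n_ue)) / np.sqrt(n_pilot)   # pilot matrix
x = np.zeros(n_ue)
active = rng.choice(n_ue, n_active, replace=False)
x[active] = 1.0                                               # active UEs
y = A @ x + 0.01 * rng.standard_normal(n_pilot)               # received signal

x_hat = Lasso(alpha=0.01).fit(A, y).coef_
detected = np.flatnonzero(x_hat > 0.5)
print(sorted(active), detected)       # true vs. detected active UEs
```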

There are different approaches towards the problem of separating words from the audio signal. Some include previously obtained data or external knowledge to process the speech, and those are aided segmentation methods, while others are blind, with no pre-existing knowledge regarding linguistic properties. Speech separation is useful in many real-world applications, though it is a challenging problem.
In this paper we propose a blind method for separating the words. We filter the signal using a Butterworth filter to eliminate the noise, extract features such as short-time energy, intensity, pitch, and zero crossing rate from the audio, and employ different machine learning techniques to achieve successful separation of the speech. Separation can be formulated as a classification problem, so we train a model that determines at which points in the sequence a segment contains a word. From theoretical studies, it has been observed that energy and magnitude are high for voiced segments, whereas the zero crossing rate is low for voiced signals. Therefore, these features prove effective in separating words. Experimental results are presented in this paper to verify the theoretical studies.
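A minimal sketch of the feature-extraction step: band-pass the signal with a Butterworth filter, then compute short-time energy and zero crossing rate per frame. The cutoff frequencies, frame size, and demo signal are assumptions for illustration, not the paper's settings.

```python
# Sketch: Butterworth filtering plus per-frame short-time energy and ZCR.
import numpy as np
from scipy.signal import butter, filtfilt

def frame_features(signal, fs, frame_ms=25):
    b, a = butter(4, [300, 3400], btype="bandpass", fs=fs)    # speech band (assumed)
    clean = filtfilt(b, a, signal)
    frame = int(fs * frame_ms / 1000)
    feats = []
    for start in range(0, len(clean) - frame, frame):
        w = clean[start:start + frame]
        energy = np.sum(w ** 2) / frame                       # short-time energy
        zcr = np.mean(np.abs(np.diff(np.sign(w)))) / 2        # zero crossing rate
        feats.append((energy, zcr))
    return np.array(feats)

fs = 16000
t = np.arange(fs) / fs
demo = np.sin(2 * np.pi * 440 * t) * (t > 0.5)                # toy "word" after 0.5 s
print(frame_features(demo, fs)[:3])
```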
We have seen a huge growth in the volume of online text documents available on the Internet, in digital libraries, news sources, and company-wide intranets. It has been forecast that these documents (along with other unstructured data) will become the predominant data type stored online. Automatic text categorization, the task of assigning text documents to pre-specified classes (topics or themes) of documents, is an important task that can help people find information in these huge resources.
In this paper we present different approaches to solving this problem, compare them, and evaluate their performance in different environments.
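One common text-categorization pipeline, shown as a sketch: TF-IDF features plus a linear classifier. The paper compares several approaches; this only illustrates the task itself, on tiny made-up documents and labels.

```python
# Sketch of text categorization: TF-IDF features and a linear classifier.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

docs = ["stocks fell sharply today", "the team won the final match",
        "central bank raises rates", "player scores twice in derby"]
labels = ["finance", "sports", "finance", "sports"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(docs, labels)
print(clf.predict(["interest rates and markets", "goal in the last minute"]))
```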