This paper deals with residual life prediction methodology for main steam pipes of various boilers in a thermal power plant that had been service exposed for more than 12.5 years. Health assessment was made using destructive accelerated stress rupture and tensile tests at different temperatures, and some nondestructive tests. There was no evidence of localised damage in the form of surface cracks, cavitation or dents in the service exposed main steam pipes of any of the boilers. So far as the remaining life at 550°C is concerned, it is possible to obtain a life of greater than 100,000 h at the hoop stress level of the service exposed pipes, provided no localised damage in the form of cracks or dents has developed. It is recommended that a health check be carried out after 50,000 h of service exposure at 550°C.
Thyroid hormone insufficiency adversely affects cortical development; however, its effect on apoptosis modulation during cerebral cortex development is not understood. We investigated the effect of perinatal hypothyroidism on apoptosis and its mechanisms during rat cerebral cortex development. Primary hypothyroidism was induced by feeding methimazole (0.025% wt/vol) in the drinking water to pregnant and lactating rats and continued until the animals were killed (hypothyroid group). Cerebral cortices from pups were harvested at different postnatal ages (postnatal d 0, 8, 16, and 24 and adult), and apoptosis was quantitated by terminal deoxynucleotide transferase-mediated dUTP nick end labeling and cleaved caspase-3 immunoreactivity. Compared with the euthyroid group, primary somatosensory cortex (S1) in the hypothyroid group exhibited enhanced apoptosis. In S1 of euthyroid rats, apoptotic cells were mostly found in cortical layers I-III, and the proportion of apoptotic cells increased significantly in the hypothyroid group (P < 0.001). Most of the apoptotic cells were neurons, as assessed by double immunolabeling. Significantly increased activation of caspase-3 and -7, decreased levels of the antiapoptotic proteins Bcl-2 and Bcl-xL, and increased levels of the proapoptotic protein Bax were observed in the developing cerebral cortex of hypothyroid rats, compared with the euthyroid group (P < 0.001). In addition, hypothyroidism significantly elevated the levels of 53-kDa pronerve growth factor (P < 0.001) and p75 neurotrophin receptor (P < 0.001) and decreased TrkA expression. Taken together, we provide evidence for the possible contribution of the pronerve growth factor/p75 neurotrophin receptor pathway in hypothyroidism-enhanced apoptosis during rat cortical development. Thus, the present study may help in explaining the mechanism of the deleterious effect of thyroid hormone deficiency on cerebral cortex development in children. (Endocrinology 147: 4893-4903, 2006)
Our investigation into the front-end signal processing for maximum likelihood based speaker normalization reveals that in the linear scaling model, it is more appropriate (and evidently more correct) to assume that the spectral envelopes of any two speakers for the same sound are linearly scaled versions of one another, rather than assuming that the whole magnitude spectra (including pitch harmonics) are scaled. The use of the proposed model and its implementation results in about 4% and 7% relative improvement for adults and children respectively on a digit recognition task.
In this paper, we propose a mathematical model to describe the relation between the formant frequencies of speakers and show that with the proposed affine model, speaker differences separate out as translation factors when a "mel-like" warping is performed. Using speech data we estimate the parameters of this warping function and show that it is close to the usual mel-formula. This model is motivated by Rohit et al.'s [1] shift-based non-uniform speaker-normalization method, which provides improvement over the conventional maximum-likelihood based speaker normalization methods. We therefore provide a unified mathematical framework that connects the relationship between the formants of speakers with the method of removing speaker differences (which involves mel-warping), and which is substantiated by our recognition experiments.
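The idea that speaker differences become translations after a mel-like warping can be illustrated with a minimal sketch. It assumes one particular affine relation between two speakers' formants, (f' + c) = alpha * (f + c) with c = 700 Hz, chosen here only because it matches the constant in the standard mel formula; the abstract does not give the paper's actual parameterisation, and the formant values, `alpha` and the helper `affine_scale` are hypothetical.

```python
import math


def mel(f_hz: float) -> float:
    """Standard mel formula: 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def affine_scale(f_hz: float, alpha: float, c: float = 700.0) -> float:
    """Hypothetical affine speaker model: (f' + c) = alpha * (f + c)."""
    return alpha * (f_hz + c) - c


# Illustrative formant frequencies in Hz (not taken from the paper).
formants_a = [730.0, 1090.0, 2440.0]
alpha = 1.15  # assumed speaker-dependent factor

formants_b = [affine_scale(f, alpha) for f in formants_a]

# After mel warping, every formant is shifted by the same amount.
shifts = [mel(fb) - mel(fa) for fa, fb in zip(formants_a, formants_b)]
expected_shift = 2595.0 * math.log10(alpha)

for fa, s in zip(formants_a, shifts):
    print(f"formant {fa:7.1f} Hz -> mel shift {s:.4f}")
print(f"expected constant shift: {expected_shift:.4f}")
```

Under this assumed relation the shift is identical for every formant and depends only on `alpha`, which is what "separating out as a translation factor" means; with a different constant `c` the same holds for the warping log(f + c) rather than for the mel formula exactly.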
IEEE Transactions on Audio, Speech & Language Processing, 2006
Broadcast News (BN) transcription has been a challenging research area for many years. In the last couple of years the availability of large amounts of roughly transcribed acoustic training data and advanced model training techniques has offered the opportunity to greatly reduce the error rate on this task. This paper describes the design and performance of BN transcription systems which make use of these developments. First the effects of using lightly-supervised training data and advanced acoustic modelling techniques are discussed. The design of a real-time broadcast news recognition system is then detailed using these new models. As system combination has been found to yield large gains in performance, a range of frameworks that allow multiple recognition outputs to be combined are next described. These include the use of multiple types of acoustic models and multiple segmentations. As a contrast a system developed by multiple sites allowing cross-site combination, the "SuperEARS" system, is also described. The various models and recognition configurations are evaluated using several recent BN development and evaluation test sets. These new BN transcription systems can give gains of over 25% relative to the CU-HTK 2003 BN system.
In this paper, we present results of non-uniform vowel normalization and show that the frequency-warping necessary to do non-uniform vowel normalization is similar to the mel-scale. We compare our methods to Fant's non-uniform vowel normalization method and show that with the proposed frequency warping approach we can achieve similar performance without any knowledge of the spoken vowel and the formant number. The proposed approach is motivated by a desire to perform non-uniform speaker normalization in automatic speech recognition systems. We also present results of a more comprehensive study of our earlier work on non-uniform scaling, which again shows that the mel-scale is the appropriate warping function. All the results in this paper are based on data from the Peterson & Barney and Hillenbrand et al. vowel databases.
NON-UNIFORM SCALING BASED SPEAKER NORMALIZATION Rohit Sinha and S. Umesh Department of Electrical Engineering Indian Institute of Technology Kanpur, 208 016, INDIA {srohit, sumesh}@iitk.ac.in ABSTRACT ... [10] S. Umesh, SVBharath Kumar, MKVinay, Rajesh ...
This paper discusses the development of the CU-HTK Mandarin Broadcast News (BN) transcription system. The Mandarin BN task includes a significant amount of English data. Hence techniques have been investigated to allow the same system to handle both Mandarin and English by augmenting the Mandarin training sets with English acoustic and language model training data. A range of acoustic models were built including models based on Gaussianised features, speaker adaptive training and feature-space MPE. A multi-branch system architecture is described in which multiple acoustic model types, alternate phone sets and segmentations can be used in a system combination framework to generate the final output. The final system shows state-of-the-art performance over a range of test sets.
The majority of state-of-the-art speech recognition systems make use of system combination. The combination approaches adopted have traditionally been tuned to minimising Word Error Rates (WERs). In recent years there has been growing interest in taking the output from speech recognition systems in one language and translating it into another. This paper investigates the use of cross-site combination approaches in terms of both WER and impact on translation performance. In addition the stages involved in modifying the output from a Speech-to-Text (STT) system to be suitable for translation are described. Two source languages, Mandarin and Arabic, are recognised and then translated using a phrase-based statistical machine translation system into English. Performance of individual systems and cross-site combination using cross-adaptation and ROVER are given. Results show that the best STT combination scheme in terms of WER is not necessarily the most appropriate when translating speech.
This paper describes the development of the Cambridge University RT-04 diarisation system, including details of the new segmentation and clustering components. The final system gives a diarisation error rate of 23.9% on the RT-04 evaluation data, a 34% relative improvement over the RT-03s evaluation system. A further reduction down to 18.1% is shown to be possible when using the segmentation algorithm alone.
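For readers unfamiliar with "relative improvement", the arithmetic behind the figures above can be sketched as follows. The helper names are hypothetical, and the implied baseline is only a back-calculation from the rounded 34% quoted in the abstract, not a number reported there.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Fractional reduction in error rate relative to the baseline."""
    return (baseline - new) / baseline


def implied_baseline(new: float, rel_improvement: float) -> float:
    """Baseline error rate implied by a new rate and a relative improvement."""
    return new / (1.0 - rel_improvement)


# Figures quoted in the abstract: 23.9% DER, 34% relative improvement.
baseline_der = implied_baseline(23.9, 0.34)
print(f"implied earlier DER: about {baseline_der:.1f}%")

# Round trip: recover the relative improvement from the two error rates.
print(f"relative improvement: {relative_improvement(baseline_der, 23.9):.2%}")
```

Because the 34% is rounded, the implied earlier error rate is approximate.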
Gross Domestic Capital Formation: Capital formation means creation of physical assets and non-physical capital consisting of public health efficiency, visible and non-visible capital. Gross capital formation includes two components: (a) gross domestic fixed capital formation and (b) change in stock.
Papers by Rohit Sinha