Dr Tony Robinson

Publications

Notable achievements:

1986: First implementation of Deep Neural Networks, x_i = fn(∑_{j<i} w_ij x_j) (Speech Recognition with Associative Networks)
1987: First publication of Real Time Recurrent Learning (The utility driven dynamic error propagation network)
1991: First state-of-the-art ASR with neural networks (Several improvements to a recurrent error propagation network phone recognition system)
1992: First real time large vocabulary continuous speech recognition system (Resource Management on DSP32C and SPARCstation)
1994: Shorten - the lossless audio compressor (Wikipedia: Shorten (file format))
1996: First end-to-end training of neural nets and HMMs (Forward-backward retraining of recurrent neural networks)
1998: The time-first decoder (Time-first search for large vocabulary speech recognition)
As supervisor to MPhil students:
- First speech editor - edit audio as text
- First editor for correcting continuous speech recognition transcripts
- First automatically scrolling teleprompter (Autocue)

Full list:

Speech processing system and method. T. W. J. Ash and A. J. Robinson. Patent application PCT/GB2016/053456, 2016.
Scaling Recurrent Neural Network Language Models. Will Williams, Niranjani Prasad, David Mrva, Tom Ash and Tony Robinson. ICASSP, 2015.
Audio coding systems and methods. R. C. F. Tucker, C. W. Seymour and A. J. Robinson. Patent US6675144. January 2004.
Connectionist speech recognition of broadcast news. A. J. Robinson, G. D. Cook, D. P. W. Ellis, E. Fosler-Lussier, S. J. Renals, and D. A. G. Williams. Speech Communication, 37(1), 2002.
Adaptive model-based speech enhancement. Beth Logan and Tony Robinson. Speech Communication, 2001.
Improved language modelling through better language model evaluation measures. Philip Clarkson and Tony Robinson. Computer Speech and Language, 2001.
Indexing and retrieval of broadcast news. Steve Renals, Dave Abberley, David Kirby and Tony Robinson. Speech Communication, 2000.
The THISL SDR system at TREC-8. D. Abberley, S. Renals, D. Ellis, and T. Robinson. Eighth Text Retrieval Conference (TREC-8), 2000.
Segmentation of a speech waveform according to glottal open and closed phases using an autoregressive-HMM. Gavin Smith and Tony Robinson. ICSLP, 2000.
Speech Modelling Using Subspace and EM Techniques. Gavin Smith, João FG de Freitas, Mahesan Niranjan and Tony Robinson. NeurIPS, 2000.
Subspace techniques in speech enhancement. Gavin Smith, Mahesan Niranjan and Tony Robinson. Neural Networks in Signal Processing 9, 1999.
Recognition of sequential data using finite state sequence models organized in a tree structure. A. J. Robinson. Patent US5983180, 1999.
Modelling the singing voice. Gavin Smith, Tony Robinson, and Mahesan Niranjan. In 1999 Cambridge Music Processing Colloquium, 1999.
The THISL System for Indexing and Retrieval of Broadcast News. S. Renals, D. Abberley, D. Kirby, and T. Robinson. IEEE Workshop on Multimedia Signal Processing, 1999.
Towards improved language model evaluation measures. Philip Clarkson and Tony Robinson. EUROSPEECH, 1999.
Speech coding using mixture of Gaussians polynomial model. Parham Zolfaghari and Tony Robinson. EUROSPEECH, 1999.
Recognition, indexing and retrieval of British broadcast news with the THISL system. Tony Robinson, Dave Abberley, David Kirby, and Steve Renals. EUROSPEECH, 1999.
Compression of acoustic features - are perceptual quality and recognition performance incompatible goals? Roger Tucker, Tony Robinson, James Christie and Carl Seymour. EUROSPEECH, 1999.
Accessing Information in Spoken Audio, ESCA Tutorial and Research Workshop (ITRW). Tony Robinson and Steve Renals (eds.), Cambridge, UK, April 18-19, 1999.
The THISL broadcast news retrieval system. Dave Abberley, David Kirby, Steve Renals and Tony Robinson. ESCA workshop: Accessing information in spoken audio, 1999.
Recognition-compatible speech compression for stored speech. Roger Tucker, Tony Robinson, James Christie, and Carl Seymour. ESCA workshop: Accessing information in spoken audio, 1999.
Summarisation of spoken audio through information extraction. Robin Valenza, Tony Robinson, Marianne Hickey and Roger Tucker. ESCA workshop: Accessing information in spoken audio, 1999.
An overview of the SPRACH system for the transcription of broadcast news. Gary Cook, James Christie, Dan Ellis, Eric Fosler-Lussier, Yoshi Gotoh, Brian Kingsbury, Nelson Morgan, Steve Renals, Tony Robinson, and Gethin Williams. DARPA Broadcast News Workshop, 1999.
Retrieval of broadcast news documents with the THISL system. Dave Abberley, Steve Renals, Gary Cook and Tony Robinson. Seventh Text Retrieval Conference (TREC-7), 1999.
Speech modelling: Models, parameter estimation and its application to speech enhancement. G. A. Smith, A. J. Robinson, and M. Niranjan. CUED/F-INFENG/TR.345, CUED, 1999.
A practical perceptual frequency autoregressive HMM enhancement system. Beth Logan and Tony Robinson. ICSLP, 1998.
The applicability of adaptive language models to the broadcast news task. Philip Clarkson and Tony Robinson. ICSLP, 1998.
Real-time recognition of broadcast news. Gary Cook, Tony Robinson, and James Christie. ICSLP, 1998.
Connectionist acoustic modelling in the Abbot system. G. D. Cook, A. J. Robinson, and J. deM. Christie. Institute of Acoustics Autumn conference on Speech and Hearing, 1998.
An off-line cursive handwriting recognition system. Andrew Senior and Tony Robinson. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998.
Time-first search for large vocabulary speech recognition. Tony Robinson and James Christie. ICASSP, 1998.
Transcribing broadcast news with the 1997 Abbot system. Gary Cook and Tony Robinson. ICASSP, 1998.
Joint Prediction and Vector Quantisation. Carl Seymour and Tony Robinson. 1997.
The 1997 Abbot system for the transcription of broadcast news. G. D. Cook and A. J. Robinson. Broadcast News Transcription and Understanding workshop, 1998.
The THISL spoken document retrieval system. Dave Abberley, Steve Renals, Gary Cook, and Tony Robinson. TREC-6 Proceedings, 1998.
Low bit rate audio coder and decoder operating in a transform domain using vector quantization. A. J. Robinson. Patent US5999899. October 1997.
Multilingual large vocabulary speech recognition: the European SQALE project. S. J. Young, M. Adda-Decker, X. Aubert, C. Dugast, J.-L. Gauvain, D. J. Kershaw, L. Lamel, D. A. Leeuwen, D. Pye, A. J. Robinson, H. J. M. Steeneken, and P. C. Woodland. Computer Speech and Language, 1997.
Transcription of broadcast television and radio news: The 1996 Abbot system. G. D. Cook, D. J. Kershaw, J. D. M. Christie, and A. J. Robinson. In DARPA Speech Recognition Workshop, 1997.
Ensemble methods for connectionist acoustic modelling. G. D. Cook, S. R. Waterhouse, and A. J. Robinson. EUROSPEECH, 1997.
A low-bit-rate speech coder using adaptive line spectral frequency prediction. C. W. Seymour and A. J. Robinson. EUROSPEECH, 1997.
A segmental formant vocoder based on linearly varying mixtures of Gaussians. Parham Zolfaghari and Tony Robinson. EUROSPEECH, 1997.
Improving autoregressive hidden Markov model recognition accuracy using a non-linear frequency scale with application to speech enhancement. B. T. Logan and A. J. Robinson. EUROSPEECH, 1997.
Transcription of broadcast television and radio news: The 1996 Abbot system. G. D. Cook, D. J. Kershaw, J. D. M. Christie, C. W. Seymour and S. R. Waterhouse. ICASSP, 1997.
Language model adaptation using mixtures and an exponentially decaying cache. Philip Clarkson and Tony Robinson. ICASSP, 1997.
Enhancement and recognition of noisy speech within an autoregressive hidden Markov model framework using noise estimates from the noisy signal. B. T. Logan and A. J. Robinson. ICASSP, 1997.
A formant vocoder based on mixtures of Gaussians. Parham Zolfaghari and Tony Robinson. ICASSP, 1997.
Noise estimation for enhancement and recognition within an autoregressive hidden Markov model framework. B. T. Logan and A. J. Robinson. The Sixth Australian International Conference on Speech Science and Technology, 1996.
The 1995 ABBOT LVCSR system for multiple unknown microphones. Dan Kershaw, Tony Robinson, and Steve Renals. ICSLP, 1996.
Smoothed local adaptation of connectionist systems. Steve Waterhouse, Dan Kershaw, and Tony Robinson. ICSLP, 1996.
Boosting the performance of connectionist large vocabulary speech recognition. Gary Cook and Tony Robinson. ICSLP, 1996.
Formant analysis using mixtures of Gaussians. Parham Zolfaghari and Tony Robinson. ICSLP, 1996.
Bayesian methods for mixtures of experts. Steve Waterhouse, David MacKay and Tony Robinson. NeurIPS 8, 1996.
Constructive algorithms for hierarchical mixtures of experts. S. R. Waterhouse and A. J. Robinson. NeurIPS 8, 1996.
Context-dependent classes in a hybrid recurrent network-HMM speech recognition system. Dan Kershaw, Tony Robinson, and Mike Hochberg. NeurIPS 8, 1996.
Forward-backward retraining of recurrent neural networks. Andrew Senior and Tony Robinson. NeurIPS 8, 1996.
Real-time recognition of broadcast radio speech. G. D. Cook, J. D. Christie, P. R. Clarkson, M. M. Hochberg, B. T. Logan, A. J. Robinson, and C. W. Seymour. ICASSP, 1996.
The 1995 ABBOT hybrid connectionist-HMM large vocabulary recognition system. D. J. Kershaw, A. J. Robinson, and S. J. Renals. Speech Recognition Workshop. Morgan Kaufmann, February 1996. ISBN 1-55860-422-7.
The use of recurrent networks in continuous speech recognition. Tony Robinson, Mike Hochberg, and Steve Renals. In Chin-Hui Lee, Frank K. Soong, and Kuldip K. Paliwal, editors, Automatic Speech and Speaker Recognition: Advanced Topics, Chapter 10. Kluwer Academic Publishers, 1996.
Context-dependent modelling in the ABBOT LVCSR system. D. J. Kershaw, M. M. Hochberg, and A. J. Robinson. IEEE ASR workshop at Snowbird, 1995.
Context-dependent classes in a hybrid recurrent network-HMM speech recognition system. D. J. Kershaw, M. M. Hochberg, and A. J. Robinson. CUED/F-INFENG/TR.217, CUED, 1995.
Utterance clustering for large vocabulary speech recognition. G. D. Cook and A. J. Robinson. EUROSPEECH, 1995.
Speaker-adaptation for hybrid HMM-ANN continuous speech recognition system. João Neto, Luis Almeida, Mike Hochberg, Ciro Martins, Luis Nunes, Steve Renals, and Tony Robinson. EUROSPEECH, 1995.
Pruning and growing hierarchical mixtures of experts. Steven R. Waterhouse and A. J. Robinson. IEE conference on Artificial Neural Networks, 1995.
Training MLPs via the expectation-maximisation algorithm. Gary Cook and Tony Robinson. IEE conference on Artificial Neural Networks, 1995.
WSJCAM0: A British English speech corpus for large vocabulary continuous speech recognition. Tony Robinson, Jeroen Fransen, David Pye, Jonathan Foote, and Steve Renals. ICASSP, 1995.
The 1994 Abbot hybrid connectionist/HMM large-vocabulary recognition system. M. M. Hochberg, G. D. Cook, S. J. Renals, A. J. Robinson, and R. T. Schechtman. ARPA Spoken Language Systems, 1995.
Recent improvements to the Abbot large vocabulary CSR system. M. M. Hochberg, S. J. Renals, A. J. Robinson and G. D. Cook. ICASSP, 1995.
Non-linear prediction of acoustic vectors using hierarchical mixtures of experts. S. R. Waterhouse and A. J. Robinson. NeurIPS 7, 1994.
Spiral: A vibrotactile based speech listening aid. E. M. Ellis and A. J. Robinson. Third International Conference on Tactile Aids, Hearing Aids and Cochlear Implants, 1994.
ABBOT: the CUED hybrid connectionist-HMM large vocabulary recognition system. M. M. Hochberg, S. J. Renals, and A. J. Robinson. SLS Workshop, 1994.
SHORTEN: Simple lossless and near-lossless waveform compression. Tony Robinson. CUED/F-INFENG/TR.156, CUED, 1994.
Connectionist model combination for large vocabulary speech recognition. Mike Hochberg, Gary Cook, Steve Renals, and Tony Robinson. IEEE Workshop on Neural Networks for Signal Processing, 1994.
Large vocabulary continuous speech recognition using a hybrid connectionist-HMM system. Mike Hochberg, Tony Robinson, Steve Renals and Dan Kershaw. ICSLP, 1994.
Classification using hierarchical mixtures of experts. S. R. Waterhouse and A. J. Robinson. IEEE Workshop on Neural Networks for Signal Processing, 1994.
WSJCAM0 corpus and recording description. Jeroen Fransen, Dave Pye, Tony Robinson, Phil Woodland, and Steve Young. CUED/F-INFENG/TR.192, CUED, 1994.
The development of file formats for very large speech corpora: Sphere and shorten. John Garofolo, Tony Robinson, and Jonathan Fiscus. ICASSP, 1994.
IPA: Improved phone modelling with recurrent neural networks. Tony Robinson, Mike Hochberg, and Steve Renals. ICASSP, 1994.
The application of recurrent nets to phone probability estimation. Tony Robinson. IEEE Transactions on Neural Networks, 1994.
Learning temporal dependencies in connectionist speech recognition. Steve Renals, Mike Hochberg and Tony Robinson. NeurIPS, 1994.
A neural network based, speaker independent, large vocabulary, continuous speech recognition system: The WERNICKE project. Tony Robinson, Luis Almeida, Jean-Marc Boite, Herve Bourlard, Frank Fallside, Mike Hochberg, Dan Kershaw, Phil Kohn, Yochai Konig, Steve Renals, Marco Saerens, João Paulo Neto, Nelson Morgan, and Chuck Wooters. EUROSPEECH, 1993.
Speech synthesis using artificial neural networks trained on cepstral coefficients. Christine Tuerk and Tony Robinson. EUROSPEECH, 1993.
A new frequency shift function for reducing inter-speaker variance. Christine Tuerk and Tony Robinson. EUROSPEECH, 1993.
A tactile system for speech listening based on phonetic representation. E. M. Ellis and A. J. Robinson. ESCA Conference on Speech and Language Technology for Disabled Persons, 1993.
A phonetic tactile speech listening system. E. M. Ellis and A. J. Robinson. CUED/F-INFENG/TR.122, CUED, 1993.
Application of an auditory model to the computer simulation of hearing impairment: Preliminary results. C. Giguère, P. C. Woodland, and A. J. Robinson. Canadian Acoustics, 1993.
Artificial Neural Networks: The mole-grips of the speech scientist. Tony Robinson. Visual Representations of Speech Signals, John Wiley and Sons, 1993.
The state space and "ideal input" representations of recurrent networks. Tony Robinson. Visual Representations of Speech Signals, John Wiley and Sons, 1993.
Recurrent nets for phone probability estimation. Tony Robinson. ARPA Continuous Speech Recognition Workshop, 1992.
Practical network design and implementation. Tony Robinson. Cambridge Neural Network Summer School, 1992.
A real-time recurrent error propagation network word recognition system. Tony Robinson. ICASSP, 1992.
Response time as a metric for comparison of speech recognition by humans and machines. Anne Cutler and Tony Robinson. ICSLP, 1992.
A multiple-speaker phoneme durational model. Christine Tuerk and Tony Robinson. Institute of Acoustics Autumn Conference on Speech and Hearing, 1992.
Two dimensional representation of phonemes of the English language. Errol M. Ellis and Tony Robinson. Institute of Acoustics Autumn Conference on Speech and Hearing, 1992.
Several improvements to a recurrent error propagation network phone recognition system. Tony Robinson. CUED/F-INFENG/TR.82, CUED, 1991.
The development of a connectionist multiple-voice text-to-speech system. Christine Tuerk, Peter Monaco, and Tony Robinson. ICASSP, 1991.
Lexical access using a recurrent error propagation network. N. H. Russell, F. Fallside, A. J. Robinson and R. W. Prager. EUROSPEECH, 1991.
Recognition of continuous speech using recurrent error propagation networks. Tony Robinson. Proceedings of Voice Systems Worldwide, 1991.
A recurrent error propagation network speech recognition system. Tony Robinson and Frank Fallside. Computer Speech and Language, 1991.
Word recognition from the DARPA resource management database with the Cambridge recurrent error propagation network speech recognition system. Tony Robinson and Frank Fallside. Third Australian International Conference on Speech Science and Technology, 1990.
A comparison of preprocessors for the Cambridge recurrent error propagation network speech recognition system. Tony Robinson, John Holdsworth, Roy Patterson, and Frank Fallside. ICSLP, 1990.
Continuous speech recognition for the TIMIT database using neural networks. F. Fallside, H. Lucke, T. P. Marsland, P. J. O'Shea, M. St. J. Owen, R. W. Prager, A. J. Robinson, and N. H. Russell. ICASSP, 1990.
Phoneme recognition from the TIMIT database using recurrent error propagation networks. Tony Robinson and Frank Fallside. CUED/F-INFENG/TR.42, CUED, 1990.
Dynamic reinforcement driven error propagation networks with application to game playing. Tony Robinson and Frank Fallside. Eleventh Annual Conference of the Cognitive Science Society, 1989.
Dynamic Error Propagation Networks. A. J. Robinson. PhD thesis, CUED, 1989.
A dynamic connectionist model for phoneme recognition. A. J. Robinson and F. Fallside. Neural Networks from Models to Applications: Proceedings of nEuro'88, 1989.
Generalising the nodes of the error propagation network. A. J. Robinson, M. Niranjan, and F. Fallside. CUED/F-INFENG/TR.25, CUED, 1988.
Static and dynamic error propagation networks with application to speech coding. A. J. Robinson and F. Fallside. NeurIPS, 1988.
A comparison of three connectionist models for phoneme recognition in continuous speech. F. Fallside, T. D. Harrison, R. W. Prager, and A. J. Robinson. ATR Workshop on Neural Networks and Parallel Distributed Processing, 1988.
A dynamic connectionist model for phoneme recognition: Preliminary results. A. J. Robinson and F. Fallside. CUED/F-INFENG/TR.14, CUED, 1988.
The utility driven dynamic error propagation network. A. J. Robinson and F. Fallside. CUED/F-INFENG/TR.1, CUED, 1987.
Speech Recognition with Associative Networks. Tony Robinson. MPhil thesis, CUED, 1986.

Abbreviation	Publisher
CUED	Cambridge University Engineering Department
EUROSPEECH	European Conference on Speech Technology
ICASSP	International Conference on Acoustics, Speech and Signal Processing
ICSLP	International Conference on Spoken Language Processing
NeurIPS	Neural Information Processing Systems

Email me if you want a copy of something that doesn't yet have a PDF.