Jun 17, 2019
Ramabhadran, Bhuvana; Gustman, Samuel; Byrne, William; Hajič, Jan; Oard, Douglas; Olsson, J. Scott; Picheny, Michael; Psutka, Josef, 2019, "USC-SFI MALACH Interviews and Transcripts English – Speech Recognition Edition", https://hdl.handle.net/11272.1/AB2/SGOMWO, Abacus Data Network, V1
USC-SFI MALACH Interviews and Transcripts English – Speech Recognition Edition, LDC Catalog Number LDC2019S11 and ISBN 1-58563-889-7, was developed by IBM as part of the MALACH (Multilingual Access to Large Spoken ArCHives) Project. This edition augments USC-SFI MALACH Interviews... |
May 15, 2019
Mena, Carlos Daniel Hernández, 2019, "CIEMPIESS Experimentation", https://hdl.handle.net/11272.1/AB2/DUUYQV, Abacus Data Network, V1
CIEMPIESS (Corpus de Investigación en Español de México del Posgrado de Ingeniería Eléctrica y Servicio Social) Experimentation was developed by the social service program "Desarrollo de Tecnologías del Habla" of the "Facultad de Ingeniería" (FI) at the National Autonomous Univer... |
May 15, 2019
Ellis, Joe; Getman, Jeremy; Strassel, Stephanie, 2019, "TAC KBP Chinese Regular Slot Filling - Comprehensive Training and Evaluation Data 2014", https://hdl.handle.net/11272.1/AB2/ZZMOPP, Abacus Data Network, V1
TAC KBP Chinese Regular Slot Filling - Comprehensive Training and Evaluation Data 2014 was developed by the Linguistic Data Consortium (LDC) and contains training and evaluation data produced in support of the TAC KBP Chinese Regular Slot Filling evaluation track conducted in 201... |
May 15, 2019
Jones, Karen; Graff, David; Walker, Kevin; Strassel, Stephanie, 2019, "Multi-Language Conversational Telephone Speech 2011 -- English Group", https://hdl.handle.net/11272.1/AB2/ACDWDL, Abacus Data Network, V1
Multi-Language Conversational Telephone Speech 2011 – English Group was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 18 hours of telephone speech in two general varieties of English: American and South Asian. The data were collected primaril... |
Apr 15, 2019
Li, Xuansong; Peterson, Katherine; Grimes, Stephen; Strassel, Stephanie, 2019, "BOLT Egyptian-English Word Alignment -- Discussion Forum Training", https://hdl.handle.net/11272.1/AB2/AR1QCS, Abacus Data Network, V1
BOLT Egyptian-English Word Alignment – Discussion Forum Training was developed by the Linguistic Data Consortium (LDC) and consists of 400,448 words of Egyptian Arabic and English parallel text enhanced with linguistic tags to indicate word relations. The DARPA BOLT (Broad Operat... |
Apr 15, 2019
Li, Bin; Wen, Yuan; Song, Li; Dai, Rubing; Qu, Weiguang; Xue, Nianwen, 2019, "Chinese Abstract Meaning Representation 1.0", https://hdl.handle.net/11272.1/AB2/TT5KRI, Abacus Data Network, V1
Chinese Abstract Meaning Representation was developed by Brandeis University and Nanjing Normal University and is comprised of semantic representations of a set of Chinese sentences from Chinese Treebank 8.0 (LDC2013T21). Abstract Meaning Representation (AMR) captures "who is doi... |
Mar 15, 2019
Prasad, Rashmi; Webber, Bonnie; Lee, Alan; Joshi, Aravind, 2019, "Penn Discourse Treebank Version 3.0", https://hdl.handle.net/11272.1/AB2/SUU9CB, Abacus Data Network, V1
Penn Discourse Treebank (PDTB) Version 3.0 is the third release in the Penn Discourse Treebank project, the goal of which is to annotate the Wall Street Journal (WSJ) section of Treebank-2 (LDC95T7) with discourse relations. Penn Discourse Treebank Version 2 (LDC2008T05) contains... |
Mar 15, 2019
Canavan, Alexandra; Zipperlen, George; Bartlett, John, 2019, "CALLFRIEND Egyptian Arabic Second Edition", https://hdl.handle.net/11272.1/AB2/4LCUFC, Abacus Data Network, V1
CALLFRIEND Egyptian Arabic Second Edition was developed by the Linguistic Data Consortium (LDC) and consists of approximately 25 hours of unscripted telephone conversations between native speakers of Egyptian Arabic. This second edition updates the audio files to wav format, simp... |
Mar 15, 2019
Tracey, Jennifer; Strassel, Stephanie; Kuster, Neil, 2019, "VAST Chinese Speech and Transcripts", https://hdl.handle.net/11272.1/AB2/OE8XTX, Abacus Data Network, V1
VAST Chinese Speech and Transcripts was developed by the Linguistic Data Consortium (LDC) for the VAST (Video Annotation for Speech Technologies) project and is comprised of approximately 29 hours of Mandarin Chinese audio extracted from amateur video content harvested from the w... |
Feb 15, 2019
Tracey, Jennifer; Arrigo, Michael; Kuster, Neil; Strassel, Stephanie, 2019, "DEFT Chinese Committed Belief Annotation", https://hdl.handle.net/11272.1/AB2/EGZOQ9, Abacus Data Network, V1
DEFT Chinese Committed Belief Annotation was developed by the Linguistic Data Consortium (LDC) and consists of approximately 83,000 tokens of Chinese discussion forum text annotated for “committed belief,” which marks the level of commitment displayed by the author to the truth o... |
Feb 15, 2019
Upadhyay, Shyam; Hakkani-Tur, Dilek; Tur, Gokhan; Rastogi, Abhinav, 2019, "Multilingual ATIS", https://hdl.handle.net/11272.1/AB2/AGMWIU, Abacus Data Network, V1
Multilingual ATIS was developed by Google Inc. and consists of 5,871 utterances from ATIS2 (LDC93S5), ATIS3 Training Data (LDC94S19), and ATIS3 Test Data (LDC95S26) annotated and translated into Hindi and Turkish. The ATIS (Air Travel Information Services) collection was develope... |
Feb 15, 2019
Jones, Karen; Graff, David; Walker, Kevin; Strassel, Stephanie, 2019, "Multi-Language Conversational Telephone Speech 2011 -- Arabic Group", https://hdl.handle.net/11272.1/AB2/A5UT97, Abacus Data Network, V1
Multi-Language Conversational Telephone Speech 2011 – Arabic Group was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 117 hours of telephone speech in distinct dialects of colloquial Arabic: Iraqi, Levantine and Maghrebi. The data were collect... |
Jan 15, 2019
Richey, Colleen; D'Angelo, Cynthia; Alozie, Nonye; Bratt, Harry; Shriberg, Elizabeth, 2019, "SRI Speech-Based Collaborative Learning Corpus", https://hdl.handle.net/11272.1/AB2/YJWBEU, Abacus Data Network, V1
SRI Speech-Based Collaborative Learning Corpus was developed by SRI International and is comprised of approximately 120 hours of English speech from 134 US middle school students working collaboratively. The data set also contains orthographic transcriptions, manual annotation of... |
Jan 15, 2019
Ellis, Joe; Getman, Jeremy; Strassel, Stephanie, 2019, "TAC KBP Entity Discovery and Linking - Comprehensive Training and Evaluation Data 2014-2015", https://hdl.handle.net/11272.1/AB2/LCPM63, Abacus Data Network, V1
TAC KBP Entity Discovery and Linking - Comprehensive Training and Evaluation Data 2014-2015 was developed by the Linguistic Data Consortium (LDC) and contains training and evaluation data produced in support of the TAC KBP Entity Discovery and Linking (EDL) tasks in 2014 and 2015... |
Jan 15, 2019
Song, Zhiyi; Tracey, Jennifer; Walker, Christopher; Strassel, Stephanie, 2019, "BOLT Arabic Discussion Forum Parallel Training Data", https://hdl.handle.net/11272.1/AB2/CZR6SG, Abacus Data Network, V1
BOLT Arabic Discussion Forum Parallel Training Data was developed by the Linguistic Data Consortium (LDC) and consists of 1,169,599 tokens of Egyptian Arabic discussion forum data collected for the DARPA BOLT program along with their corresponding English translations. The BOLT (... |
Dec 17, 2018
Linguistic Data Consortium, 2018, "HUB5 Mandarin Telephone Speech and Transcripts Second Edition", https://hdl.handle.net/11272.1/AB2/2JAJJE, Abacus Data Network, V1
HUB5 Mandarin Telephone Speech and Transcripts Second Edition was developed by the Linguistic Data Consortium (LDC) in support of US government projects for language recognition and Large Vocabulary Conversational Speech Recognition (LVCSR). The first edition was released by LDC... |
Dec 15, 2018
Zhong, Victor; Zhang, Yuhao; Chen, Danqi; Angeli, Gabor; Manning, Christopher, 2018, "TAC Relation Extraction Dataset", https://hdl.handle.net/11272.1/AB2/SOYGGB, Abacus Data Network, V1
TAC Relation Extraction Dataset (TACRED) was developed by The Stanford NLP Group and is a large-scale relation extraction dataset with 106,264 examples built over English newswire and web text used in the NIST TAC KBP English slot filling evaluations during the period 2009-2014.... |
Nov 15, 2018
Bills, Aric; Conners, Thomas; David, Anne; Dubinski, Eyal; Fiscus, Jonathan G.; Hammond, Simon; Harper, Mary; Kaiser-Schatzlein, Alice; Melot, Jennifer; Paget, Shelley; Ray, Jessica; Rytting, Anton; Shen, Sinney; Shen, Wade; Silber, Ronnie; Tzoukermann, Evelyne, 2018, "IARPA Babel Telugu Language Pack IARPA-babel303b-v1.0a", https://hdl.handle.net/11272.1/AB2/OTDPUV, Abacus Data Network, V1
IARPA Babel Telugu Language Pack IARPA-babel303b-v1.0a was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 201 hours of Telugu conversational and scripted telephone speech collected in 2013... |
Nov 15, 2018
Maamouri, Mohamed; Bies, Ann; Kulick, Seth; Krouna, Sondos; Tabassi, Dalila; Ciul, Michael, 2018, "BOLT Egyptian Arabic Treebank - Discussion Forum", https://hdl.handle.net/11272.1/AB2/CAA0JW, Abacus Data Network, V1
BOLT Egyptian Arabic Treebank – Discussion Forum was developed by the Linguistic Data Consortium (LDC) and consists of Egyptian Arabic web discussion forum data with part-of-speech annotation, morphology, gloss and syntactic tree annotation. The DARPA BOLT (Broad Operational Lang... |
Nov 15, 2018
Maciel, Alexandre M. A.; Rodrigues, Rodrigo L.; Barbosa, Danilo S., 2018, "Avatar Education Portuguese", https://hdl.handle.net/11272.1/AB2/BSQ4NP, Abacus Data Network, V1
Avatar Education Portuguese was developed by the University of Pernambuco and consists of approximately 80 minutes of Brazilian Portuguese microphone speech with phonetic and orthographic transcriptions. The data was developed for Avatar Education, an animated virtual assistant d... |
Oct 15, 2018
Ellis, Joe; Getman, Jeremy; Strassel, Stephanie, 2018, "TAC KBP English Regular Slot Filling - Comprehensive Training and Evaluation Data 2009-2014", https://hdl.handle.net/11272.1/AB2/B3R0J4, Abacus Data Network, V1
TAC KBP English Regular Slot Filling - Comprehensive Training and Evaluation Data 2009-2014 was developed by the Linguistic Data Consortium (LDC) and contains training and evaluation data produced in support of the TAC KBP Slot Filling evaluation track conducted from 2009 to 2014... |
Sep 17, 2018
Bills, Aric; Conners, Thomas; David, Anne; Dubinski, Eyal; Fiscus, Jonathan G.; Harper, Mary; Hefright, Brook; Kozlov, Kirill; Melot, Jennifer; Ray, Jessica; Rytting, Anton; Phillips, Josh; Walter, Marle; Shen, Wade; Silber, Ronnie; Tzoukermann, Evelyne, 2018, "IARPA Babel Kazakh Language Pack IARPA-babel302b-v1.0a", https://hdl.handle.net/11272.1/AB2/KGA4ZX, Abacus Data Network, V1
IARPA Babel Kazakh Language Pack IARPA-babel302b-v1.0a was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 203 hours of Kazakh conversational and scripted telephone speech collected in 2013... |
Sep 17, 2018
Morris, Amanda; Strassel, Stephanie; Li, Xuansong; Antonishek, Brian; Fiscus, Jonathan G., 2018, "HAVIC MED Event E051-E060 -- Videos, Metadata and Annotation", https://hdl.handle.net/11272.1/AB2/XNNWD1, Abacus Data Network, V1
HAVIC MED Event E051-E060 – Videos, Metadata and Annotation was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 53 hours of user-generated videos with annotation and metadata. To advance multimodal event detection and related techn... |
Sep 17, 2018
Jones, Karen; Graff, David; Walker, Kevin; Strassel, Stephanie, 2018, "Multi-Language Conversational Telephone Speech 2011 -- Spanish", https://hdl.handle.net/11272.1/AB2/9Q4DIQ, Abacus Data Network, V1
Multi-Language Conversational Telephone Speech 2011 – Spanish was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 23 hours of telephone speech in Spanish. The data were collected primarily to support research and technology evaluat... |
Sep 17, 2018
Griffitt, Kira; Strassel, Stephanie, 2018, "BOLT Information Retrieval Comprehensive Training and Evaluation", https://hdl.handle.net/11272.1/AB2/EDRQLG, Abacus Data Network, V1
BOLT Information Retrieval Comprehensive Training and Evaluation was developed by the Linguistic Data Consortium (LDC) and consists of all data produced in support of the Information Retrieval (IR) task within the DARPA Broad Operational Language Translation (BOLT) P... |
Aug 15, 2018
Hernández Mena, Carlos Daniel, 2018, "CIEMPIESS Balance", https://hdl.handle.net/11272.1/AB2/JWRYUR, Abacus Data Network, V1
CIEMPIESS (Corpus de Investigación en Español de México del Posgrado de Ingeniería Eléctrica y Servicio Social) Balance was developed by the Development of Speech Technologies program at the School of Engineering at the National Autonomous University of Mexico (UNAM) and consists... |
Aug 15, 2018
Greenberg, Craig; Martin, Alvin; Graff, David; Walker, Kevin; Jones, Karen; Strassel, Stephanie, 2018, "2011 NIST Language Recognition Evaluation Test Set", https://hdl.handle.net/11272.1/AB2/0ZCWPS, Abacus Data Network, V1
2011 NIST Language Recognition Evaluation Test Set contains selected training data and the evaluation test set for the 2011 NIST Language Recognition Evaluation. It consists of approximately 204 hours of conversational telephone speech and broadcast audio collected by the Linguis... |
Aug 15, 2018
Song, Zhiyi; Fore, Dana; Strassel, Stephanie; Lee, Haejoong; Wright, Jonathan, 2018, "BOLT English SMS/Chat", https://hdl.handle.net/11272.1/AB2/RNIGFD, Abacus Data Network, V1
BOLT English SMS/Chat was developed by the Linguistic Data Consortium (LDC) and consists of naturally-occurring Short Message Service (SMS) and Chat (CHT) data collected through data donations and live collection involving native speakers of English. The corpus contains 18,429 co... |
Jul 18, 2018
Bills, Aric; Conners, Thomas; Corris, Miriam; David, Anne; Dubinski, Eyal; Fiscus, Jonathan G.; Harper, Mary; Kaiser-Schatzlein, Alice; Melot, Jennifer; Paget, Shelley; Ray, Jessica; Rytting, Anton; Shen, Wade; Silber, Ronnie; Tzoukermann, Evelyne; Viswanath, Arun, 2018, "IARPA Babel Tamil Language Pack IARPA-babel204b-v1.1b", https://hdl.handle.net/11272.1/AB2/8245NT, Abacus Data Network, V1
IARPA Babel Tamil Language Pack IARPA-babel204b-v1.1b was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 350 hours of Tamil conversational and scripted telephone speech collected in 2012 an... |
Jul 16, 2018
Linguistic Data Consortium, 2018, "CALLFRIEND Mandarin Chinese-Mainland Dialect Second Edition", https://hdl.handle.net/11272.1/AB2/88OSWL, Abacus Data Network, V1
CALLFRIEND Mandarin Chinese-Mainland Dialect Second Edition was developed by the Linguistic Data Consortium (LDC) and consists of approximately 24 hours of unscripted telephone conversations between native speakers of the Mandarin Chinese dialect spoken in mainland China. This se... |
Jul 15, 2018
Linguistic Data Consortium, 2018, "RATS Language Identification", https://hdl.handle.net/11272.1/AB2/UP3WJC, Abacus Data Network, V1
RATS Language Identification was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 5,400 hours of Levantine Arabic, Farsi, Dari, Pashto and Urdu conversational telephone speech with annotation of speech segments. The corpus was created to provide... |
Jun 15, 2018
Ellis, Joe; Getman, Jeremy; Strassel, Stephanie, 2018, "TAC KBP English Entity Linking - Comprehensive Training and Evaluation Data 2009-2013", https://hdl.handle.net/11272.1/AB2/SRPNPS, Abacus Data Network, V1
TAC KBP English Entity Linking - Comprehensive Training and Evaluation Data 2009-2013 was developed by the Linguistic Data Consortium (LDC) and contains training and evaluation data produced in support of the TAC KBP English Entity Linking tasks in 2009, 2010, 2011, 2012, and 201... |
Jun 15, 2018
Song, Zhiyi; Fore, Dana; Strassel, Stephanie; Lee, Haejoong; Wright, Jonathan, 2018, "BOLT Chinese SMS/Chat", https://hdl.handle.net/11272.1/AB2/MMNPUR, Abacus Data Network, V1
BOLT Chinese SMS/Chat was developed by the Linguistic Data Consortium (LDC) and consists of naturally-occurring Short Message Service (SMS) and Chat (CHT) data collected through data donations and live collection involving native speakers of Chinese. The corpus contains 14,877 co... |
Jun 15, 2018
Jones, Karen; Graff, David; Walker, Kevin; Strassel, Stephanie, 2018, "Multi-Language Conversational Telephone Speech 2011 -- Central European", https://hdl.handle.net/11272.1/AB2/Y1F6XQ, Abacus Data Network, V1
Multi-Language Conversational Telephone Speech 2011 – Central European was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 44 hours of telephone speech in two distinct language varieties of Central Europe: Czech and Slovak. The data were collec... |
May 15, 2018
Dilley, Laura C.; Breen, Mara; Brown, Meredith; Gibson, Edward, 2018, "Rhythm and Pitch", https://hdl.handle.net/11272.1/AB2/JDLPMX, Abacus Data Network, V1
Rhythm and Pitch contains approximately 27 minutes of spontaneous English conversations and radio news stories annotated with the Rhythm and Pitch (RaP) scheme. Speech data for annotation was taken from two corpora released by LDC, CALLHOME American English Speech (LDC97S42) and... |
May 15, 2018
Glenn, Meghan; Lee, Haejoong; Strassel, Stephanie; Maeda, Kazuaki, 2018, "GALE Phase 4 Arabic Broadcast News Transcripts", https://hdl.handle.net/11272.1/AB2/DN3EXL, Abacus Data Network, V1
GALE Phase 4 Arabic Broadcast News Transcripts was developed by the Linguistic Data Consortium (LDC) and contains transcriptions of approximately 37 hours of Arabic broadcast news speech collected in 2008 and 2009 by the Linguistic Data Consortium (LDC), MediaNet, Tunis, Tunisia... |
May 15, 2018
Walker, Kevin; Caruso, Christopher; Maeda, Kazuaki; DiPersio, Denise; Strassel, Stephanie, 2018, "GALE Phase 4 Arabic Broadcast News Speech", https://hdl.handle.net/11272.1/AB2/ODSQZW, Abacus Data Network, V1
GALE Phase 4 Arabic Broadcast News Speech was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 37 hours of Arabic broadcast news speech collected in 2008 and 2009 by LDC and MediaNet, Tunis, Tunisia and MTC, Rabat, Morocco during Phase 4 of the... |
Apr 16, 2018
Berkling, Kay, 2018, "H2, E2, ERK1 Children's Writing", https://hdl.handle.net/11272.1/AB2/7GXGKW, Abacus Data Network, V1
H2, E2, ERK1 Children’s Writing was developed by the Cooperative State University Baden-Württemberg, University of Education. It consists of approximately 2,000 texts written over four months by 173 German school children aged six through eleven. The data in thi... |
Apr 16, 2018
Linguistic Data Consortium, 2018, "TRAD Arabic-French Parallel Text -- Newsgroup", https://hdl.handle.net/11272.1/AB2/0DET8M, Abacus Data Network, V1
TRAD Arabic-French Parallel Text – Newsgroup was developed by ELDA as part of the PEA-TRAD project. It contains French translations of a subset of approximately 10,000 Arabic words from GALE Phase 1 Arabic Newsgroup Parallel Text - Part 1 (LDC2009T03). The PEA-TRAD p... |
Mar 15, 2018
Arase, Yuki; Tsujii, Junichi, 2018, "SPADE", https://hdl.handle.net/11272.1/AB2/V6GR5J, Abacus Data Network, V1
SPADE (Syntactic Phrase Alignment Dataset for Evaluation) consists of annotated parse trees and alignment on English sentential paraphrases extracted from machine translation evaluation corpora and separated into development and test sets. Reference translations from machine tran... |
Mar 15, 2018
Tracey, Jennifer; Graff, David; Strassel, Stephanie; Ma, Xiaoyi; Wright, Jonathan, 2018, "LORELEI Somali Representative Language Pack - Monolingual and Parallel Text", https://hdl.handle.net/11272.1/AB2/75GGBX, Abacus Data Network, V1
LORELEI Somali Representative Language Pack - Monolingual and Parallel Text was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 13 million words of monolingual Somali text, approximately 800,000 of which are translated into English. Another 100... |
Feb 16, 2018
Ellis, Joe; Getman, Jeremy; Graff, David; Strassel, Stephanie, 2018, "TAC KBP Comprehensive English Source Corpora 2009-2014", https://hdl.handle.net/11272.1/AB2/VC89SM, Abacus Data Network, V1
TAC KBP Comprehensive English Source Corpora 2009-2014 was developed by the Linguistic Data Consortium (LDC) and contains the 3,877,207 English source documents used in support of the TAC KBP tasks from 2009-2014. Text Analysis Conference (TAC) is a series of worksho... |
Feb 16, 2018
Tracey, Jennifer; Graff, David; Strassel, Stephanie; Ma, Xiaoyi; Wright, Jonathan, 2018, "LORELEI Amharic Representative Language Pack - Monolingual and Parallel Text", https://hdl.handle.net/11272.1/AB2/5TNZPX, Abacus Data Network, V1
LORELEI Amharic Representative Language Pack - Monolingual and Parallel Text was developed by the Linguistic Data Consortium and is comprised of approximately 25 million words of monolingual Amharic text, approximately 600,000 of which are translated into English. An... |
Feb 16, 2018
Jones, Karen; Graff, David; Walker, Kevin; Strassel, Stephanie, 2018, "Multi-Language Conversational Telephone Speech 2011 -- Central Asian", https://hdl.handle.net/11272.1/AB2/YW9PX3, Abacus Data Network, V1
Multi-Language Conversational Telephone Speech 2011 – Central Asian was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 37 hours of telephone speech in three distinct language varieties of Central Asia: Dari, Farsi and Pashto. The... |
Jan 16, 2018
Ravanelli, Mirco; Cristoforetti, Luca; Omologo, Maurizio, 2018, "DIRHA English WSJ Audio", https://hdl.handle.net/11272.1/AB2/8WSEVY, Abacus Data Network, V1
DIRHA English WSJ Audio was developed as part of the Distant-Speech Interaction for Robust Home Applications (DIRHA) Project which addressed natural spontaneous speech interaction with distant microphones in a domestic environment. It is comprised of approximately 85... |
Jan 16, 2018
Taulé, Mariona; Martí, Maria Antonia; Bies, Ann; Garí, Aina; Nofre, Montserrat; Song, Zhiyi; Strassel, Stephanie; Ellis, Joe, 2018, "DEFT Spanish Treebank", https://hdl.handle.net/11272.1/AB2/Z3OEWX, Abacus Data Network, V1
DEFT Spanish Treebank was developed by the Linguistic Data Consortium (LDC) and the Language and Computation Center (CLiC), University of Barcelona. It contains treebank annotation of international Spanish newswire text and Latin American Spanish discussion forum dat... |
Jan 16, 2018
Linguistic Data Consortium; ELDA, 2018, "TRAD Chinese-French Parallel Text -- Blog", https://hdl.handle.net/11272.1/AB2/ATYE6I, Abacus Data Network, V1
TRAD Chinese-French Parallel Text – Blog was developed by ELDA as part of the PEA-TRAD project. It contains French translations of a subset of approximately 10,000 Chinese words from GALE Phase 1 Chinese Blog Parallel Text (LDC2008T06). The PEA-TRAD project (Translat... |
Dec 15, 2017
Glenn, Meghan; Lee, Haejoong; Strassel, Stephanie; Maeda, Kazuaki, 2017, "GALE Phase 4 Chinese Broadcast News Transcripts", https://hdl.handle.net/11272.1/AB2/KTVMHA, Abacus Data Network, V1
GALE Phase 4 Chinese Broadcast News Transcripts was developed by the Linguistic Data Consortium (LDC) and contains transcriptions of approximately 134 hours of Chinese broadcast news speech collected in 2008 by LDC and Hong Kong University of Science and Technology (HKUST... |
Dec 15, 2017
Walker, Kevin; Caruso, Christopher; Maeda, Kazuaki; DiPersio, Denise; Strassel, Stephanie, 2017, "GALE Phase 4 Chinese Broadcast News Speech", https://hdl.handle.net/11272.1/AB2/4ADDAM, Abacus Data Network, V1
GALE Phase 4 Chinese Broadcast News Speech was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 134 hours of Mandarin Chinese broadcast news speech collected in 2008 by LDC and Hong Kong University of Science and Technology (HKUST), Hong... |
Nov 17, 2017
Mena, Carlos Daniel Hernández; Herrera, Abel, 2017, "CIEMPIESS Light", https://hdl.handle.net/11272.1/AB2/JXHBRG, Abacus Data Network, V1
CIEMPIESS (Corpus de Investigación en Español de México del Posgrado de Ingeniería Eléctrica y Servicio Social) Light was developed by the Speech Processing Laboratory of the Faculty of Engineering at the National Autonomous University of Mexico (UNAM) and consists of approximate... |