Nov 17, 2017
Ellis, Joe; Getman, Jeremy; Strassel, Stephanie, 2017, "TAC KBP Chinese Cross-lingual Entity Linking - Comprehensive Training and Evaluation Data 2011-2014", https://hdl.handle.net/11272.1/AB2/XOE0NF, Abacus Data Network, V1
TAC KBP Chinese Cross-lingual Entity Linking - Comprehensive Training and Evaluation Data 2011-2014 was developed by the Linguistic Data Consortium and contains training and evaluation data produced in support of the TAC KBP Chinese Cross-lingual Entity Linking tasks in 2011, 201... |
Oct 18, 2017
Chen, Xiaohe; Li, Bin; Feng, Minxuan; Xu, Chao; Xu, Runhua; Shi, Min; Yu, Lili; Xiao, Lei; Wang, Qingqing, 2017, "Ancient Chinese Corpus", https://hdl.handle.net/11272.1/AB2/4HYBFE, Abacus Data Network, V1
Ancient Chinese Corpus was developed at Nanjing Normal University. It contains word-segmented and part-of-speech tagged text from Zuozhuan, an ancient Chinese work believed to date from the Warring States Period (475-221 BC). Zuozhuan is a commentary on the Chunqiu, a history of... |
Oct 18, 2017
Kato, Akihiko; Shindo, Hiroyuki; Matsumoto, Yuji, 2017, "MWE-Aware English Dependency Corpus 2.0", https://hdl.handle.net/11272.1/AB2/GKYOY9, Abacus Data Network, V1
MWE-Aware English Dependency Corpus Version 2.0 was developed by the Nara Institute of Science and Technology Computational Linguistics Laboratory and consists of English compound function words annotated in dependency format. The data is derived from OntoNotes Release 5.0 (LDC20... |
Oct 18, 2017
Graff, David; Ma, Xiaoyi; Strassel, Stephanie; Walker, Kevin; Jones, Karen, 2017, "RATS Keyword Spotting", https://hdl.handle.net/11272.1/AB2/IFVKNB, Abacus Data Network, V1
RATS Keyword Spotting was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 3,100 hours of Levantine Arabic and Farsi conversational telephone speech with automatic and manual annotation of speech segments, transcripts and keywords generated from... |
Oct 18, 2017
O'Gorman, Tim; Conger, Katherine; Palmer, Martha, 2017, "English Web Treebank Propbank", https://hdl.handle.net/11272.1/AB2/Q8LILM, Abacus Data Network, V1
English Web Treebank Propbank, LDC Catalog Number LDC2017T15 and ISBN 1-58563-818-8, was developed by the University of Colorado Boulder - CLEAR (Computational Language and Education Research) and provides predicate-argument structure annotation for English Web Treebank (LDC2012T... |
Oct 15, 2017
Jones, Karen; Graff, David; Walker, Kevin; Strassel, Stephanie, 2017, "Multi-Language Conversational Telephone Speech 2011 -- South Asian", https://hdl.handle.net/11272.1/AB2/JPGPJM, Abacus Data Network, V1
Multi-Language Conversational Telephone Speech 2011 – South Asian was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 118 hours of telephone speech in five distinct language varieties of South Asia (i.e. the Indian sub-continent): Bengali, Hind... |
Sep 14, 2017
Shriberg, Elizabeth; Kathol, Andreas; Graciarena, Martin; Bratt, Harry; Kajarekar, Sachin; Jameel, Huda; Richey, Colleen; Goodman, Fred, 2017, "SRI-FRTIV", https://hdl.handle.net/11272.1/AB2/YONFH9, Abacus Data Network, V1
SRI-FRTIV (Five-way Recorded Toastmaster Intrinsic Variation) was developed by SRI International in 2007-2008 and is comprised of approximately 232 hours of English speech from thirty-four speakers who were members of Toastmaster clubs. Participants were asked to speak at three d... |
Sep 14, 2017
Xue, Nianwen; Ng, Hwee Tou; Pradhan, Sameer; Rutherford, Attapol T.; Webber, Bonnie; Wang, Chuan; Wang, Hong Min; Prasad, Rashmi, 2017, "2015-2016 CoNLL Shared Task", https://hdl.handle.net/11272.1/AB2/TSNLNO, Abacus Data Network, V1
2015-2016 CoNLL Shared Task, LDC Catalog Number LDC2017T13 and ISBN 1-58563-812-9, contains the Chinese and English training, development and test data for the 2015 and 2016 CoNLL (Conference on Computational Natural Language Learning) Shared Task Evaluation which focused on shal... |
Aug 15, 2017
Walker, Kevin; Caruso, Christopher; Maeda, Kazuaki; DiPersio, Denise; Strassel, Stephanie, 2017, "GALE Phase 4 Arabic Broadcast Conversation Speech", https://hdl.handle.net/11272.1/AB2/XFDC1A, Abacus Data Network, V1
GALE Phase 4 Arabic Broadcast Conversation Speech was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 75 hours of Arabic broadcast conversation speech collected in 2008 and 2009 by LDC, MediaNet, Tunis, Tunisia and MTC, Rabat, Morocco during Ph... |
Aug 15, 2017
Glenn, Meghan; Lee, Haejoong; Strassel, Stephanie; Maeda, Kazuaki, 2017, "GALE Phase 4 Arabic Broadcast Conversation Transcripts", https://hdl.handle.net/11272.1/AB2/WLEBLW, Abacus Data Network, V1
GALE Phase 4 Arabic Broadcast Conversation Transcripts was developed by the Linguistic Data Consortium (LDC) and contains transcriptions of approximately 75 hours of Arabic broadcast conversation speech collected in 2008 and 2009 by LDC, MediaNet, Tunis, Tunisia and MTC, Rabat, M... |
Jul 18, 2017
Petukhova, Volha; Malchanau, Andrei; Oualil, Youssef; Klakow, Dietrich; Stevens, Christopher; Weerd, Harmen de; Taatgen, Niels, 2017, "Metalogue Multi-Issue Bargaining Dialogue", https://hdl.handle.net/11272.1/AB2/U57KQP, Abacus Data Network, V1
Metalogue Multi-Issue Bargaining Dialogue was developed by the Metalogue Consortium under the European Community’s Seventh Framework Programme for Research and Technological Development. This release consists of approximately 2.5 hours of semantically annotated English dialogue d... |
Jul 18, 2017
Meftah, Ali Hamid; Alotaibi, Yousef Ajami; Selouani, Sid-Ahmed, 2017, "KSUEmotions", https://hdl.handle.net/11272.1/AB2/3HNHPQ, Abacus Data Network, V1
KSUEmotions was developed by King Saud University (KSU) and contains approximately five hours of emotional Modern Standard Arabic (MSA) speech from 23 subjects. Speakers were from three countries: Yemen, Saudi Arabia and Syria. Subjects read MSA sentences from newswire text in th... |
Jun 15, 2017
Knight, Kevin; Badarau, Bianca; Baranescu, Laura; Bonial, Claire; Bardocz, Madalina; Griffitt, Kira; Hermjakob, Ulf; Marcu, Daniel; Palmer, Martha; O'Gorman, Tim; Schneider, Nathan, 2017, "Abstract Meaning Representation (AMR) Annotation Release 2.0", https://hdl.handle.net/11272.1/AB2/8MN4GE, Abacus Data Network, V1
Abstract Meaning Representation (AMR) Annotation Release 2.0 was developed by the Linguistic Data Consortium (LDC), SDL/Language Weaver, Inc., the University of Colorado’s Computational Language and Educational Research group and the Information Sciences Institute at the Universi... |
May 15, 2017
Jones, Karen; Graff, David; Walker, Kevin; Strassel, Stephanie, 2017, "Multi-Language Conversational Telephone Speech 2011 -- Turkish", https://hdl.handle.net/11272.1/AB2/FPNZZV, Abacus Data Network, V1
Introduction Multi-Language Conversational Telephone Speech 2011 -- Turkish was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 18 hours of telephone speech in Turkish. The data were collected primarily to support research and technology evalua... |
May 15, 2017
Huang, Ruihong; Jurafsky, Daniel; Riloff, Ellen, 2017, "The EventStatus Corpus", https://hdl.handle.net/11272.1/AB2/EGUSOP, Abacus Data Network, V1
Introduction The EventStatus Corpus was developed by researchers at Texas A&M University, Stanford University and The University of Utah. It consists of approximately 3,000 English and 1,500 Spanish news articles about civil unrest events annotated with temporal tags. This corpus... |
May 15, 2017
Benowitz, Daniel; Bills, Aric; Conners, Thomas; Dubinski, Eyal; Fiscus, Jonathan; Harper, Mary; Heighway, Melanie; Le, Hanh; Melot, Jennifer; Onaka, Akiko; Ray, Jessica; Rytting, Anton; Shen, Wade; Silber, Ronnie; Tzoukermann, Evelyne, 2017, "IARPA Babel Lao Language Pack IARPA-babel203b-v3.1a", https://hdl.handle.net/11272.1/AB2/ME10OS, Abacus Data Network, V1
Introduction IARPA Babel Lao Language Pack IARPA-babel203b-v3.1a was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 207 hours of Lao conversational and scripted telephone speech collected in 2013 along... |
May 15, 2017
Chamberlain, Jon; Poesio, Massimo; Kruschwitz, Udo, 2017, "Phrase Detectives Corpus", https://hdl.handle.net/11272.1/AB2/NN2QFX, Abacus Data Network, V1
Introduction Phrase Detectives Corpus was developed by the School of Computer Science and Electronic Engineering at the University of Essex and consists of approximately 19,012 words across 40 documents anaphorically-annotated by the Phrase Detectives game, an online interactive... |
Apr 17, 2017
Vincent, Emmanuel; Barker, Jon; Watanabe, Shinji; Le Roux, Jonathan; Nesta, Francesco; Matassoni, Marco, 2017, "CHiME2 Grid", https://hdl.handle.net/11272.1/AB2/ASLFRE, Abacus Data Network, V1
Introduction CHiME2 Grid was developed as part of The 2nd CHiME Speech Separation and Recognition Challenge and contains approximately 120 hours of English speech from a noisy living room environment. The CHiME Challenges focus on distant-microphone automatic speech recognition (... |
Apr 17, 2017
Song, Zhiyi; Fore, Dana; Strassel, Stephanie; Lee, Haejoong; Wright, Jonathan, 2017, "BOLT Egyptian Arabic SMS/Chat and Transliteration", https://hdl.handle.net/11272.1/AB2/7I6ANJ, Abacus Data Network, V1
Introduction BOLT Egyptian Arabic SMS/Chat and Transliteration was developed by the Linguistic Data Consortium (LDC) and consists of naturally-occurring Short Message Service (SMS) and Chat (CHT) data collected through data donations and live collection involving native speakers... |
Mar 17, 2017
Li, Xuansong; Grimes, Stephen; Strassel, Stephanie; Ma, Xiaoyi; Xue, Nianwen; Marcus, Mitch; Taylor, Ann, 2017, "GALE English-Chinese Parallel Aligned Treebank -- Training", https://hdl.handle.net/11272.1/AB2/QROJQB, Abacus Data Network, V1
Introduction GALE English-Chinese Parallel Aligned Treebank – Training was developed by the Linguistic Data Consortium (LDC) and contains 196,123 tokens of word aligned English and Chinese parallel text with treebank annotations. This material was used as training data in the DAR... |
Mar 17, 2017
Song, Zhiyi; Garland, Jennifer; Walker, Christopher; Strassel, Stephanie, 2017, "BOLT Chinese Discussion Forum Parallel Training Data", https://hdl.handle.net/11272.1/AB2/EWIO27, Abacus Data Network, V1
Introduction BOLT Chinese Discussion Forum Parallel Training Data was developed by the Linguistic Data Consortium (LDC) and consists of 1,876,799 tokens of Chinese discussion forum data collected for the DARPA BOLT program along with their corresponding English translations. The... |
Feb 15, 2017
Walker, Kevin; Caruso, Christopher; Maeda, Kazuaki; DiPersio, Denise; Strassel, Stephanie, 2017, "GALE Phase 3 Arabic Broadcast News Speech Part 2", https://hdl.handle.net/11272.1/AB2/SRRGAW, Abacus Data Network, V1
Introduction GALE Phase 3 Arabic Broadcast News Speech Part 2 was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 128 hours of Arabic broadcast news speech collected in 2007 by the Linguistic Data Consortium (LDC), MediaNet, Tunis, Tunisia and... |
Jan 19, 2017
Andrus, Tony; Bills, Aric; Corris, Miriam; Dubinski, Eyal; Fiscus, Jonathan; Gillies, Breanna; Harper, Mary; Hazen, T. J.; Hefright, Brook; Jarrett, Amy; Le, Hanh; Ray, Jessica; Rytting, Anton; Silber, Ronnie; Shen, Wade; Tzoukermann, Evelyne, 2017, "IARPA Babel Vietnamese Language Pack IARPA-babel107b-v0.7", https://hdl.handle.net/11272.1/AB2/CSHOZ8, Abacus Data Network, V1
Introduction IARPA Babel Vietnamese Language Pack IARPA-babel107b-v0.7 was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 201 hours of Vietnamese conversational and scripted telephone speech collected i... |
Dec 15, 2016
Ellis, Joe; Getman, Jeremy; Strassel, Stephanie, 2016, "TAC KBP Spanish Cross-lingual Entity Linking - Comprehensive Training and Evaluation Data 2012-2014", https://hdl.handle.net/11272.1/AB2/HL83QO, Abacus Data Network, V1
Introduction TAC KBP Spanish Cross-Lingual Entity Linking - Comprehensive Training and Evaluation Data 2012-2014 was developed by the Linguistic Data Consortium (LDC) and contains training and evaluation data produced in support of the TAC KBP Spanish Cross-lingual Entity Linking... |
Dec 15, 2016
Bamba, Moussa, 2016, "Bamanankan Lexicon", https://hdl.handle.net/11272.1/AB2/OOCBVZ, Abacus Data Network, V1
Introduction Bamanankan Lexicon was developed by the Linguistic Data Consortium (LDC) and contains 5,978 entries of the Bamanankan language presented as a Bamanankan-English lexicon and a Bamanankan-French lexicon. It is the third publication in an LDC project to build an electro... |
Dec 15, 2016
Song, Zhiyi; Krug, Gary; Strassel, Stephanie, 2016, "GALE Phase 4 Arabic Newswire Parallel Sentences", https://hdl.handle.net/11272.1/AB2/R1M8ZY, Abacus Data Network, V1
Introduction GALE Phase 4 Arabic Newswire Parallel Sentences was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phase 4 of the DARPA GALE (Global Autonomous Language Exploitation) Program.... |
Dec 15, 2016
Conners, Thomas; Fiscus, Jonathan; Gillies, Breanna; Harper, Mary; Hazen, T. J.; Jarrett, Amy; Lin, Willa; Molina, María Encarnación Pérez; Rafalko, Shawna; Ray, Jessica; Rytting, Anton; Shen, Wade; Tzoukermann, Evelyne, 2016, "IARPA Babel Tagalog Language Pack IARPA-babel106-v0.2g", https://hdl.handle.net/11272.1/AB2/IULTZX, Abacus Data Network, V1
Introduction IARPA Babel Tagalog Language Pack IARPA-babel106-v0.2g was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 213 hours of Tagalog conversational and scripted telephone speech collected in 2012... |
Nov 15, 2016
Song, Zhiyi; Krug, Gary; Strassel, Stephanie, 2016, "GALE Phase 3 and 4 Chinese Newswire Parallel Text", https://hdl.handle.net/11272.1/AB2/KYZUJ0, Abacus Data Network, V1
Introduction GALE Phase 3 and 4 Chinese Newswire Parallel Text was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phases 3 and 4 of the DARPA GALE (Global Autonomous Language Exploitation)... |
Nov 15, 2016
Jones, Karen; Graff, David; Walker, Kevin; Strassel, Stephanie, 2016, "Multi-Language Conversational Telephone Speech 2011 -- Slavic Group", https://hdl.handle.net/11272.1/AB2/OL5RQH, Abacus Data Network, V1
Introduction Multi-Language Conversational Telephone Speech 2011 -- Slavic Group was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 60 hours of telephone speech in each of three distinct Slavic languages: Polish, Russian and Ukrainian. The data... |
Nov 15, 2016
Bills, Aric; David, Anne; Dubinski, Eyal; Fiscus, Jonathan; Hammond, Simon; Gann, Ketty; Harper, Mary; Hefright, Brook; Kazi, Michael; Lam, Julie; Ray, Jessica; Richardson, Fred; Rytting, Anton; Walter, Marle, 2016, "IARPA Babel Georgian Language Pack IARPA-babel404b-v1.0a", https://hdl.handle.net/11272.1/AB2/W0TIWB, Abacus Data Network, V1
Introduction IARPA Babel Georgian Language Pack IARPA-babel404b-v1.0a was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 190 hours of Georgian conversational and scripted telephone speech collected in 2... |
Oct 19, 2016
Andresen, Jess; Bills, Aric; Dubinski, Eyal; Fiscus, Jonathan; Gillies, Breanna; Harper, Mary; Hazen, T. J.; Jarrett, Amy; Roomi, Bergul; Ray, Jessica; Rytting, Anton; Shen, Wade; Tzoukermann, Evelyne, 2016, "IARPA Babel Turkish Language Pack IARPA-babel105b-v0.5", https://hdl.handle.net/11272.1/AB2/GYXA1F, Abacus Data Network, V1
Introduction IARPA Babel Turkish Language Pack IARPA-babel105b-v0.5 was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 213 hours of Turkish conversational and scripted telephone speech collected in 2012... |
Oct 19, 2016
O'Gorman, Tim; Palmer, Martha, 2016, "Richer Event Description", https://hdl.handle.net/11272.1/AB2/H5RQJH, Abacus Data Network, V1
Introduction Richer Event Description was developed by the University of Colorado Boulder-CLEAR (Computational Language and Education Research), Carnegie Mellon University and LDC. It consists of coreference, bridging and event-event relations (temporal, causal, subevent and repor... |
Sep 15, 2016
Adams, Nikki; Bills, Aric; Fiscus, Jonathan; Gillies, Breanna; Harper, Mary; Hazen, T. J.; Jarrett, Amy; Khugyani, Kamila; Lin, Willa; Ray, Jessica; Rytting, Anton; Shen, Wade; Strahan, Tania; Tzoukermann, Evelyne, 2016, "IARPA Babel Pashto Language Pack IARPA-babel104b-v0.4bY", https://hdl.handle.net/11272.1/AB2/GLFN3X, Abacus Data Network, V1
Introduction IARPA Babel Pashto Language Pack IARPA-babel104b-v0.4bY was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 214 hours of Pashto conversational and scripted telephone speech collected in 2011... |
Sep 15, 2016
Tratz, Stephen, 2016, "ARL Arabic Dependency Treebank", https://hdl.handle.net/11272.1/AB2/GKAG4O, Abacus Data Network, V1
Introduction ARL Arabic Dependency Treebank was developed by the US Army Research Laboratory (ARL) and was derived from four LDC resources: Arabic Treebank (ATB) Part 1 v 4.1 (LDC2010T13), Part 2 v 3.1 (LDC2011T09), Part 3 v 3.2 (LDC2010T08) and Broadcast News v 1.0 (LDC2012T07).... |
Aug 16, 2016
Bills, Aric; David, Anne; Dubinski, Eyal; Fiscus, Jonathan; Gillies, Breanna; Gnanadesikan, Amalia; Harper, Mary; Hammond, Simon; Jarrett, Amy; Molina, María; Ray, Jessica; Rytting, Anton; Paget, Shelly; Shen, Wade; Silber, Ronnie; Tzoukermann, Evelyne; Wong, Jamie, 2016, "IARPA Babel Assamese Language Pack IARPA-babel102b-v0.5a", https://hdl.handle.net/11272.1/AB2/9JCM5S, Abacus Data Network, V1
Introduction IARPA Babel Assamese Language Pack IARPA-babel102b-v0.5a was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 205 hours of Assamese conversational and scripted telephone speech collected in 2... |
Aug 16, 2016
Bills, Aric; David, Anne; Dubinski, Eyal; Fiscus, Jonathan; Gillies, Breanna; Harper, Mary; Jarrett, Amy; Molina, María; Ray, Jessica; Rytting, Anton; Paget, Shelly; Shen, Wade; Silber, Ronnie; Tzoukermann, Evelyne; Wong, Jamie, 2016, "IARPA Babel Bengali Language Pack IARPA-babel103b-v0.4b", https://hdl.handle.net/11272.1/AB2/WKL40N, Abacus Data Network, V1
Introduction IARPA Babel Bengali Language Pack IARPA-babel103b-v0.4b was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 215 hours of Bengali conversational and scripted telephone speech collected in 201... |
Aug 15, 2016
Walker, Kevin; Caruso, Christopher; Maeda, Kazuaki; DiPersio, Denise; Strassel, Stephanie, 2016, "GALE Phase 3 Arabic Broadcast News Speech Part 1", https://hdl.handle.net/11272.1/AB2/B0XGQD, Abacus Data Network, V1
GALE Phase 3 Arabic Broadcast News Speech Part 1 was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 132 hours of Arabic broadcast news speech collected in 2007 by the Linguistic Data Consortium (LDC), MediaNet, Tunis, Tunisia and MTC, Rabat, M... |
Aug 15, 2016
Glenn, Meghan; Lee, Haejoong; Strassel, Stephanie; Maeda, Kazuaki, 2016, "GALE Phase 3 Arabic Broadcast News Transcripts Part 1", https://hdl.handle.net/11272.1/AB2/IQOADN, Abacus Data Network, V1
GALE Phase 3 Arabic Broadcast News Transcripts Part 1 was developed by the Linguistic Data Consortium (LDC) and contains transcriptions of approximately 132 hours of Arabic broadcast news speech collected in 2007 by the Linguistic Data Consortium (LDC), MediaNet, Tunis, Tunisia a... |
Jul 19, 2016
Andrus, Tony; Dubinski, Eyal; Fiscus, Jonathan G.; Gillies, Breanna; Harper, Mary; Hazen, T. J.; Hefright, Brook; Jarrett, Amy; Lin, Willa; Ray, Jessica; Rytting, Anton; Shen, Wade; Tzoukermann, Evelyne; Wong, Jamie, 2016, "IARPA Babel Cantonese Language Pack IARPA-babel101b-v0.4c", https://hdl.handle.net/11272.1/AB2/01SD6T, Abacus Data Network, V1
IARPA Babel Cantonese Language Pack IARPA-babel101b-v0.4c was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 215 hours of Cantonese conversational and scripted telephone speech collected in 2011 along w... |
Jul 15, 2016
Song, Zhiyi; Krug, Gary; Jiang, Zixin; Strassel, Stephanie, 2016, "GALE Phase 3 and 4 Chinese Broadcast News Parallel Text", https://hdl.handle.net/11272.1/AB2/CE2DP3, Abacus Data Network, V1
Introduction GALE Phase 3 and 4 Chinese Broadcast News Parallel Text was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phases 3 and 4 of the DARPA GALE (Global Autonomous Language Exploit... |
Jul 15, 2016
Muir, Kate; Joinson, Adam; Cotterill, Rachel; Dewdney, Nigel, 2016, "English Speed Networking Conversational Transcripts", https://hdl.handle.net/11272.1/AB2/LX2FQA, Abacus Data Network, V1
Introduction English Speed Networking Conversational Transcripts was developed at the University of the West of England and contains 388 transcripts of English face-to-face and instant messaging conversations about business ideas collected in 2014 and 2015 from participants (unde... |
Jul 15, 2016
Kretzschmar Jr., William; Bounds, Paulina; Hettel, Jacqueline; Coats, Steven; Pederson, Lee; Opas-Hänninen, Lisa Lena; Juuso, Ilkka; Seppänen, Tapio, 2016, "Digital Archive of Southern Speech - NLP Version", https://hdl.handle.net/11272.1/AB2/F4QH6S, Abacus Data Network, V1
Introduction Digital Archive of Southern Speech - NLP Version (DASS-NLP) was developed by LDC as an alternate version of Digital Archive of Southern Speech (DASS) (LDC2012S03) suitable for natural language processing and human language technology applications. Specifically, the o... |
Jun 15, 2016
Song, Zhiyi; Krug, Gary; Jiang, Zixin; Strassel, Stephanie, 2016, "GALE Phase 4 Arabic Weblog Parallel Sentences", https://hdl.handle.net/11272.1/AB2/3GAMIQ, Abacus Data Network, V1
Introduction GALE Phase 4 Arabic Weblog Parallel Sentences was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phase 4 of the DARPA GALE (Global Autonomous Language Exploitation) Program. T... |
Jun 15, 2016
Xue, Nianwen; Zhang, Xiuhong; Jiang, Zixin; Palmer, Martha; Xia, Fei; Chiou, Fu-Dong; Chang, Meiyu, 2016, "Chinese Treebank 9.0", https://hdl.handle.net/11272.1/AB2/YYY4FY, Abacus Data Network, V1
Introduction Chinese Treebank 9.0 consists of approximately two million words of annotated and parsed text from Chinese newswire, government documents, magazine articles, various broadcast news and broadcast conversation programs, web newsgroups, weblogs, discussion forums, chat... |
May 16, 2016
Glenn, Meghan; Lee, Haejoong; Strassel, Stephanie; Maeda, Kazuaki, 2016, "GALE Phase 4 Chinese Broadcast Conversation Transcripts", https://hdl.handle.net/11272.1/AB2/QOKU34, Abacus Data Network, V1
Introduction GALE Phase 4 Chinese Broadcast Conversation Transcripts was developed by the Linguistic Data Consortium (LDC) and contains transcriptions of approximately 172 hours of Chinese broadcast conversation speech collected in 2008 by LDC and Hong Kong University of Science... |
May 16, 2016
Walker, Kevin; Caruso, Christopher; Maeda, Kazuaki; DiPersio, Denise; Strassel, Stephanie, 2016, "GALE Phase 4 Chinese Broadcast Conversation Speech", https://hdl.handle.net/11272.1/AB2/Y6ZKMX, Abacus Data Network, V1
Introduction GALE Phase 4 Chinese Broadcast Conversation Speech was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 172 hours of Mandarin Chinese broadcast conversation speech collected in 2008 by LDC and Hong Kong University of Science and Tec... |
Apr 18, 2016
Song, Zhiyi; Krug, Gary; Strassel, Stephanie, 2016, "GALE Phase 4 Arabic Broadcast Conversation Parallel Sentences", https://hdl.handle.net/11272.1/AB2/FGSLZN, Abacus Data Network, V1
Introduction GALE Phase 4 Arabic Broadcast Conversation Parallel Sentences was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phase 4 of the DARPA GALE (Global Autonomous Language Exploita... |
Apr 18, 2016
Tracey, Jennifer; Strassel, Stephanie; Morris, Amanda; Li, Xuansong; Antonishek, Brian; Fiscus, Jonathan, 2016, "HAVIC Pilot Transcription", https://hdl.handle.net/11272.1/AB2/ODUSVC, Abacus Data Network, V1
Introduction HAVIC Pilot Transcription was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 72 hours of user-generated videos with transcripts based on the English speech audio extracted from the videos. This data set was created in collaboratio... |
Apr 18, 2016
Berkling, Kay, 2016, "H1 Children's Writing", https://hdl.handle.net/11272.1/AB2/OJCHNV, Abacus Data Network, V1
Introduction H1 Children's Writing was developed by the Cooperative State University Baden-Württemberg, University of Education. It consists of 996 texts written over three months by 88 German school children age seven through eleven years. The data in this corpus was collected b... |
Mar 15, 2016
Song, Zhiyi; Krug, Gary; Strassel, Stephanie, 2016, "GALE Phase 3 and 4 Chinese Broadcast Conversation Parallel Text", https://hdl.handle.net/11272.1/AB2/JVLMY4, Abacus Data Network, V1
Introduction GALE Phase 3 and 4 Chinese Broadcast Conversation Parallel Text was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phases 3 and 4 of the DARPA GALE (Global Autonomous Language... |