251 to 300 of 403 Results
Dec 15, 2018
Zhong, Victor; Zhang, Yuhao; Chen, Danqi; Angeli, Gabor; Manning, Christopher, 2018, "TAC Relation Extraction Dataset", https://hdl.handle.net/11272.1/AB2/SOYGGB, Abacus Data Network, V1
TAC Relation Extraction Dataset (TACRED) was developed by The Stanford NLP Group and is a large-scale relation extraction dataset with 106,264 examples built over English newswire and web text used in the NIST TAC KBP English slot filling evaluations during the period 2009-2014....
Nov 15, 2018
Bills, Aric; Conners, Thomas; David, Anne; Dubinski, Eyal; Fiscus, Jonathan G.; Hammond, Simon; Harper, Mary; Kaiser-Schatzlein, Alice; Melot, Jennifer; Paget, Shelley; Ray, Jessica; Rytting, Anton; Shen, Sinney; Shen, Wade; Silber, Ronnie; Tzoukermann, Evelyne, 2018, "IARPA Babel Telugu Language Pack IARPA-babel303b-v1.0a", https://hdl.handle.net/11272.1/AB2/OTDPUV, Abacus Data Network, V1
IARPA Babel Telugu Language Pack IARPA-babel303b-v1.0a was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 201 hours of Telugu conversational and scripted telephone speech collected in 2013...
Nov 15, 2018
Maamouri, Mohamed; Bies, Ann; Kulick, Seth; Krouna, Sondos; Tabassi, Dalila; Ciul, Michael, 2018, "BOLT Egyptian Arabic Treebank - Discussion Forum", https://hdl.handle.net/11272.1/AB2/CAA0JW, Abacus Data Network, V1
BOLT Egyptian Arabic Treebank – Discussion Forum was developed by the Linguistic Data Consortium (LDC) and consists of Egyptian Arabic web discussion forum data with part-of-speech annotation, morphology, gloss and syntactic tree annotation. The DARPA BOLT (Broad Operational Lang...
Nov 15, 2018
Maciel, Alexandre M. A.; Rodrigues, Rodrigo L.; Barbosa, Danilo S., 2018, "Avatar Education Portuguese", https://hdl.handle.net/11272.1/AB2/BSQ4NP, Abacus Data Network, V1
Avatar Education Portuguese was developed by the University of Pernambuco and consists of approximately 80 minutes of Brazilian Portuguese microphone speech with phonetic and orthographic transcriptions. The data was developed for Avatar Education, an animated virtual assistant d...
Oct 15, 2018
Ellis, Joe; Getman, Jeremy; Strassel, Stephanie, 2018, "TAC KBP English Regular Slot Filling - Comprehensive Training and Evaluation Data 2009-2014", https://hdl.handle.net/11272.1/AB2/B3R0J4, Abacus Data Network, V1
TAC KBP English Regular Slot Filling - Comprehensive Training and Evaluation Data 2009-2014 was developed by the Linguistic Data Consortium (LDC) and contains training and evaluation data produced in support of the TAC KBP Slot Filling evaluation track conducted from 2009 to 2014...
Sep 17, 2018
Bills, Aric; Conners, Thomas; David, Anne; Dubinski, Eyal; Fiscus, Jonathan G.; Harper, Mary; Hefright, Brook; Kozlov, Kirill; Melot, Jennifer; Ray, Jessica; Rytting, Anton; Phillips, Josh; Walter, Marle; Shen, Wade; Silber, Ronnie; Tzoukermann, Evelyne, 2018, "IARPA Babel Kazakh Language Pack IARPA-babel302b-v1.0a", https://hdl.handle.net/11272.1/AB2/KGA4ZX, Abacus Data Network, V1
IARPA Babel Kazakh Language Pack IARPA-babel302b-v1.0a was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 203 hours of Kazakh conversational and scripted telephone speech collected in 2013...
Sep 17, 2018
Morris, Amanda; Strassel, Stephanie; Li, Xuansong; Antonishek, Brian; Fiscus, Jonathan G., 2018, "HAVIC MED Event E051-E060 -- Videos, Metadata and Annotation", https://hdl.handle.net/11272.1/AB2/XNNWD1, Abacus Data Network, V1
HAVIC MED Event E051-E060 – Videos, Metadata and Annotation was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 53 hours of user-generated videos with annotation and metadata. To advance multimodal event detection and related techn...
Sep 17, 2018
Jones, Karen; Graff, David; Walker, Kevin; Strassel, Stephanie, 2018, "Multi-Language Conversational Telephone Speech 2011 -- Spanish", https://hdl.handle.net/11272.1/AB2/9Q4DIQ, Abacus Data Network, V1
Multi-Language Conversational Telephone Speech 2011 – Spanish was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 23 hours of telephone speech in Spanish. The data were collected primarily to support research and technology evaluat...
Sep 17, 2018
Griffitt, Kira; Strassel, Stephanie, 2018, "BOLT Information Retrieval Comprehensive Training and Evaluation", https://hdl.handle.net/11272.1/AB2/EDRQLG, Abacus Data Network, V1
BOLT Information Retrieval Comprehensive Training and Evaluation was developed by the Linguistic Data Consortium (LDC) and consists of all data produced in support of the Information Retrieval (IR) task within the DARPA Broad Operational Language Translation (BOLT) P...
Aug 15, 2018
Hernández Mena, Carlos Daniel, 2018, "CIEMPIESS Balance", https://hdl.handle.net/11272.1/AB2/JWRYUR, Abacus Data Network, V1
CIEMPIESS (Corpus de Investigación en Español de México del Posgrado de Ingeniería Eléctrica y Servicio Social) Balance was developed by the Development of Speech Technologies program at the School of Engineering at the National Autonomous University of Mexico (UNAM) and consists...
Aug 15, 2018
Greenberg, Craig; Martin, Alvin; Graff, David; Walker, Kevin; Jones, Karen; Strassel, Stephanie, 2018, "2011 NIST Language Recognition Evaluation Test Set", https://hdl.handle.net/11272.1/AB2/0ZCWPS, Abacus Data Network, V1
2011 NIST Language Recognition Evaluation Test Set contains selected training data and the evaluation test set for the 2011 NIST Language Recognition Evaluation. It consists of approximately 204 hours of conversational telephone speech and broadcast audio collected by the Linguis...
Aug 15, 2018
Song, Zhiyi; Fore, Dana; Strassel, Stephanie; Lee, Haejoong; Wright, Jonathan, 2018, "BOLT English SMS/Chat", https://hdl.handle.net/11272.1/AB2/RNIGFD, Abacus Data Network, V1
BOLT English SMS/Chat was developed by the Linguistic Data Consortium (LDC) and consists of naturally-occurring Short Message Service (SMS) and Chat (CHT) data collected through data donations and live collection involving native speakers of English. The corpus contains 18,429 co...
Jul 18, 2018
Bills, Aric; Conners, Thomas; Corris, Miriam; David, Anne; Dubinski, Eyal; Fiscus, Jonathan G.; Harper, Mary; Kaiser-Schatzlein, Alice; Melot, Jennifer; Paget, Shelley; Ray, Jessica; Rytting, Anton; Shen, Wade; Silber, Ronnie; Tzoukermann, Evelyne; Viswanath, Arun, 2018, "IARPA Babel Tamil Language Pack IARPA-babel204b-v1.1b", https://hdl.handle.net/11272.1/AB2/8245NT, Abacus Data Network, V1
IARPA Babel Tamil Language Pack IARPA-babel204b-v1.1b was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 350 hours of Tamil conversational and scripted telephone speech collected in 2012 an...
Jul 16, 2018
Linguistic Data Consortium, 2018, "CALLFRIEND Mandarin Chinese-Mainland Dialect Second Edition", https://hdl.handle.net/11272.1/AB2/88OSWL, Abacus Data Network, V1
CALLFRIEND Mandarin Chinese-Mainland Dialect Second Edition was developed by the Linguistic Data Consortium (LDC) and consists of approximately 24 hours of unscripted telephone conversations between native speakers of the Mandarin Chinese dialect spoken in mainland China. This se...
Jul 15, 2018
Linguistic Data Consortium, 2018, "RATS Language Identification", https://hdl.handle.net/11272.1/AB2/UP3WJC, Abacus Data Network, V1
RATS Language Identification was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 5,400 hours of Levantine Arabic, Farsi, Dari, Pashto and Urdu conversational telephone speech with annotation of speech segments. The corpus was created to provide...
Jun 15, 2018
Ellis, Joe; Getman, Jeremy; Strassel, Stephanie, 2018, "TAC KBP English Entity Linking - Comprehensive Training and Evaluation Data 2009-2013", https://hdl.handle.net/11272.1/AB2/SRPNPS, Abacus Data Network, V1
TAC KBP English Entity Linking - Comprehensive Training and Evaluation Data 2009-2013 was developed by the Linguistic Data Consortium (LDC) and contains training and evaluation data produced in support of the TAC KBP English Entity Linking tasks in 2009, 2010, 2011, 2012, and 201...
Jun 15, 2018
Song, Zhiyi; Fore, Dana; Strassel, Stephanie; Lee, Haejoong; Wright, Jonathan, 2018, "BOLT Chinese SMS/Chat", https://hdl.handle.net/11272.1/AB2/MMNPUR, Abacus Data Network, V1
BOLT Chinese SMS/Chat was developed by the Linguistic Data Consortium (LDC) and consists of naturally-occurring Short Message Service (SMS) and Chat (CHT) data collected through data donations and live collection involving native speakers of Chinese. The corpus contains 14,877 co...
Jun 15, 2018
Jones, Karen; Graff, David; Walker, Kevin; Strassel, Stephanie, 2018, "Multi-Language Conversational Telephone Speech 2011 -- Central European", https://hdl.handle.net/11272.1/AB2/Y1F6XQ, Abacus Data Network, V1
Multi-Language Conversational Telephone Speech 2011 – Central European was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 44 hours of telephone speech in two distinct language varieties of Central Europe: Czech and Slovak. The data were collec...
May 15, 2018
Dilley, Laura C.; Breen, Mara; Brown, Meredith; Gibson, Edward, 2018, "Rhythm and Pitch", https://hdl.handle.net/11272.1/AB2/JDLPMX, Abacus Data Network, V1
Rhythm and Pitch contains approximately 27 minutes of spontaneous English conversations and radio news stories annotated with the Rhythm and Pitch (RaP) scheme. Speech data for annotation was taken from two corpora released by LDC, CALLHOME American English Speech (LDC97S42) and...
May 15, 2018
Glenn, Meghan; Lee, Haejoong; Strassel, Stephanie; Maeda, Kazuaki, 2018, "GALE Phase 4 Arabic Broadcast News Transcripts", https://hdl.handle.net/11272.1/AB2/DN3EXL, Abacus Data Network, V1
GALE Phase 4 Arabic Broadcast News Transcripts was developed by the Linguistic Data Consortium (LDC) and contains transcriptions of approximately 37 hours of Arabic broadcast news speech collected in 2008 and 2009 by the Linguistic Data Consortium (LDC), MediaNet, Tunis, Tunisia...
May 15, 2018
Walker, Kevin; Caruso, Christopher; Maeda, Kazuaki; DiPersio, Denise; Strassel, Stephanie, 2018, "GALE Phase 4 Arabic Broadcast News Speech", https://hdl.handle.net/11272.1/AB2/ODSQZW, Abacus Data Network, V1
GALE Phase 4 Arabic Broadcast News Speech was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 37 hours of Arabic broadcast news speech collected in 2008 and 2009 by LDC and MediaNet, Tunis, Tunisia and MTC, Rabat, Morocco during Phase 4 of the...
Apr 16, 2018
Berkling, Kay, 2018, "H2, E2, ERK1 Children's Writing", https://hdl.handle.net/11272.1/AB2/7GXGKW, Abacus Data Network, V1
H2, E2, ERK1 Children’s Writing was developed by the Cooperative State University Baden-Württemberg, University of Education. It consists of approximately 2,000 texts written over four months by 173 German school children age six through eleven years. The data in thi...
Apr 16, 2018
Linguistic Data Consortium, 2018, "TRAD Arabic-French Parallel Text -- Newsgroup", https://hdl.handle.net/11272.1/AB2/0DET8M, Abacus Data Network, V1
TRAD Arabic-French Parallel Text – Newsgroup was developed by ELDA as part of the PEA-TRAD project. It contains French translations of a subset of approximately 10,000 Arabic words from GALE Phase 1 Arabic Newsgroup Parallel Text - Part 1 (LDC2009T03). The PEA-TRAD p...
Mar 15, 2018
Arase, Yuki; Tsujii, Junichi, 2018, "SPADE", https://hdl.handle.net/11272.1/AB2/V6GR5J, Abacus Data Network, V1
SPADE (Syntactic Phrase Alignment Dataset for Evaluation) consists of annotated parse trees and alignment on English sentential paraphrases extracted from machine translation evaluation corpora and separated into development and test sets. Reference translations from machine tran...
Mar 15, 2018
Tracey, Jennifer; Graff, David; Strassel, Stephanie; Ma, Xiaoyi; Wright, Jonathan, 2018, "LORELEI Somali Representative Language Pack - Monolingual and Parallel Text", https://hdl.handle.net/11272.1/AB2/75GGBX, Abacus Data Network, V1
LORELEI Somali Representative Language Pack - Monolingual and Parallel Text was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 13 million words of monolingual Somali text, approximately 800,000 of which are translated into English. Another 100...
Feb 16, 2018
Ellis, Joe; Getman, Jeremy; Graff, David; Strassel, Stephanie, 2018, "TAC KBP Comprehensive English Source Corpora 2009-2014", https://hdl.handle.net/11272.1/AB2/VC89SM, Abacus Data Network, V1
TAC KBP Comprehensive English Source Corpora 2009-2014 was developed by the Linguistic Data Consortium (LDC) and contains the 3,877,207 English source documents used in support of the TAC KBP tasks from 2009-2014. Text Analysis Conference (TAC) is a series of worksho...
Feb 16, 2018
Tracey, Jennifer; Graff, David; Strassel, Stephanie; Ma, Xiaoyi; Wright, Jonathan, 2018, "LORELEI Amharic Representative Language Pack - Monolingual and Parallel Text", https://hdl.handle.net/11272.1/AB2/5TNZPX, Abacus Data Network, V1
LORELEI Amharic Representative Language Pack - Monolingual and Parallel Text was developed by the Linguistic Data Consortium and is comprised of approximately 25 million words of monolingual Amharic text, approximately 600,000 of which are translated into English. An...
Feb 16, 2018
Jones, Karen; Graff, David; Walker, Kevin; Strassel, Stephanie, 2018, "Multi-Language Conversational Telephone Speech 2011 -- Central Asian", https://hdl.handle.net/11272.1/AB2/YW9PX3, Abacus Data Network, V1
Multi-Language Conversational Telephone Speech 2011 – Central Asian was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 37 hours of telephone speech in three distinct language varieties of Central Asia: Dari, Farsi and Pashto. The...
Jan 16, 2018
Ravanelli, Mirco; Cristoforetti, Luca; Omologo, Maurizio, 2018, "DIRHA English WSJ Audio", https://hdl.handle.net/11272.1/AB2/8WSEVY, Abacus Data Network, V1
DIRHA English WSJ Audio was developed as part of the Distant-Speech Interaction for Robust Home Applications (DIRHA) Project which addressed natural spontaneous speech interaction with distant microphones in a domestic environment. It is comprised of approximately 85...
Jan 16, 2018
Taulé, Mariona; Martí, Maria Antonia; Bies, Ann; Garí, Aina; Nofre, Montserrat; Song, Zhiyi; Strassel, Stephanie; Ellis, Joe, 2018, "DEFT Spanish Treebank", https://hdl.handle.net/11272.1/AB2/Z3OEWX, Abacus Data Network, V1
DEFT Spanish Treebank was developed by the Linguistic Data Consortium (LDC) and the Language and Computation Center (CLiC), University of Barcelona. It contains treebank annotation of international Spanish newswire text and Latin American Spanish discussion forum dat...
Jan 16, 2018
Linguistic Data Consortium; ELDA, 2018, "TRAD Chinese-French Parallel Text -- Blog", https://hdl.handle.net/11272.1/AB2/ATYE6I, Abacus Data Network, V1
TRAD Chinese-French Parallel Text – Blog was developed by ELDA as part of the PEA-TRAD project. It contains French translations of a subset of approximately 10,000 Chinese words from GALE Phase 1 Chinese Blog Parallel Text (LDC2008T06). The PEA-TRAD project (Translat...
Dec 15, 2017
Glenn, Meghan; Lee, Haejoong; Strassel, Stephanie; Maeda, Kazuaki, 2017, "GALE Phase 4 Chinese Broadcast News Transcripts", https://hdl.handle.net/11272.1/AB2/KTVMHA, Abacus Data Network, V1
GALE Phase 4 Chinese Broadcast News Transcripts was developed by the Linguistic Data Consortium (LDC) and contains transcriptions of approximately 134 hours of Chinese broadcast news speech collected in 2008 by LDC and Hong Kong University of Science and Technology (HKUST...
Dec 15, 2017
Walker, Kevin; Caruso, Christopher; Maeda, Kazuaki; DiPersio, Denise; Strassel, Stephanie, 2017, "GALE Phase 4 Chinese Broadcast News Speech", https://hdl.handle.net/11272.1/AB2/4ADDAM, Abacus Data Network, V1
GALE Phase 4 Chinese Broadcast News Speech was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 134 hours of Mandarin Chinese broadcast news speech collected in 2008 by LDC and Hong Kong University of Science and Technology (HKUST), Hong...
Nov 17, 2017
Hernández Mena, Carlos Daniel; Herrera, Abel, 2017, "CIEMPIESS Light", https://hdl.handle.net/11272.1/AB2/JXHBRG, Abacus Data Network, V1
CIEMPIESS (Corpus de Investigación en Español de México del Posgrado de Ingeniería Eléctrica y Servicio Social) Light was developed by the Speech Processing Laboratory of the Faculty of Engineering at the National Autonomous University of Mexico (UNAM) and consists of approximate...
Nov 17, 2017
Ellis, Joe; Getman, Jeremy; Strassel, Stephanie, 2017, "TAC KBP Chinese Cross-lingual Entity Linking - Comprehensive Training and Evaluation Data 2011-2014", https://hdl.handle.net/11272.1/AB2/XOE0NF, Abacus Data Network, V1
TAC KBP Chinese Cross-lingual Entity Linking - Comprehensive Training and Evaluation Data 2011-2014 was developed by the Linguistic Data Consortium and contains training and evaluation data produced in support of the TAC KBP Chinese Cross-lingual Entity Linking tasks in 2011, 201...
Oct 18, 2017
Chen, Xiaohe; Li, Bin; Feng, Minxuan; Xu, Chao; Xu, Runhua; Shi, Min; Yu, Lili; Xiao, Lei; Wang, Qingqing, 2017, "Ancient Chinese Corpus", https://hdl.handle.net/11272.1/AB2/4HYBFE, Abacus Data Network, V1
Ancient Chinese Corpus was developed at Nanjing Normal University. It contains word-segmented and part-of-speech tagged text from Zuozhuan, an ancient Chinese work believed to date from the Warring States Period (475-221 BC). Zuozhuan is a commentary on the Chunqiu, a history of...
Oct 18, 2017
Kato, Akihiko; Shindo, Hiroyuki; Matsumoto, Yuji, 2017, "MWE-Aware English Dependency Corpus 2.0", https://hdl.handle.net/11272.1/AB2/GKYOY9, Abacus Data Network, V1
MWE-Aware English Dependency Corpus Version 2.0 was developed by the Nara Institute of Science and Technology Computational Linguistics Laboratory and consists of English compound function words annotated in dependency format. The data is derived from OntoNotes Release 5.0 (LDC20...
Oct 18, 2017
Graff, David; Ma, Xiaoyi; Strassel, Stephanie; Walker, Kevin; Jones, Karen, 2017, "RATS Keyword Spotting", https://hdl.handle.net/11272.1/AB2/IFVKNB, Abacus Data Network, V1
RATS Keyword Spotting was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 3,100 hours of Levantine Arabic and Farsi conversational telephone speech with automatic and manual annotation of speech segments, transcripts and keywords generated from...
Oct 18, 2017
O'Gorman, Tim; Conger, Katherine; Palmer, Martha, 2017, "English Web Treebank Propbank", https://hdl.handle.net/11272.1/AB2/Q8LILM, Abacus Data Network, V1
English Web Treebank Propbank, LDC Catalog Number LDC2017T15 and ISBN 1-58563-818-8, was developed by the University of Colorado Boulder - CLEAR (Computational Language and Education Research) and provides predicate-argument structure annotation for English Web Treebank (LDC2012T...
Oct 15, 2017
Jones, Karen; Graff, David; Walker, Kevin; Strassel, Stephanie, 2017, "Multi-Language Conversational Telephone Speech 2011 -- South Asian", https://hdl.handle.net/11272.1/AB2/JPGPJM, Abacus Data Network, V1
Multi-Language Conversational Telephone Speech 2011 – South Asian was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 118 hours of telephone speech in five distinct language varieties of South Asia (i.e. the Indian sub-continent): Bengali, Hind...
Sep 14, 2017
Shriberg, Elizabeth; Kathol, Andreas; Graciarena, Martin; Bratt, Harry; Kajarekar, Sachin; Jameel, Huda; Richey, Colleen; Goodman, Fred, 2017, "SRI-FRTIV", https://hdl.handle.net/11272.1/AB2/YONFH9, Abacus Data Network, V1
SRI-FRTIV (Five-way Recorded Toastmaster Intrinsic Variation) was developed by SRI International in 2007-2008 and is comprised of approximately 232 hours of English speech from thirty-four speakers who were members of Toastmaster clubs. Participants were asked to speak at three d...
Sep 14, 2017
Xue, Nianwen; Ng, Hwee Tou; Pradhan, Sameer; Rutherford, Attapol T.; Webber, Bonnie; Wang, Chuan; Wang, Hong Min; Prasad, Rashmi, 2017, "2015-2016 CoNLL Shared Task", https://hdl.handle.net/11272.1/AB2/TSNLNO, Abacus Data Network, V1
2015-2016 CoNLL Shared Task, LDC Catalog Number LDC2017T13 and ISBN 1-58563-812-9, contains the Chinese and English training, development and test data for the 2015 and 2016 CoNLL (Conference on Computational Natural Language Learning) Shared Task Evaluation which focused on shal...
Aug 15, 2017
Walker, Kevin; Caruso, Christopher; Maeda, Kazuaki; DiPersio, Denise; Strassel, Stephanie, 2017, "GALE Phase 4 Arabic Broadcast Conversation Speech", https://hdl.handle.net/11272.1/AB2/XFDC1A, Abacus Data Network, V1
GALE Phase 4 Arabic Broadcast Conversation Speech was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 75 hours of Arabic broadcast conversation speech collected in 2008 and 2009 by LDC, MediaNet, Tunis, Tunisia and MTC, Rabat, Morocco during Ph...
Aug 15, 2017
Glenn, Meghan; Lee, Haejoong; Strassel, Stephanie; Maeda, Kazuaki, 2017, "GALE Phase 4 Arabic Broadcast Conversation Transcripts", https://hdl.handle.net/11272.1/AB2/WLEBLW, Abacus Data Network, V1
GALE Phase 4 Arabic Broadcast Conversation Transcripts was developed by the Linguistic Data Consortium (LDC) and contains transcriptions of approximately 75 hours of Arabic broadcast conversation speech collected in 2008 and 2009 by LDC, MediaNet, Tunis, Tunisia and MTC, Rabat, M...
Jul 18, 2017
Petukhova, Volha; Malchanau, Andrei; Oualil, Youssef; Klakow, Dietrich; Stevens, Christopher; Weerd, Harmen de; Taatgen, Niels, 2017, "Metalogue Multi-Issue Bargaining Dialogue", https://hdl.handle.net/11272.1/AB2/U57KQP, Abacus Data Network, V1
Metalogue Multi-Issue Bargaining Dialogue was developed by the Metalogue Consortium under the European Community’s Seventh Framework Programme for Research and Technological Development. This release consists of approximately 2.5 hours of semantically annotated English dialogue d...
Jul 18, 2017
Meftah, Ali Hamid; Alotaibi, Yousef Ajami; Selouani, Sid-Ahmed, 2017, "KSUEmotions", https://hdl.handle.net/11272.1/AB2/3HNHPQ, Abacus Data Network, V1
KSUEmotions was developed by King Saud University (KSU) and contains approximately five hours of emotional Modern Standard Arabic (MSA) speech from 23 subjects. Speakers were from three countries: Yemen, Saudi Arabia and Syria. Subjects read MSA sentences from newswire text in th...
Jun 15, 2017
Knight, Kevin; Badarau, Bianca; Baranescu, Laura; Bonial, Claire; Bardocz, Madalina; Griffitt, Kira; Hermjakob, Ulf; Marcu, Daniel; Palmer, Martha; O'Gorman, Tim; Schneider, Nathan, 2017, "Abstract Meaning Representation (AMR) Annotation Release 2.0", https://hdl.handle.net/11272.1/AB2/8MN4GE, Abacus Data Network, V1
Abstract Meaning Representation (AMR) Annotation Release 2.0 was developed by the Linguistic Data Consortium (LDC), SDL/Language Weaver, Inc., the University of Colorado’s Computational Language and Educational Research group and the Information Sciences Institute at the Universi...
May 15, 2017
Jones, Karen; Graff, David; Walker, Kevin; Strassel, Stephanie, 2017, "Multi-Language Conversational Telephone Speech 2011 -- Turkish", https://hdl.handle.net/11272.1/AB2/FPNZZV, Abacus Data Network, V1
Multi-Language Conversational Telephone Speech 2011 -- Turkish was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 18 hours of telephone speech in Turkish. The data were collected primarily to support research and technology evalua...
May 15, 2017
Huang, Ruihong; Jurafsky, Daniel; Riloff, Ellen, 2017, "The EventStatus Corpus", https://hdl.handle.net/11272.1/AB2/EGUSOP, Abacus Data Network, V1
The EventStatus Corpus was developed by researchers at Texas A&M University, Stanford University and The University of Utah. It consists of approximately 3,000 English and 1,500 Spanish news articles about civil unrest events annotated with temporal tags. This corpus...
May 15, 2017
Benowitz, Daniel; Bills, Aric; Conners, Thomas; Dubinski, Eyal; Fiscus, Jonathan; Harper, Mary; Heighway, Melanie; Le, Hanh; Melot, Jennifer; Onaka, Akiko; Ray, Jessica; Rytting, Anton; Shen, Wade; Silber, Ronnie; Tzoukermann, Evelyne, 2017, "IARPA Babel Lao Language Pack IARPA-babel203b-v3.1a", https://hdl.handle.net/11272.1/AB2/ME10OS, Abacus Data Network, V1
IARPA Babel Lao Language Pack IARPA-babel203b-v3.1a was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 207 hours of Lao conversational and scripted telephone speech collected in 2013 along...