1 to 50 of 322 Results
Jan 26, 2023
Arrigo, Michael; Strassel, Stephanie; Caruso, Christopher, 2023, "CAMIO Transcription Languages", https://hdl.handle.net/11272.1/AB2/IEJLCN, Abacus Data Network, V1
Abstract Introduction CAMIO Transcription Languages was developed by the Linguistic Data Consortium and contains nearly 70,000 images of machine printed text with corresponding annotations and transcripts in the following 13 languages: Arabic, Chinese, English, Farsi, Hindi, Japa...
Jan 25, 2023
Gadalla, Hassan; Kilany, Hanaa; Arram, Howaida; Yacoub, Ashraf; El-Habashi, Alaa; Shalaby, Amr; Karins, Krisjanis; Rowson, Everett; MacIntyre, Robert; Kingsbury, Paul; Graff, David; McLemore, Cynthia, 2023, "CALLHOME Egyptian Arabic Transcripts", https://hdl.handle.net/11272.1/AB2/Y03PCU, Abacus Data Network, V1
Abstract Introduction The text component of the CALLHOME Egyptian Arabic package includes transcripts and documentation files. The transcripts cover a contiguous five or ten minute segment taken from 120 unscripted telephone conversations between native speakers of Egyptian Collo...
Jan 25, 2023
Canavan, Alexandra; Zipperlen, George; Graff, David, 2023, "CALLHOME Egyptian Arabic Speech", https://hdl.handle.net/11272.1/AB2/J3CPAE, Abacus Data Network, V1
Abstract Introduction The CALLHOME Egyptian Arabic corpus of telephone speech consists of 120 unscripted telephone conversations between native speakers of Egyptian Colloquial Arabic (ECA), the spoken variety of Arabic found in Egypt. The dialect of ECA that this dictionary repre...
Jan 25, 2023
Glenn, Meghan; Lee, Haejoong; Strassel, Stephanie; Maeda, Kazuaki, 2023, "GALE Phase 2 Arabic Broadcast News Transcripts Part 1", https://hdl.handle.net/11272.1/AB2/YPCAIR, Abacus Data Network, V1
Abstract Introduction GALE Phase 2 Arabic Broadcast News Transcripts Part 1 was developed by the Linguistic Data Consortium (LDC) and contains transcriptions of approximately 165 hours of Arabic broadcast news speech collected in 2006 and 2007 by LDC, MediaNet, Tunis, Tunisia and...
Jan 25, 2023
Walker, Kevin; Caruso, Christopher; Maeda, Kazuaki; DiPersio, Denise; Strassel, Stephanie, 2023, "GALE Phase 2 Arabic Broadcast News Speech Part 1", https://hdl.handle.net/11272.1/AB2/CXPTR7, Abacus Data Network, V1
Abstract Introduction GALE Phase 2 Arabic Broadcast News Speech Part 1 was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 165 hours of Arabic broadcast news speech collected in 2006 and 2007 by LDC, MediaNet, Tunis, Tunisia and MTC, Rabat, Mor...
Jan 25, 2023
Glenn, Meghan; Lee, Haejoong; Strassel, Stephanie; Maeda, Kazuaki, 2023, "GALE Phase 2 Arabic Broadcast Conversation Transcripts Part 2", https://hdl.handle.net/11272.1/AB2/CS2DU6, Abacus Data Network, V1
Abstract Introduction GALE Phase 2 Arabic Broadcast Conversation Transcripts Part 2 was developed by the Linguistic Data Consortium (LDC) and contains transcriptions of approximately 128 hours of Arabic broadcast conversation speech collected in 2007 by LDC, MediaNet, Tunis, Tuni...
Jan 25, 2023
Walker, Kevin; Caruso, Christopher; Maeda, Kazuaki; DiPersio, Denise; Strassel, Stephanie, 2023, "GALE Phase 2 Arabic Broadcast Conversation Speech Part 2", https://hdl.handle.net/11272.1/AB2/AJ2CAE, Abacus Data Network, V1
Abstract Introduction GALE Phase 2 Arabic Broadcast Conversation Speech Part 2 was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 128 hours of Arabic broadcast conversation speech collected in 2007 by LDC, MediaNet, Tunis, Tunisia and MTC, Rab...
Jan 24, 2023
Ryant, Neville; Liberman, Mark; Fiumara, James; Cieri, Christopher, 2023, "Third DIHARD Challenge Evaluation", https://hdl.handle.net/11272.1/AB2/VQPCKU, Abacus Data Network, V1
Abstract Introduction Third DIHARD Challenge Evaluation was developed by the Linguistic Data Consortium (LDC) and contains approximately 33 hours of English and Chinese speech data along with corresponding annotations used in support of the Third DIHARD Challenge. The DIHARD Chal...
Jan 24, 2023
Liberman, Mark; Yuan, Jiahong; Cieri, Christopher; Wright, Jonathan, 2023, "Global TIMIT Thai", https://hdl.handle.net/11272.1/AB2/JY8T3N, Abacus Data Network, V1
Abstract Introduction Global TIMIT Thai was developed by the Linguistic Data Consortium and consists of approximately 12 hours of read speech and time-aligned transcripts in Standard Thai. The Global TIMIT project aimed to create a series of corpora in a variety of languages with...
Dec 8, 2022
Ryant, Neville; Liberman, Mark; Fiumara, James; Cieri, Christopher, 2022, "Third DIHARD Challenge Development", https://hdl.handle.net/11272.1/AB2/UY5O0X, Abacus Data Network, V1
Abstract Introduction Third DIHARD Challenge Development was developed by the Linguistic Data Consortium (LDC) and contains approximately 34 hours of English and Chinese speech data along with corresponding annotations used in support of the Third DIHARD Challenge. The DIHARD Challen...
Dec 8, 2022
Bies, Ann; Mott, Justin; Warner, Colin; Kulick, Seth, 2022, "BOLT English Translation Treebank - Egyptian Arabic SMS/Chat", https://hdl.handle.net/11272.1/AB2/SPCYLS, Abacus Data Network, V1
Abstract Introduction BOLT English Translation Treebank - Egyptian Arabic SMS/Chat was developed by the Linguistic Data Consortium (LDC) and consists of SMS and chat text data translated from Egyptian Arabic to English and annotated for part-of-speech and syntactic structure. The...
Nov 30, 2022
Byrne, William; Knodt, Eva; Bernstein, Jared; Emami, Farzhad, 2022, "Hispanic-English Database", https://hdl.handle.net/11272.1/AB2/IIJZCH, Abacus Data Network, V1
Abstract Introduction Hispanic-English Database contains approximately 30 hours of English and Spanish conversational and read speech with transcripts (24 hours) and metadata collected from 22 non-native English speakers between 1996 and 1998. The corpus was developed by Entropic...
Nov 30, 2022
Greenberg, Craig; Sadjadi, Omid; Reynolds, Douglas; Singer, Elliot; Graff, David, 2022, "2017 NIST Language Recognition Evaluation Training and Development Sets", https://hdl.handle.net/11272.1/AB2/K7LOKJ, Abacus Data Network, V1
Abstract Introduction 2017 NIST Language Recognition Evaluation Training and Development Sets contains training and development material for the 2017 NIST Language Recognition Evaluation. It consists of approximately 2,100 hours of conversational telephone speech, broadcast conve...
Nov 29, 2022
Tracey, Jennifer; Strassel, Stephanie; Graff, David; Wright, Jonathan; Chen, Song; Ryant, Neville; Kulick, Seth; Griffitt, Kira; Delgado, Dana; Arrigo, Michael, 2022, "LORELEI Bengali Representative Language Pack", https://hdl.handle.net/11272.1/AB2/IG4DBS, Abacus Data Network, V1
Abstract Introduction LORELEI Bengali Representative Language Pack consists of Bengali monolingual text, Bengali-English parallel text, annotations, supplemental resources and related software tools developed by the Linguistic Data Consortium for the DARPA LORELEI program. The LO...
Nov 29, 2022
Lau, Mingfei; Zhong, Muhan; Lau, Chaak-ming; Su, Jian; Chan, Henry; Cheung, Bing, 2022, "Rime-Cantonese: A Normalized Cantonese Jyutping Lexicon", https://hdl.handle.net/11272.1/AB2/URBMXM, Abacus Data Network, V1
Abstract Introduction Rime-Cantonese: A Normalized Cantonese Jyutping Lexicon was developed by the Cantonese Computational Linguistics Infrastructure Working Group. It contains approximately 130,000 Cantonese character, word, and phrase entries paired with their corresponding rom...
Oct 13, 2022
Appen Pty Ltd. Sydney, Australia, 2022, "Gulf Arabic Conversational Telephone Speech", https://hdl.handle.net/11272.1/AB2/SCSMSJ, Abacus Data Network, V1
Abstract Introduction Gulf Arabic Conversational Telephone Speech is a database developed by Appen Pty Ltd., Sydney, Australia and contains roughly 2,800 min of spontaneous telephone conversations in Colloquial Gulf Arabic. This corpus was collected and transcribed in 2004 by App...
Oct 13, 2022
Appen Pty Ltd. Sydney, Australia, 2022, "Iraqi Arabic Conversational Telephone Speech", https://hdl.handle.net/11272.1/AB2/YBQF3Y, Abacus Data Network, V1
Abstract Introduction Iraqi Arabic Conversational Telephone Speech was developed by Appen Pty Ltd, Sydney, Australia and contains roughly 3000 mins of speech from Iraqi Arabic speakers taking part in spontaneous telephone conversations in Colloquial Iraqi Arabic. This corpus was...
Oct 13, 2022
Appen Pty Ltd. Sydney, Australia, 2022, "Gulf Arabic Conversational Telephone Speech, Transcripts", https://hdl.handle.net/11272.1/AB2/ZLBR2M, Abacus Data Network, V1
Abstract Introduction Gulf Arabic Conversational Telephone Speech, Transcripts is a database developed by Appen Pty Ltd., Sydney, Australia and contains transcripts of roughly 2,800 min of spontaneous telephone conversations in Colloquial Gulf Arabic. A total of 976 conversation...
Oct 13, 2022
Appen Pty Ltd. Sydney, Australia, 2022, "Iraqi Arabic Conversational Telephone Speech, Transcripts", https://hdl.handle.net/11272.1/AB2/ELQDGO, Abacus Data Network, V1
Abstract Introduction Iraqi Arabic Conversational Telephone Speech, Transcripts was developed by Appen Pty Ltd, Sydney, Australia and contains transcripts for roughly 3000 mins of speech from Iraqi Arabic speakers taking part in spontaneous telephone conversations in Colloquial I...
Oct 13, 2022
Glenn, Meghan; Lee, Haejoong; Strassel, Stephanie; Maeda, Kazuaki, 2022, "GALE Phase 2 Arabic Broadcast Conversation Transcripts Part 1", https://hdl.handle.net/11272.1/AB2/MZSDMN, Abacus Data Network, V1
Abstract Introduction GALE Phase 2 Arabic Broadcast Conversation Transcripts Part 1 was developed by the Linguistic Data Consortium (LDC) and contains transcriptions of approximately 123 hours of Arabic broadcast conversation speech collected in 2006 and 2007 by LDC, MediaNet, Tu...
Oct 12, 2022
Alsulaiman, Mansour; Muhammad, Ghulam; Abdelkader, Bencherif Mohamed; Mahmood, Awais; Ali, Zulfiqar, 2022, "King Saud University Arabic Speech Database", https://hdl.handle.net/11272.1/AB2/4YVL4A, Abacus Data Network, V1
Abstract Introduction King Saud University Arabic Speech Database was developed by Speech Group (SG) at King Saud University and contains 590 hours of recorded Arabic speech from 269 male and female speakers. The utterances include read and spontaneous speech. The recordings were...
Oct 12, 2022
Walker, Kevin; Caruso, Christopher; Maeda, Kazuaki; DiPersio, Denise; Strassel, Stephanie, 2022, "GALE Phase 2 Arabic Broadcast Conversation Speech Part 1", https://hdl.handle.net/11272.1/AB2/GGD0CB, Abacus Data Network, V1
Abstract Introduction GALE Phase 2 Arabic Broadcast Conversation Speech Part 1 was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 123 hours of Arabic broadcast conversation speech collected in 2006 and 2007 by LDC as part of the DARPA GALE (Gl...
Oct 12, 2022
Cieri, Christopher; Zhan, Juhong; Jiang, Yue; Liberman, Mark; Yuan, Jiahong; Chen, Yiya; Scharenborg, Odette, 2022, "Xi'an Guanzhong Object Naming", https://hdl.handle.net/11272.1/AB2/D2DBLV, Abacus Data Network, V1
Abstract Introduction Xi'an Guanzhong Object Naming is comprised of approximately 15 hours of audio recordings from speakers of the Guanzhong dialect of Mandarin Chinese living in or near Xi'an in Shaanxi Province (China) naming objects that appeared in colored line drawings. Th...
Sep 20, 2022
Li, Xuansong; Strassel, Stephanie; Jones, Karen; Antonishek, Brian; Fiscus, Jonathan G., 2022, "HAVIC MED Novel 2 Test -- Videos, Metadata and Annotation", https://hdl.handle.net/11272.1/AB2/GNUQ1A, Abacus Data Network, V1
Abstract Introduction HAVIC MED Novel 2 Test -- Videos, Metadata and Annotation was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 6,200 hours of user-generated videos with annotation and metadata. To advance multimodal event detection and rel...
Aug 9, 2022
Carvalho, Vitor R.; Kiran, Yigit; Borthwick, Andrew, 2022, "American English Nickname Collection", https://hdl.handle.net/11272.1/AB2/JR1WG6, Abacus Data Network, V1
Abstract Introduction American English Nickname Collection was developed by Intelius, Inc. and is a compilation of American English nicknames to given name mappings based on information in US government records, public web profiles and financial and property reports. This corpus...
Aug 9, 2022
Ahmed, Abdelhamid M.; Myhill, Debra; Abdollahzadeh, Esmaeel; McCallum, Lee; Zaghouani, Wajdi; Rezk, Lameya; Jrad, Anissa; Zhang, Xiao, 2022, "Qatari Corpus of Argumentative Writing", https://hdl.handle.net/11272.1/AB2/F2P2EY, Abacus Data Network, V1
Abstract Introduction Qatari Corpus of Argumentative Writing was developed by Qatar University, University of Exeter and Hamad Bin Khalifa University and is comprised of approximately 200,000 tokens of Arabic and English writing by undergraduate students (159 female, 36 male) alo...
Jul 7, 2022
Ryant, Neville; Liberman, Mark; Fiumara, James; Cieri, Christopher, 2022, "Second DIHARD Challenge Evaluation - Eleven Sources", https://hdl.handle.net/11272.1/AB2/ML7KD5, Abacus Data Network, V1
Abstract Introduction Second DIHARD Challenge Evaluation - Eleven Sources was developed by the Linguistic Data Consortium (LDC) and contains approximately 20 hours of English and Chinese speech data along with corresponding annotations used in support of the Second DIHARD Challen...
Jul 7, 2022
Lewis, Gwyneth; van Rijn, Pol; Gwilliams, Laura; Larrouy-Maestri, Pauline; Poeppel, David; Ghitza, Oded, 2022, "NUBUC", https://hdl.handle.net/11272.1/AB2/IUFKIG, Abacus Data Network, V1
Abstract Introduction NUBUC (NyU-BU contextually controlled stories Corpus) was developed by New York University, Max Planck Institute for Empirical Aesthetics and Boston University. It contains approximately three hours of English read speech from eight stories focused on lingui...
Jun 10, 2022
Tracey, Jennifer; Strassel, Stephanie; Graff, David; Wright, Jonathan; Chen, Song; Ryant, Neville; Griffitt, Kira; Delgado, Dana; Arrigo, Michael, 2022, "LORELEI Wolof Representative Language Pack", https://hdl.handle.net/11272.1/AB2/1M9HI6, Abacus Data Network, V1
Abstract Introduction LORELEI Wolof Representative Language Pack consists of Wolof monolingual text, Wolof-English parallel text, annotations, supplemental resources and related software tools developed by the Linguistic Data Consortium for the DARPA LORELEI program. The LORELEI...
Mar 31, 2022
Li, Xuansong; Strassel, Stephanie; Jones, Karen; Antonishek, Brian; Fiscus, Jonathan G., 2022, "HAVIC MED Novel 1 Test -- Videos, Metadata and Annotation", https://hdl.handle.net/11272.1/AB2/SXVGS7, Abacus Data Network, V1
Abstract Introduction HAVIC MED Novel 1 Test -- Videos, Metadata and Annotation was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 3,800 hours of user-generated videos with annotation and metadata. To advance multimodal event detection and rel...
Mar 31, 2022
Alsaif, Amal; Alyahya, Tasniem; Alotibi, Madawi; Almuzaini, Huda; Alqahtani, Abeer, 2022, "AttImam", https://hdl.handle.net/11272.1/AB2/9FBCBG, Abacus Data Network, V1
Abstract Introduction AttImam was developed by Al-Imam Mohammad Ibn Saud Islamic University and consists of approximately 2,000 attribution relations applied to Arabic newswire text from Arabic Treebank: Part 1 v 4.1 (LDC2010T13). Attribution refers to the process of reporting or...
Mar 18, 2022
Andrus, Tony; Bills, Aric; Corris, Miriam; Dubinski, Eyal; Fiscus, Jonathan G.; Gillies, Breanna; Harper, Mary; Hazen, T. J.; Hefright, Brook; Jarrett, Amy; Le, Hanh; Ray, Jessica; Rytting, Anton; Silber, Ronnie; Shen, Wade; Tzoukermann, Evelyne, 2022, "IARPA Babel Vietnamese Language Pack IARPA-babel107b-v0.7", https://hdl.handle.net/11272.1/AB2/WJGWAP, Abacus Data Network, V1
Abstract Introduction IARPA Babel Vietnamese Language Pack IARPA-babel107b-v0.7 was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 201 hours of Vietnamese conversational and scripted telephone speech co...
Mar 18, 2022
Bills, Aric; Conners, Thomas; Corris, Miriam; David, Anne; Dubinski, Eyal; Fiscus, Jonathan G.; Gann, Ketty; Harper, Mary; Kazi, Michael; Malyska, Nicolas; Melot, Jennifer; Ray, Jessica; Rytting, Anton; Zawaydeh, Bushra, 2022, "IARPA Babel Dholuo Language Pack IARPA-babel403b-v1.0b", https://hdl.handle.net/11272.1/AB2/HSAU9N, Abacus Data Network, V1
Abstract Introduction IARPA Babel Dholuo Language Pack IARPA-babel403b-v1.0b was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 204 hours of Dholuo conversational and scripted telephone speech collected...
Mar 18, 2022
Andresen, Lucy; Bills, Aric; Conners, Thomas; Cruz, Luanne Dela; Dubinski, Eyal; Fiscus, Jonathan G.; Harper, Mary; Le, Hanh; Maurillo, Arlene; Melot, Jennifer; Phillips, Josh; Ray, Jessica; Rytting, Anton; Shen, Wade; Silber, Ronnie; Tzoukermann, Evelyne, 2022, "IARPA Babel Cebuano Language Pack IARPA-babel301b-v2.0b", https://hdl.handle.net/11272.1/AB2/3EYPZM, Abacus Data Network, V1
Abstract Introduction IARPA Babel Cebuano Language Pack IARPA-babel301b-v2.0b was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 191 hours of Cebuano conversational and scripted telephone speech collect...
Mar 18, 2022
Lulich, Steven M.; Alwan, Abeer; Sommers, Mitchell S.; Yeung, Gary, 2022, "The Child Subglottal Resonances Database", https://hdl.handle.net/11272.1/AB2/O4SRBR, Abacus Data Network, V1
Abstract Introduction The Child Subglottal Resonances Database was developed by Washington University and University of California Los Angeles and consists of 15.5 hours of simultaneous microphone and subglottal accelerometer recordings of 19 male and 9 female child speakers of A...
Mar 18, 2022
Vijayalakshmi, P.; Celin, T. A. Mariya; Nagarajan, T., 2022, "The SSNCE Database of Tamil Dysarthric Speech", https://hdl.handle.net/11272.1/AB2/QXP9LM, Abacus Data Network, V1
Abstract Introduction The SSNCE Database of Tamil Dysarthric Speech was developed by the Speech Lab, SSN College of Engineering, India, in collaboration with the Indian National Institute of Empowerment of Persons with Multiple Disabilities (NIEPMD) and contains approximately eig...
Mar 18, 2022
Tracey, Jennifer; Strassel, Stephanie; Graff, David; Wright, Jonathan; Chen, Song; Ryant, Neville; Ma, Xiaoyi; Kulick, Seth; Delgado, Dana; Arrigo, Michael, 2022, "LORELEI Ukrainian Representative Language Pack", https://hdl.handle.net/11272.1/AB2/GUYCZL, Abacus Data Network, V1
Abstract Introduction LORELEI Ukrainian Representative Language Pack consists of Ukrainian monolingual text, Ukrainian-English parallel and comparable text, annotations, supplemental resources and related software tools developed by the Linguistic Data Consortium for the DARPA LO...
Mar 18, 2022
Tracey, Jennifer; Graff, David; Strassel, Stephanie; Arrigo, Michael; Wright, Jonathan; Bies, Ann, 2022, "LORELEI Tigrinya Incident Language Pack", https://hdl.handle.net/11272.1/AB2/CTYB7Q, Abacus Data Network, V1
Abstract Introduction LORELEI Tigrinya Incident Language Pack was developed by the Linguistic Data Consortium and is comprised of approximately 4.5 million words of Tigrinya monolingual text, 25,000 words of English monolingual text, 235,000 words of parallel and comparable Tigri...
Mar 18, 2022
Palmer, Martha; Hwang, Jena D.; Bonial, Claire; O'Gorman, Tim; Gung, James; Stowe, Kevin; Green, Meredith, 2022, "BOLT English PropBank and Sense -- Discussion Forum, SMS/Chat, and Conversational Telephone Speech", https://hdl.handle.net/11272.1/AB2/QABG8N, Abacus Data Network, V1
Abstract Introduction BOLT English PropBank and Sense -- Discussion Forum, SMS/Chat, and Conversational Telephone Speech was developed by the University of Colorado Boulder - CLEAR (Computational Language and Education Research) and consists of propbank and verb sense disambiguat...
Mar 18, 2022
Agarwal, Nitin; Franchini, Michelle; Kappler, Michelle; Micciulla, Linnea; Pradhan, Sameer; Ramshaw, Lance, 2022, "BOLT English Co-reference -- Discussion Forum, SMS/Chat, and Conversational Telephone Speech", https://hdl.handle.net/11272.1/AB2/3JEVXI, Abacus Data Network, V1
Abstract Introduction BOLT English Co-reference -- Discussion Forum, SMS/Chat, and Conversational Telephone Speech was developed by Raytheon BBN Technologies and consists of co-reference annotation on English discussion forum (DF), SMS/Chat and conversational telephone speech (CT...
Mar 18, 2022
Chen, Song; Strassel, Stephanie; Mott, Justin, 2022, "DEFT Chinese Light and Rich ERE Annotation", https://hdl.handle.net/11272.1/AB2/MUVS7U, Abacus Data Network, V1
Abstract Introduction DEFT Chinese Light and Rich ERE Annotation was developed by the Linguistic Data Consortium (LDC) and consists of 157 Chinese discussion forum documents annotated for entities, relations and events (ERE). DARPA's Deep Exploration and Filtering of Text (DEFT)...
Mar 18, 2022
Ellis, Joe; Getman, Jeremy; Chen, Song; Strassel, Stephanie, 2022, "TAC KBP Event Argument - Comprehensive Training and Evaluation Data 2016-2017", https://hdl.handle.net/11272.1/AB2/KSIXIZ, Abacus Data Network, V1
Abstract Introduction TAC KBP Event Argument - Comprehensive Training and Evaluation Data 2016-2017 was developed by the Linguistic Data Consortium (LDC) and contains training and evaluation data produced in support of the 2016 TAC KBP Event Argument Linking Pilot and Evaluation...
Mar 18, 2022
Tracey, Jennifer; Strassel, Stephanie; Graff, David; Wright, Jonathan; Chen, Song; Ryant, Neville; Kulick, Seth; Griffitt, Kira; Delgado, Dana; Arrigo, Michael, 2022, "LORELEI Vietnamese Representative Language Pack", https://hdl.handle.net/11272.1/AB2/JWPEIA, Abacus Data Network, V1
Abstract Introduction LORELEI Vietnamese Representative Language Pack consists of Vietnamese monolingual text, Vietnamese-English parallel text, annotations, supplemental resources and related software tools developed by the Linguistic Data Consortium for the DARPA LORELEI progra...
Mar 18, 2022
Li, Xuansong; Grimes, Stephen; Strassel, Stephanie, 2022, "BOLT Chinese-English Word Alignment and Tagging -- Conversational Telephone Speech Training", https://hdl.handle.net/11272.1/AB2/N2DIGA, Abacus Data Network, V1
Abstract Introduction BOLT Chinese-English Word Alignment and Tagging -- Conversational Telephone Speech Training was developed by the Linguistic Data Consortium (LDC) and consists of 158,651 words of Chinese and English parallel text enhanced with linguistic tags to indicate wor...
Mar 18, 2022
Chen, Eric Y.; Lu, Zhiyun; Xu, Hao; Cao, Liangliang; Zhang, Yu; Fan, James, 2022, "Speech Sentiment Annotations", https://hdl.handle.net/11272.1/AB2/HD3CEY, Abacus Data Network, V1
Abstract Introduction Speech Sentiment Annotations was developed by Google Inc. It consists of sentiment labels (positive, negative, neutral) for approximately 49,500 utterances covering 140 hours of audio from Switchboard-1 Release 2 (LDC97S62). Switchboard-1 Release 2 consists...
Mar 18, 2022
Bies, Ann; Ellis, Joe; Getman, Jeremy; Chen, Song; Strassel, Stephanie, 2022, "TAC KBP English Event Nugget Detection and Coreference - Comprehensive Training and Evaluation Data 2014-2015", https://hdl.handle.net/11272.1/AB2/UHLXHR, Abacus Data Network, V1
Abstract Introduction TAC KBP English Event Nugget Detection and Coreference - Comprehensive Training and Evaluation Data 2014-2015 was developed by the Linguistic Data Consortium (LDC) and contains training and evaluation data produced in support of the TAC KBP English Event Nug...
Mar 18, 2022
Wang, Shichang; Huang, Chu-Ren; Yao, Yao; Chan, Angel, 2022, "SemTransCNC", https://hdl.handle.net/11272.1/AB2/TV07UB, Abacus Data Network, V1
Abstract Introduction SemTransCNC was developed by The Hong Kong Polytechnic University. It is comprised of a semantic transparency dataset of Chinese nominal compounds built using a series of crowd-based experiments. Nominal compounds were selected from the Sinica Corpus and a m...
Mar 18, 2022
Ellis, Joe; Getman, Jeremy; Strassel, Stephanie, 2022, "TAC KBP English Temporal Slot Filling - Comprehensive Training and Evaluation Data 2011 and 2013", https://hdl.handle.net/11272.1/AB2/TVJSBF, Abacus Data Network, V1
Abstract Introduction TAC KBP English Temporal Slot Filling - Comprehensive Training and Evaluation Data 2011 and 2013 was developed by the Linguistic Data Consortium (LDC) and contains training and evaluation data produced in support of the TAC KBP English Temporal Slot Filling...
Mar 18, 2022
Damonte, Marco; Cohen, Shay, 2022, "Abstract Meaning Representation 2.0 - Four Translations", https://hdl.handle.net/11272.1/AB2/5OU0AQ, Abacus Data Network, V1
Abstract Introduction Abstract Meaning Representation 2.0 - Four Translations was developed by researchers at the University of Edinburgh, School of Informatics and consists of Spanish, German, Italian and Chinese Mandarin translations of a subset of sentences from Abstract Meani...
Mar 18, 2022
Santus, Enrico; Liu, Hongchao; Huang, Chu-Ren, 2022, "EVALution", https://hdl.handle.net/11272.1/AB2/JQ231B, Abacus Data Network, V1
Abstract Introduction EVALution was developed by The Hong Kong Polytechnic University. It is comprised of English and Mandarin Chinese data sets -- EVALution 1.0 and EVALution-Man, respectively -- that contain semantic relations and metadata for training and evaluating distributi...