301 to 350 of 2,452 Results
Mar 18, 2022 - Linguistic Data Consortium
Bills, Aric; Conners, Thomas; Corris, Miriam; David, Anne; Dubinski, Eyal; Fiscus, Jonathan G.; Gann, Ketty; Harper, Mary; Kazi, Michael; Malyska, Nicolas; Melot, Jennifer; Ray, Jessica; Rytting, Anton; Zawaydeh, Bushra, 2022, "IARPA Babel Dholuo Language Pack IARPA-babel403b-v1.0b", https://hdl.handle.net/11272.1/AB2/HSAU9N, Abacus Data Network, V1
Abstract Introduction IARPA Babel Dholuo Language Pack IARPA-babel403b-v1.0b was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 204 hours of Dholuo conversational and scripted telephone speech collected...
Mar 18, 2022 - Linguistic Data Consortium
Andresen, Lucy; Bills, Aric; Conners, Thomas; Cruz, Luanne Dela; Dubinski, Eyal; Fiscus, Jonathan G.; Harper, Mary; Le, Hanh; Maurillo, Arlene; Melot, Jennifer; Phillips, Josh; Ray, Jessica; Rytting, Anton; Shen, Wade; Silber, Ronnie; Tzoukermann, Evelyne, 2022, "IARPA Babel Cebuano Language Pack IARPA-babel301b-v2.0b", https://hdl.handle.net/11272.1/AB2/3EYPZM, Abacus Data Network, V1
Abstract Introduction IARPA Babel Cebuano Language Pack IARPA-babel301b-v2.0b was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 191 hours of Cebuano conversational and scripted telephone speech collect...
Mar 18, 2022 - Linguistic Data Consortium
Lulich, Steven M.; Alwan, Abeer; Sommers, Mitchell S.; Yeung, Gary, 2022, "The Child Subglottal Resonances Database", https://hdl.handle.net/11272.1/AB2/O4SRBR, Abacus Data Network, V1
Abstract Introduction The Child Subglottal Resonances Database was developed by Washington University and University of California Los Angeles and consists of 15.5 hours of simultaneous microphone and subglottal accelerometer recordings of 19 male and 9 female child speakers of A...
Mar 18, 2022 - Linguistic Data Consortium
Vijayalakshmi, P.; Celin, T. A. Mariya; Nagarajan, T., 2022, "The SSNCE Database of Tamil Dysarthric Speech", https://hdl.handle.net/11272.1/AB2/QXP9LM, Abacus Data Network, V1
Abstract Introduction The SSNCE Database of Tamil Dysarthric Speech was developed by the Speech Lab, SSN College of Engineering, India, in collaboration with the Indian National Institute of Empowerment of Persons with Multiple Disabilities (NIEPMD) and contains approximately eig...
Mar 18, 2022 - Linguistic Data Consortium
Tracey, Jennifer; Strassel, Stephanie; Graff, David; Wright, Jonathan; Chen, Song; Ryant, Neville; Ma, Xiaoyi; Kulick, Seth; Delgado, Dana; Arrigo, Michael, 2022, "LORELEI Ukrainian Representative Language Pack", https://hdl.handle.net/11272.1/AB2/GUYCZL, Abacus Data Network, V1
Abstract Introduction LORELEI Ukrainian Representative Language Pack consists of Ukrainian monolingual text, Ukrainian-English parallel and comparable text, annotations, supplemental resources and related software tools developed by the Linguistic Data Consortium for the DARPA LO...
Mar 18, 2022 - Linguistic Data Consortium
Tracey, Jennifer; Graff, David; Strassel, Stephanie; Arrigo, Michael; Wright, Jonathan; Bies, Ann, 2022, "LORELEI Tigrinya Incident Language Pack", https://hdl.handle.net/11272.1/AB2/CTYB7Q, Abacus Data Network, V1
Abstract Introduction LORELEI Tigrinya Incident Language Pack was developed by the Linguistic Data Consortium and is comprised of approximately 4.5 million words of Tigrinya monolingual text, 25,000 words of English monolingual text, 235,000 words of parallel and comparable Tigri...
Mar 18, 2022 - Linguistic Data Consortium
Palmer, Martha; Hwang, Jena D.; Bonial, Claire; O'Gorman, Tim; Gung, James; Stowe, Kevin; Green, Meredith, 2022, "BOLT English PropBank and Sense -- Discussion Forum, SMS/Chat, and Conversational Telephone Speech", https://hdl.handle.net/11272.1/AB2/QABG8N, Abacus Data Network, V1
Abstract Introduction BOLT English PropBank and Sense -- Discussion Forum, SMS/Chat, and Conversational Telephone Speech was developed by the University of Colorado Boulder - CLEAR (Computational Language and Education Research) and consists of propbank and verb sense disambiguat...
Mar 18, 2022 - Linguistic Data Consortium
Agarwal, Nitin; Franchini, Michelle; Kappler, Michelle; Micciulla, Linnea; Pradhan, Sameer; Ramshaw, Lance, 2022, "BOLT English Co-reference -- Discussion Forum, SMS/Chat, and Conversational Telephone Speech", https://hdl.handle.net/11272.1/AB2/3JEVXI, Abacus Data Network, V1
Abstract Introduction BOLT English Co-reference -- Discussion Forum, SMS/Chat, and Conversational Telephone Speech was developed by Raytheon BBN Technologies and consists of co-reference annotation on English discussion forum (DF), SMS/Chat and conversational telephone speech (CT...
Mar 18, 2022 - Linguistic Data Consortium
Chen, Song; Strassel, Stephanie; Mott, Justin, 2022, "DEFT Chinese Light and Rich ERE Annotation", https://hdl.handle.net/11272.1/AB2/MUVS7U, Abacus Data Network, V1
Abstract Introduction DEFT Chinese Light and Rich ERE Annotation was developed by the Linguistic Data Consortium (LDC) and consists of 157 Chinese discussion forum documents annotated for entities, relations and events (ERE). DARPA's Deep Exploration and Filtering of Text (DEFT)...
Mar 18, 2022 - Linguistic Data Consortium
Ellis, Joe; Getman, Jeremy; Chen, Song; Strassel, Stephanie, 2022, "TAC KBP Event Argument - Comprehensive Training and Evaluation Data 2016-2017", https://hdl.handle.net/11272.1/AB2/KSIXIZ, Abacus Data Network, V1
Abstract Introduction TAC KBP Event Argument - Comprehensive Training and Evaluation Data 2016-2017 was developed by the Linguistic Data Consortium (LDC) and contains training and evaluation data produced in support of the 2016 TAC KBP Event Argument Linking Pilot and Evaluation...
Mar 18, 2022 - Linguistic Data Consortium
Tracey, Jennifer; Strassel, Stephanie; Graff, David; Wright, Jonathan; Chen, Song; Ryant, Neville; Kulick, Seth; Griffitt, Kira; Delgado, Dana; Arrigo, Michael, 2022, "LORELEI Vietnamese Representative Language Pack", https://hdl.handle.net/11272.1/AB2/JWPEIA, Abacus Data Network, V1
Abstract Introduction LORELEI Vietnamese Representative Language Pack consists of Vietnamese monolingual text, Vietnamese-English parallel text, annotations, supplemental resources and related software tools developed by the Linguistic Data Consortium for the DARPA LORELEI progra...
Mar 18, 2022 - Linguistic Data Consortium
Li, Xuansong; Grimes, Stephen; Strassel, Stephanie, 2022, "BOLT Chinese-English Word Alignment and Tagging -- Conversational Telephone Speech Training", https://hdl.handle.net/11272.1/AB2/N2DIGA, Abacus Data Network, V1
Abstract Introduction BOLT Chinese-English Word Alignment and Tagging -- Conversational Telephone Speech Training was developed by the Linguistic Data Consortium (LDC) and consists of 158,651 words of Chinese and English parallel text enhanced with linguistic tags to indicate wor...
Mar 18, 2022 - Linguistic Data Consortium
Chen, Eric Y.; Lu, Zhiyun; Xu, Hao; Cao, Liangliang; Zhang, Yu; Fan, James, 2022, "Speech Sentiment Annotations", https://hdl.handle.net/11272.1/AB2/HD3CEY, Abacus Data Network, V1
Abstract Introduction Speech Sentiment Annotations was developed by Google Inc. It consists of sentiment labels (positive, negative, neutral) for approximately 49,500 utterances covering 140 hours of audio from Switchboard-1 Release 2 (LDC97S62). Switchboard-1 Release 2 consists...
Mar 18, 2022 - Linguistic Data Consortium
Bies, Ann; Ellis, Joe; Getman, Jeremy; Chen, Song; Strassel, Stephanie, 2022, "TAC KBP English Event Nugget Detection and Coreference - Comprehensive Training and Evaluation Data 2014-2015", https://hdl.handle.net/11272.1/AB2/UHLXHR, Abacus Data Network, V1
Abstract Introduction TAC KBP English Event Nugget Detection and Coreference - Comprehensive Training and Evaluation Data 2014-2015 was developed by the Linguistic Data Consortium (LDC) and contains training and evaluation data produced in support of the TAC KBP English Event Nug...
Mar 18, 2022 - Linguistic Data Consortium
Wang, Shichang; Huang, Chu-Ren; Yao, Yao; Chan, Angel, 2022, "SemTransCNC", https://hdl.handle.net/11272.1/AB2/TV07UB, Abacus Data Network, V1
Abstract Introduction SemTransCNC was developed by The Hong Kong Polytechnic University. It is comprised of a semantic transparency dataset of Chinese nominal compounds built using a series of crowd-based experiments. Nominal compounds were selected from the Sinica Corpus and a m...
Mar 18, 2022 - Linguistic Data Consortium
Ellis, Joe; Getman, Jeremy; Strassel, Stephanie, 2022, "TAC KBP English Temporal Slot Filling - Comprehensive Training and Evaluation Data 2011 and 2013", https://hdl.handle.net/11272.1/AB2/TVJSBF, Abacus Data Network, V1
Abstract Introduction TAC KBP English Temporal Slot Filling - Comprehensive Training and Evaluation Data 2011 and 2013 was developed by the Linguistic Data Consortium (LDC) and contains training and evaluation data produced in support of the TAC KBP English Temporal Slot Filling...
Mar 18, 2022 - Linguistic Data Consortium
Damonte, Marco; Cohen, Shay, 2022, "Abstract Meaning Representation 2.0 - Four Translations", https://hdl.handle.net/11272.1/AB2/5OU0AQ, Abacus Data Network, V1
Abstract Introduction Abstract Meaning Representation 2.0 - Four Translations was developed by researchers at the University of Edinburgh, School of Informatics and consists of Spanish, German, Italian and Chinese Mandarin translations of a subset of sentences from Abstract Meani...
Mar 18, 2022 - Linguistic Data Consortium
Santus, Enrico; Liu, Hongchao; Huang, Chu-Ren, 2022, "EVALution", https://hdl.handle.net/11272.1/AB2/JQ231B, Abacus Data Network, V1
Abstract Introduction EVALution was developed by The Hong Kong Polytechnic University. It is comprised of English and Mandarin Chinese data sets -- EVALution 1.0 and EVALution-Man, respectively -- that contain semantic relations and metadata for training and evaluating distributi...
Mar 18, 2022 - Linguistic Data Consortium
Alshaari, Mohamed; ElHarati, Hussien; Kepuska, Veton, 2022, "Phonemes of Arabic", https://hdl.handle.net/11272.1/AB2/WSRL3A, Abacus Data Network, V1
Abstract Introduction Phonemes of Arabic was developed at the Florida Institute of Technology. It consists of approximately one hour of speech from native Arabic speakers that includes all Arabic sounds (consonants and vowels) and 24 words with specific consonant-vowel patterns....
Mar 18, 2022 - Linguistic Data Consortium
Jiang, Yue; Zhan, Juhong; Han, Hongjian; Xu, Zuohao; Zhou, Haiyan; Yuan, Jiahong; Liberman, Mark, 2022, "Global TIMIT Mandarin Chinese-Guanzhong Dialect", https://hdl.handle.net/11272.1/AB2/MFTAUQ, Abacus Data Network, V1
Abstract Introduction Global TIMIT Mandarin Chinese-Guanzhong Dialect was developed by the Linguistic Data Consortium and Xi'an Jiaotong University and consists of approximately five hours of read speech and transcripts in the Guanzhong dialect of Mandarin Chinese as spoken in Sh...
Mar 18, 2022 - Linguistic Data Consortium
Ding, Hongwei; Liao, Sishi; Zhan, Yuqing; Feng, Hui; He, Wenchao; Hu, Xiaoyan; Wu, Yu; Yuan, Jiahong; Liberman, Mark, 2022, "Global TIMIT Learner Simple English", https://hdl.handle.net/11272.1/AB2/NMUWWH, Abacus Data Network, V1
Abstract Introduction Global TIMIT Learner Simple English was developed by the Linguistic Data Consortium and Shanghai Jiao Tong University and consists of approximately 12 hours of L1 and L2 English read speech and transcripts. The Global TIMIT project aimed to create a series o...
Mar 18, 2022 - Linguistic Data Consortium
Luan, Huan; Wang, Yanhong; Feng, Hui; He, Wenchao; Hu, Xiaoyan; Wu, Yu; Yuan, Jiahong; Liberman, Mark, 2022, "Global TIMIT Learner Treebank English", https://hdl.handle.net/11272.1/AB2/A2ZRDI, Abacus Data Network, V1
Abstract Introduction Global TIMIT Learner Treebank English was developed by the Linguistic Data Consortium and LAIX Inc. and consists of approximately 24 hours of L1 and L2 English read speech and transcripts. The Global TIMIT project aimed to create a series of corpora in a var...
Mar 18, 2022 - Linguistic Data Consortium
Canavan, Alexandra; Zipperlen, George; Bartlett, John, 2022, "CALLFRIEND American English-Southern Dialect Second Edition", https://hdl.handle.net/11272.1/AB2/O0EZK5, Abacus Data Network, V1
Abstract Introduction CALLFRIEND American English-Southern Dialect Second Edition was developed by LDC and consists of approximately 26 hours of unscripted telephone conversations between native speakers of Southern dialects of American English. This second edition updates the au...
Mar 18, 2022 - Linguistic Data Consortium
Canavan, Alexandra; Zipperlen, George; Bartlett, John, 2022, "CALLFRIEND Mandarin Chinese-Taiwan Dialect Second Edition", https://hdl.handle.net/11272.1/AB2/AT8NRM, Abacus Data Network, V1
Abstract Introduction CALLFRIEND Mandarin Chinese-Taiwan Dialect Second Edition was developed by the Linguistic Data Consortium (LDC) and consists of approximately 27 hours of unscripted telephone conversations between native speakers of the Taiwan dialect of Mandarin Chinese. Th...
Mar 18, 2022 - Linguistic Data Consortium
Chen, Song; Yuan, Jiahong; Ma, Xiaoyi; Strassel, Stephanie, 2022, "Chinese Lexical Resources for Gender, Number, Animacy", https://hdl.handle.net/11272.1/AB2/2CSZDM, Abacus Data Network, V1
Abstract Introduction Chinese Lexical Resources for Gender, Number, Animacy was developed by the Linguistic Data Consortium (LDC) and consists of gender, number, and animacy lexicons produced in support of the DARPA DEFT program. Gender, number and animacy are lexical indicators...
Mar 18, 2022 - Linguistic Data Consortium
Glenn, Meghan; Lee, Haejoong; Strassel, Stephanie; Maeda, Kazuaki, 2022, "GALE Phase 4 Chinese Broadcast News Transcripts", https://hdl.handle.net/11272.1/AB2/TVASI8, Abacus Data Network, V1
Abstract Introduction GALE Phase 4 Chinese Broadcast News Transcripts was developed by the Linguistic Data Consortium (LDC) and contains transcriptions of approximately 134 hours of Chinese broadcast news speech collected in 2008 by LDC and Hong Kong University of Science and Technolo...
Mar 18, 2022 - Linguistic Data Consortium
Hirschberg, Julia; Gravano, Agustin; Benus, Stefan; Ward, Gregory; Sneed German, Elisa, 2022, "Columbia Games Corpus", https://hdl.handle.net/11272.1/AB2/TPZYOR, Abacus Data Network, V1
Abstract Introduction Columbia Games Corpus was developed by the Spoken Language Group, Columbia University and the Department of Linguistics, Northwestern University. It consists of approximately 10 hours of spontaneous English conversation along with corresponding orthographic...
Mar 18, 2022 - Linguistic Data Consortium
Mohammadi, Ariana Negar, 2022, "Corpus of Law, Academic, and News", https://hdl.handle.net/11272.1/AB2/VMWYC0, Abacus Data Network, V1
Abstract Introduction Corpus of Law, Academic, and News consists of 400 Persian documents divided into three genres: legal, academic, and news. The legal section contains texts from official publications, including the civil penal code, the criminal penal code, and the constituti...
Mar 18, 2022 - Linguistic Data Consortium
Kroch, Anthony, 2022, "Penn Parsed Corpora of Historical English", https://hdl.handle.net/11272.1/AB2/NWMKHI, Abacus Data Network, V1
Abstract Introduction Penn Parsed Corpora of Historical English was developed at the University of Pennsylvania and consists of running texts and text samples of British English prose from the earliest Middle English documents (1100 CE) up to the period of the First World War (19...
Mar 18, 2022 - Linguistic Data Consortium
Jiang, Yue; Zhan, Juhong; Han, Hongjian; Xu, Zuohao; Zhou, Haiyan; Yuan, Jiahong; Liberman, Mark, 2022, "Global TIMIT Mandarin Chinese-Guanzhong Dialect", https://hdl.handle.net/11272.1/AB2/FF5DX5, Abacus Data Network, V1
Abstract Introduction Global TIMIT Mandarin Chinese-Guanzhong Dialect was developed by the Linguistic Data Consortium and Xi'an Jiaotong University and consists of approximately five hours of read speech and transcripts in the Guanzhong dialect of Mandarin Chinese as spoken in Sh...
Mar 18, 2022 - Linguistic Data Consortium
Bills, Aric; Conners, Thomas; David, Anne; Cruz, Luanne Dela; Dubinski, Eyal; Fiscus, Jonathan G.; Gann, Ketty; Harper, Mary; Kazi, Michael; Le, Hanh; Malyska, Nicolas; Melot, Jennifer; Ray, Jessica; Richardson, Fred; Rytting, Anton; Zwanenburg, Jacqui, 2022, "IARPA Babel Javanese Language Pack IARPA-babel402b-v1.0b", https://hdl.handle.net/11272.1/AB2/BBDKDK, Abacus Data Network, V1
Abstract Introduction IARPA Babel Javanese Language Pack IARPA-babel402b-v1.0b was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 204 hours of Javanese conversational and scripted telephone speech colle...
Mar 18, 2022 - Linguistic Data Consortium
Andresen, Lucy; Bills, Aric; Brugman, Claudia; Conners, Thomas; David, Anne; Dubinski, Eyal; Fiscus, Jonathan G.; Gann, Ketty; Harper, Mary; Kazi, Michael; Le, Hanh; Malyska, Nicolas; Maurillo, Arlene; Melot, Jennifer; Paget, Shelley; Prebble, Jane Elizabeth; Ray, Jessica; Richardson, Fred; Rytting, Anton; Shen, Sinney, 2022, "IARPA Babel Guarani Language Pack IARPA-babel305b-v1.0c", https://hdl.handle.net/11272.1/AB2/C2XGCW, Abacus Data Network, V1
Abstract Introduction IARPA Babel Guarani Language Pack IARPA-babel305b-v1.0c was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 198 hours of Guarani conversational and scripted telephone speech collect...
Mar 18, 2022 - Linguistic Data Consortium
Benowitz, Daniel; Bills, Aric; Conners, Thomas; David, Anne; Dubinski, Eyal; Fiscus, Jonathan G.; Harper, Mary; Hefright, Brook; Le, Hanh; Melot, Jennifer; Ray, Jessica; Rytting, Anton; Shen, Sinney; Smith, Rosanna; Shen, Wade; Silber, Ronnie; Tzoukermann, Evelyne, 2022, "IARPA Babel Lithuanian Language Pack IARPA-babel304b-v1.0b", https://hdl.handle.net/11272.1/AB2/5MR7Z2, Abacus Data Network, V1
Abstract Introduction IARPA Babel Lithuanian Language Pack IARPA-babel304b-v1.0b was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 210 hours of Lithuanian conversational and scripted telephone speech c...
Mar 18, 2022 - Linguistic Data Consortium
Linguistic Data Consortium, 2022, "2007 CoNLL Shared Task - Arabic & English", https://hdl.handle.net/11272.1/AB2/X7AEOJ, Abacus Data Network, V1
Abstract Introduction 2007 CoNLL Shared Task - Arabic & English consists of dependency treebanks in two languages used as part of the CoNLL 2007 shared task on multi-lingual dependency parsing and domain adaptation. The languages covered in this release are Arabic and English. LD...
Mar 18, 2022 - Linguistic Data Consortium
University of the Basque Country; Technical University of Catalunya; Charles University; Middle East Technical University; Sabanci University, 2022, "2007 CoNLL Shared Task - Basque, Catalan, Czech & Turkish", https://hdl.handle.net/11272.1/AB2/R8ZR6Q, Abacus Data Network, V1
Abstract Introduction 2007 CoNLL Shared Task - Basque, Catalan, Czech & Turkish consists of dependency treebanks in four languages used as part of the CoNLL 2007 shared task on multi-lingual dependency parsing and domain adaptation. The languages covered in this release are: Basq...
Mar 18, 2022 - Linguistic Data Consortium
Bills, Aric; Conners, Thomas; Corris, Miriam; Dubinski, Eyal; Fiscus, Jonathan G.; Harper, Mary; Heighway, Melanie; Kozlov, Kirill; Malyska, Nicolas; Melot, Jennifer; Ray, Jessica; Rytting, Anton; Shen, Wade; Silber, Ronnie; Tzoukermann, Evelyne, 2022, "IARPA Babel Tok Pisin Language Pack IARPA-babel207b-v1.0e", https://hdl.handle.net/11272.1/AB2/CTDWII, Abacus Data Network, V1
Abstract Introduction IARPA Babel Tok Pisin Language Pack IARPA-babel207b-v1.0e was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 200 hours of Tok Pisin conversational and scripted telephone speech col...
Mar 18, 2022 - Linguistic Data Consortium
Bills, Aric; Conners, Thomas; Dubinski, Eyal; Fiscus, Jonathan G.; Harper, Mary; Heighway, Melanie; Lin, Willa; Melot, Jennifer; Paget, Shelley; Ray, Jessica; Roomi, Bergul; Rytting, Anton; Shen, Wade; Silber, Ronnie; Tzoukermann, Evelyne; Zwanenburg, Jacqui, 2022, "IARPA Babel Kurmanji Kurdish Language Pack IARPA-babel205b-v1.0a", https://hdl.handle.net/11272.1/AB2/HRUQMM, Abacus Data Network, V1
Abstract Introduction IARPA Babel Kurmanji Kurdish Language Pack IARPA-babel205b-v1.0a was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 203 hours of Kurmanji Kurdish conversational and scripted teleph...
Mar 18, 2022 - Linguistic Data Consortium
Adams, Nikki; Bills, Aric; Conners, Thomas; Dubinski, Eyal; Fiscus, Jonathan G.; Harper, Mary; Lin, Willa; Melot, Jennifer; Ray, Jessica; Rytting, Anton; Shen, Wade; Silber, Ronnie; Tzoukermann, Evelyne; Wong, Jamie, 2022, "IARPA Babel Zulu Language Pack IARPA-babel206b-v0.1e", https://hdl.handle.net/11272.1/AB2/SJQNLO, Abacus Data Network, V1
Abstract Introduction IARPA Babel Zulu Language Pack IARPA-babel206b-v0.1e was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 211 hours of Zulu conversational and scripted telephone speech collected in...
Mar 18, 2022 - Linguistic Data Consortium
Andrus, Tony; Bills, Aric; Conners, Thomas; Crabb, Erin Smith; Dubinski, Eyal; Fiscus, Jonathan G.; Gillies, Breanna; Harper, Mary; Hazen, T. J.; Hefright, Brook; Jarrett, Amy; Le, Hanh; Ray, Jessica; Rytting, Anton; Shen, Wade; Silber, Ronnie; Tzoukermann, Evelyne, 2022, "IARPA Babel Haitian Creole Language Pack IARPA-babel201b-v0.2b", https://hdl.handle.net/11272.1/AB2/O4K5VU, Abacus Data Network, V1
Abstract Introduction IARPA Babel Haitian Creole Language Pack IARPA-babel201b-v0.2b was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 203 hours of Haitian Creole conversational and scripted telephone...
Mar 18, 2022 - Linguistic Data Consortium
Richie, Carolyn; Warburton, Sarah; Carter, Megan, 2022, "Audiovisual Database of Spoken American English", https://hdl.handle.net/11272.1/AB2/8KIBXB, Abacus Data Network, V1
Abstract Introduction The Audiovisual Database of Spoken American English, Linguistic Data Consortium (LDC) catalog number LDC2009V01 and ISBN 1-58563-496-4, was developed at Butler University, Indianapolis, IN in 2007 for use by a variety of researchers to evaluate speech prod...
Mar 18, 2022 - Linguistic Data Consortium
Fung, Pascale; Huang, Shudong; Graff, David, 2022, "HKUST Mandarin Telephone Transcript Data, Part 1", https://hdl.handle.net/11272.1/AB2/UOHG3I, Abacus Data Network, V1
Abstract Introduction HKUST Mandarin Telephone Transcript Data Part 1 was developed by Hong Kong University of Science and Technology (HKUST) and contains transcripts for 897 telephone conversations in Mandarin Chinese. In 2004 HKUST was contracted to collect and transcribe 200 h...
Mar 18, 2022 - Linguistic Data Consortium
Fung, Pascale; Huang, Shudong; Graff, David, 2022, "HKUST Mandarin Telephone Speech, Part 1", https://hdl.handle.net/11272.1/AB2/TKM8OR, Abacus Data Network, V1
Abstract Introduction HKUST Mandarin Telephone Speech, Part 1 was developed by Hong Kong University of Science and Technology (HKUST) and contains approximately 149 hours of conversational telephone speech (CTS) in Mandarin. Given that Standard Mandarin is not the native dialect...
Mar 16, 2022 - Statistics Canada - DLI
Statistics Canada, 2022, "Postal Code Conversion File, March 2022 Postal Codes, 2022", https://hdl.handle.net/11272.1/AB2/UAGJKN, Abacus Data Network, V1
The Postal Code Project is responsible for linking the approximately 900,000 single postal codes in Canada to Statistics Canada’s Census dissemination geography (presently 2021 Census geography). This process is performed by using data provided by Canada Post Corporation and lin...
Mar 11, 2022 - Statistics Canada Open License
Statistics Canada, 2022, "Preliminary dataset on confirmed cases of COVID-19, Public Health Agency of Canada [custom extraction]", https://hdl.handle.net/11272.1/AB2/ME03PG, Abacus Data Network, V1, UNF:6:3Eliht9aisreZk+zSfJPtw== [fileUNF]
This dataset provides Canadians and researchers with preliminary data on the confirmed cases of coronavirus (COVID-19) in Canada. Given the rapidly-evolving nature of this situation, these data are considered preliminary. The dataset was downloaded from Statistics Canada as a CSV...
Mar 2, 2022 - Statistics Canada - DLI
Statistics Canada, 2021, "Postal Code Conversion File, August 2021 Postal Codes, 2021", https://hdl.handle.net/11272.1/AB2/HJPB6W, Abacus Data Network, V2
The Postal Code Project is responsible for linking the approximately 900,000 single postal codes in Canada to Statistics Canada’s Census dissemination geography (presently 2016 Census geography). This process is performed by using data provided by Canada Post Corporation and lin...
Feb 9, 2022 - Statistics Canada Open License
Statistics Canada, 2022, "National Travel Survey, 2020", https://hdl.handle.net/11272.1/AB2/VM9BXS, Abacus Data Network, V1, UNF:6:4stdrvj4lIxqeysb1XWD7A== [fileUNF]
The National Travel Survey (NTS) was developed to fully replace the Travel Survey of Residents of Canada (TSRC record number 3810) and replace the Canadian resident component of the International Travel Survey (ITS record number 3152). The National Travel Survey collects informat...
Feb 7, 2022 - Linguistic Data Consortium
Tracey, Jennifer; Graff, David; Strassel, Stephanie; Arrigo, Michael; Wright, Jonathan; Bies, Ann, 2022, "LORELEI Kinyarwanda Incident Language Pack", https://hdl.handle.net/11272.1/AB2/P1OIX0, Abacus Data Network, V1
Abstract Introduction LORELEI Kinyarwanda Incident Language Pack was developed by the Linguistic Data Consortium and is comprised of approximately 11.9 million words of Kinyarwanda monolingual text, 35,000 words of English monolingual text, 3.4 million words of parallel and compa...
Feb 7, 2022 - Linguistic Data Consortium
Byers, Frederick, 2022, "2017 NIST OpenSAT Pilot - SSSF", https://hdl.handle.net/11272.1/AB2/PTU0AQ, Abacus Data Network, V1
Abstract Introduction 2017 NIST OpenSAT Pilot - SSSF was developed by NIST (National Institute of Standards and Technology) and contains approximately one hour of operational speech data, transcripts and annotation files used in the speech activity detection, automatic speech rec...
Feb 7, 2022 - Linguistic Data Consortium
Bies, Ann; Mott, Justin; Warner, Colin; Kulick, Seth, 2022, "BOLT English Translation Treebank - Chinese SMS/Chat", https://hdl.handle.net/11272.1/AB2/JBOOKU, Abacus Data Network, V1
Abstract Introduction BOLT English Translation Treebank - Chinese SMS/Chat was developed by the Linguistic Data Consortium (LDC) and consists of SMS and chat text data translated from Chinese to English and annotated for part-of-speech and syntactic structure. The DARPA BOLT (Bro...
Feb 4, 2022 - Statistics Canada Open License
Statistics Canada, 2022, "Canadian Income Survey, 2018", https://hdl.handle.net/11272.1/AB2/G6T0LC, Abacus Data Network, V1, UNF:6:RlzI4LxHQ+ZRmY8Hn85cuw== [fileUNF]
The primary objective of the Canadian Income Survey (CIS) is to provide information on the income and income sources of Canadians, along with their individual and household characteristics. The data collected in the CIS is combined with Labour Force Survey (LFS, record number 370...