301 to 350 of 403 Results
May 15, 2017
Chamberlain, Jon; Poesio, Massimo; Kruschwitz, Udo, 2017, "Phrase Detectives Corpus", https://hdl.handle.net/11272.1/AB2/NN2QFX, Abacus Data Network, V1
Introduction Phrase Detectives Corpus was developed by the School of Computer Science and Electronic Engineering at the University of Essex and consists of approximately 19,012 words across 40 documents anaphorically-annotated by the Phrase Detectives game, an online interactive...
Apr 17, 2017
Vincent, Emmanuel; Barker, Jon; Watanabe, Shinji; Le Roux, Jonathan; Nesta, Francesco; Matassoni, Marco, 2017, "CHiME2 Grid", https://hdl.handle.net/11272.1/AB2/ASLFRE, Abacus Data Network, V1
Introduction CHiME2 Grid was developed as part of The 2nd CHiME Speech Separation and Recognition Challenge and contains approximately 120 hours of English speech from a noisy living room environment. The CHiME Challenges focus on distant-microphone automatic speech recognition (...
Apr 17, 2017
Song, Zhiyi; Fore, Dana; Strassel, Stephanie; Lee, Haejoong; Wright, Jonathan, 2017, "BOLT Egyptian Arabic SMS/Chat and Transliteration", https://hdl.handle.net/11272.1/AB2/7I6ANJ, Abacus Data Network, V1
Introduction BOLT Egyptian Arabic SMS/Chat and Transliteration was developed by the Linguistic Data Consortium (LDC) and consists of naturally-occurring Short Message Service (SMS) and Chat (CHT) data collected through data donations and live collection involving native speakers...
Mar 17, 2017
Li, Xuansong; Grimes, Stephen; Strassel, Stephanie; Ma, Xiaoyi; Xue, Nianwen; Marcus, Mitch; Taylor, Ann, 2017, "GALE English-Chinese Parallel Aligned Treebank -- Training", https://hdl.handle.net/11272.1/AB2/QROJQB, Abacus Data Network, V1
Introduction GALE English-Chinese Parallel Aligned Treebank – Training was developed by the Linguistic Data Consortium (LDC) and contains 196,123 tokens of word aligned English and Chinese parallel text with treebank annotations. This material was used as training data in the DAR...
Mar 17, 2017
Song, Zhiyi; Garland, Jennifer; Walker, Christopher; Strassel, Stephanie, 2017, "BOLT Chinese Discussion Forum Parallel Training Data", https://hdl.handle.net/11272.1/AB2/EWIO27, Abacus Data Network, V1
Introduction BOLT Chinese Discussion Forum Parallel Training Data was developed by the Linguistic Data Consortium (LDC) and consists of 1,876,799 tokens of Chinese discussion forum data collected for the DARPA BOLT program along with their corresponding English translations. The...
Feb 15, 2017
Walker, Kevin; Caruso, Christopher; Maeda, Kazuaki; DiPersio, Denise; Strassel, Stephanie, 2017, "GALE Phase 3 Arabic Broadcast News Speech Part 2", https://hdl.handle.net/11272.1/AB2/SRRGAW, Abacus Data Network, V1
Introduction GALE Phase 3 Arabic Broadcast News Speech Part 2 was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 128 hours of Arabic broadcast news speech collected in 2007 by the Linguistic Data Consortium (LDC), MediaNet, Tunis, Tunisia and...
Jan 19, 2017
Andrus, Tony; Bills, Aric; Corris, Miriam; Dubinski, Eyal; Fiscus, Jonathan; Gillies, Breanna; Harper, Mary; Hazen, T. J.; Hefright, Brook; Jarrett, Amy; Le, Hanh; Ray, Jessica; Rytting, Anton; Silber, Ronnie; Shen, Wade; Tzoukermann, Evelyne, 2017, "IARPA Babel Vietnamese Language Pack IARPA-babel107b-v0.7", https://hdl.handle.net/11272.1/AB2/CSHOZ8, Abacus Data Network, V1
Introduction IARPA Babel Vietnamese Language Pack IARPA-babel107b-v0.7 was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 201 hours of Vietnamese conversational and scripted telephone speech collected i...
Dec 15, 2016
Ellis, Joe; Getman, Jeremy; Strassel, Stephanie, 2016, "TAC KBP Spanish Cross-lingual Entity Linking - Comprehensive Training and Evaluation Data 2012-2014", https://hdl.handle.net/11272.1/AB2/HL83QO, Abacus Data Network, V1
Introduction TAC KBP Spanish Cross-Lingual Entity Linking - Comprehensive Training and Evaluation Data 2012-2014 was developed by the Linguistic Data Consortium (LDC) and contains training and evaluation data produced in support of the TAC KBP Spanish Cross-lingual Entity Linking...
Dec 15, 2016
Bamba, Moussa, 2016, "Bamanankan Lexicon", https://hdl.handle.net/11272.1/AB2/OOCBVZ, Abacus Data Network, V1
Introduction Bamanankan Lexicon was developed by the Linguistic Data Consortium (LDC) and contains 5,978 entries of the Bamanankan language presented as a Bamanankan-English lexicon and a Bamanankan-French lexicon. It is the third publication in an LDC project to build an electro...
Dec 15, 2016
Song, Zhiyi; Krug, Gary; Strassel, Stephanie, 2016, "GALE Phase 4 Arabic Newswire Parallel Sentences", https://hdl.handle.net/11272.1/AB2/R1M8ZY, Abacus Data Network, V1
Introduction GALE Phase 4 Arabic Newswire Parallel Sentences was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phase 4 of the DARPA GALE (Global Autonomous Language Exploitation) Program....
Dec 15, 2016
Conners, Thomas; Fiscus, Jonathan; Gillies, Breanna; Harper, Mary; Hazen, T. J.; Jarrett, Amy; Lin, Willa; Molina, María Encarnación Pérez; Rafalko, Shawna; Ray, Jessica; Rytting, Anton; Shen, Wade; Tzoukermann, Evelyne, 2016, "IARPA Babel Tagalog Language Pack IARPA-babel106-v0.2g", https://hdl.handle.net/11272.1/AB2/IULTZX, Abacus Data Network, V1
Introduction IARPA Babel Tagalog Language Pack IARPA-babel106-v0.2g was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 213 hours of Tagalog conversational and scripted telephone speech collected in 2012...
Nov 15, 2016
Song, Zhiyi; Krug, Gary; Strassel, Stephanie, 2016, "GALE Phase 3 and 4 Chinese Newswire Parallel Text", https://hdl.handle.net/11272.1/AB2/KYZUJ0, Abacus Data Network, V1
Introduction GALE Phase 3 and 4 Chinese Newswire Parallel Text was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phases 3 and 4 of the DARPA GALE (Global Autonomous Language Exploitation)...
Nov 15, 2016
Jones, Karen; Graff, David; Walker, Kevin; Strassel, Stephanie, 2016, "Multi-Language Conversational Telephone Speech 2011 -- Slavic Group", https://hdl.handle.net/11272.1/AB2/OL5RQH, Abacus Data Network, V1
Introduction Multi-Language Conversational Telephone Speech 2011 -- Slavic Group was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 60 hours of telephone speech in each of three distinct Slavic languages: Polish, Russian and Ukrainian. The data...
Nov 15, 2016
Bills, Aric; David, Anne; Dubinski, Eyal; Fiscus, Jonathan; Hammond, Simon; Gann, Ketty; Harper, Mary; Hefright, Brook; Kazi, Michael; Lam, Julie; Ray, Jessica; Richardson, Fred; Rytting, Anton; Walter, Marle, 2016, "IARPA Babel Georgian Language Pack IARPA-babel404b-v1.0a", https://hdl.handle.net/11272.1/AB2/W0TIWB, Abacus Data Network, V1
Introduction IARPA Babel Georgian Language Pack IARPA-babel404b-v1.0a was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 190 hours of Georgian conversational and scripted telephone speech collected in 2...
Oct 19, 2016
Andresen, Jess; Bills, Aric; Dubinski, Eyal; Fiscus, Jonathan; Gillies, Breanna; Harper, Mary; Hazen, T. J.; Jarrett, Amy; Roomi, Bergul; Ray, Jessica; Rytting, Anton; Shen, Wade; Tzoukermann, Evelyne, 2016, "IARPA Babel Turkish Language Pack IARPA-babel105b-v0.5", https://hdl.handle.net/11272.1/AB2/GYXA1F, Abacus Data Network, V1
Introduction IARPA Babel Turkish Language Pack IARPA-babel105b-v0.5 was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 213 hours of Turkish conversational and scripted telephone speech collected in 2012...
Oct 19, 2016
O'Gorman, Tim; Palmer, Martha, 2016, "Richer Event Description", https://hdl.handle.net/11272.1/AB2/H5RQJH, Abacus Data Network, V1
Introduction Richer Event Description was developed by the University of Colorado Boulder-CLEAR (Computational Language and Education Research), Carnegie Mellon University and LDC. It consists of coreference, bridging and event-event relations (temporal, causal, subevent and repor...
Sep 15, 2016
Adams, Nikki; Bills, Aric; Fiscus, Jonathan; Gillies, Breanna; Harper, Mary; Hazen, T. J.; Jarrett, Amy; Khugyani, Kamila; Lin, Willa; Ray, Jessica; Rytting, Anton; Shen, Wade; Strahan, Tania; Tzoukermann, Evelyne, 2016, "IARPA Babel Pashto Language Pack IARPA-babel104b-v0.4bY", https://hdl.handle.net/11272.1/AB2/GLFN3X, Abacus Data Network, V1
Introduction IARPA Babel Pashto Language Pack IARPA-babel104b-v0.4bY was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 214 hours of Pashto conversational and scripted telephone speech collected in 2011...
Sep 15, 2016
Tratz, Stephen, 2016, "ARL Arabic Dependency Treebank", https://hdl.handle.net/11272.1/AB2/GKAG4O, Abacus Data Network, V1
Introduction ARL Arabic Dependency Treebank was developed by the US Army Research Laboratory (ARL) and was derived from four LDC resources: Arabic Treebank (ATB) Part 1 v 4.1 (LDC2010T13), Part 2 v 3.1 (LDC2011T09), Part 3 v 3.2 (LDC2010T08) and Broadcast News v 1.0 (LDC2012T07)....
Aug 16, 2016
Bills, Aric; David, Anne; Dubinski, Eyal; Fiscus, Jonathan; Gillies, Breanna; Gnanadesikan, Amalia; Harper, Mary; Hammond, Simon; Jarrett, Amy; Molina, María; Ray, Jessica; Rytting, Anton; Paget, Shelly; Shen, Wade; Silber, Ronnie; Tzoukermann, Evelyne; Wong, Jamie, 2016, "IARPA Babel Assamese Language Pack IARPA-babel102b-v0.5a", https://hdl.handle.net/11272.1/AB2/9JCM5S, Abacus Data Network, V1
Introduction IARPA Babel Assamese Language Pack IARPA-babel102b-v0.5a was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 205 hours of Assamese conversational and scripted telephone speech collected in 2...
Aug 16, 2016
Bills, Aric; David, Anne; Dubinski, Eyal; Fiscus, Jonathan; Gillies, Breanna; Harper, Mary; Jarrett, Amy; Molina, María; Ray, Jessica; Rytting, Anton; Paget, Shelly; Shen, Wade; Silber, Ronnie; Tzoukermann, Evelyne; Wong, Jamie, 2016, "IARPA Babel Bengali Language Pack IARPA-babel103b-v0.4b", https://hdl.handle.net/11272.1/AB2/WKL40N, Abacus Data Network, V1
Introduction IARPA Babel Bengali Language Pack IARPA-babel103b-v0.4b was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 215 hours of Bengali conversational and scripted telephone speech collected in 201...
Aug 15, 2016
Walker, Kevin; Caruso, Christopher; Maeda, Kazuaki; DiPersio, Denise; Strassel, Stephanie, 2016, "GALE Phase 3 Arabic Broadcast News Speech Part 1", https://hdl.handle.net/11272.1/AB2/B0XGQD, Abacus Data Network, V1
GALE Phase 3 Arabic Broadcast News Speech Part 1 was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 132 hours of Arabic broadcast news speech collected in 2007 by the Linguistic Data Consortium (LDC), MediaNet, Tunis, Tunisia and MTC, Rabat, M...
Aug 15, 2016
Glenn, Meghan; Lee, Haejoong; Strassel, Stephanie; Maeda, Kazuaki, 2016, "GALE Phase 3 Arabic Broadcast News Transcripts Part 1", https://hdl.handle.net/11272.1/AB2/IQOADN, Abacus Data Network, V1
GALE Phase 3 Arabic Broadcast News Transcripts Part 1 was developed by the Linguistic Data Consortium (LDC) and contains transcriptions of approximately 132 hours of Arabic broadcast news speech collected in 2007 by the Linguistic Data Consortium (LDC), MediaNet, Tunis, Tunisia a...
Jul 19, 2016
Andrus, Tony; Dubinski, Eyal; Fiscus, Jonathan G.; Gillies, Breanna; Harper, Mary; Hazen, T. J.; Hefright, Brook; Jarrett, Amy; Lin, Willa; Ray, Jessica; Rytting, Anton; Shen, Wade; Tzoukermann, Evelyne; Wong, Jamie, 2016, "IARPA Babel Cantonese Language Pack IARPA-babel101b-v0.4c", https://hdl.handle.net/11272.1/AB2/01SD6T, Abacus Data Network, V1
IARPA Babel Cantonese Language Pack IARPA-babel101b-v0.4c was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 215 hours of Cantonese conversational and scripted telephone speech collected in 2011 along w...
Jul 15, 2016
Song, Zhiyi; Krug, Gary; Jiang, Zixin; Strassel, Stephanie, 2016, "GALE Phase 3 and 4 Chinese Broadcast News Parallel Text", https://hdl.handle.net/11272.1/AB2/CE2DP3, Abacus Data Network, V1
Introduction GALE Phase 3 and 4 Chinese Broadcast News Parallel Text was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phases 3 and 4 of the DARPA GALE (Global Autonomous Language Exploit...
Jul 15, 2016
Muir, Kate; Joinson, Adam; Cotterill, Rachel; Dewdney, Nigel, 2016, "English Speed Networking Conversational Transcripts", https://hdl.handle.net/11272.1/AB2/LX2FQA, Abacus Data Network, V1
Introduction English Speed Networking Conversational Transcripts was developed at the University of the West of England and contains 388 transcripts of English face-to-face and instant messaging conversations about business ideas collected in 2014 and 2015 from participants (unde...
Jul 15, 2016
Kretzschmar Jr., William; Bounds, Paulina; Hettel, Jacqueline; Coats, Steven; Pederson, Lee; Opas-Hänninen, Lisa Lena; Juuso, Ilkka; Seppänen, Tapio, 2016, "Digital Archive of Southern Speech - NLP Version", https://hdl.handle.net/11272.1/AB2/F4QH6S, Abacus Data Network, V1
Introduction Digital Archive of Southern Speech - NLP Version (DASS-NLP) was developed by LDC as an alternate version of Digital Archive of Southern Speech (DASS) (LDC2012S03) suitable for natural language processing and human language technology applications. Specifically, the o...
Jun 15, 2016
Song, Zhiyi; Krug, Gary; Jiang, Zixin; Strassel, Stephanie, 2016, "GALE Phase 4 Arabic Weblog Parallel Sentences", https://hdl.handle.net/11272.1/AB2/3GAMIQ, Abacus Data Network, V1
Introduction GALE Phase 4 Arabic Weblog Parallel Sentences was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phase 4 of the DARPA GALE (Global Autonomous Language Exploitation) Program. T...
Jun 15, 2016
Xue, Nianwen; Zhang, Xiuhong; Jiang, Zixin; Palmer, Martha; Xia, Fei; Chiou, Fu-Dong; Chang, Meiyu, 2016, "Chinese Treebank 9.0", https://hdl.handle.net/11272.1/AB2/YYY4FY, Abacus Data Network, V1
Introduction Chinese Treebank 9.0 consists of approximately two million words of annotated and parsed text from Chinese newswire, government documents, magazine articles, various broadcast news and broadcast conversation programs, web newsgroups, weblogs, discussion forums, chat...
May 16, 2016
Glenn, Meghan; Lee, Haejoong; Strassel, Stephanie; Maeda, Kazuaki, 2016, "GALE Phase 4 Chinese Broadcast Conversation Transcripts", https://hdl.handle.net/11272.1/AB2/QOKU34, Abacus Data Network, V1
Introduction GALE Phase 4 Chinese Broadcast Conversation Transcripts was developed by the Linguistic Data Consortium (LDC) and contains transcriptions of approximately 172 hours of Chinese broadcast conversation speech collected in 2008 by LDC and Hong Kong University of Science...
May 16, 2016
Walker, Kevin; Caruso, Christopher; Maeda, Kazuaki; DiPersio, Denise; Strassel, Stephanie, 2016, "GALE Phase 4 Chinese Broadcast Conversation Speech", https://hdl.handle.net/11272.1/AB2/Y6ZKMX, Abacus Data Network, V1
Introduction GALE Phase 4 Chinese Broadcast Conversation Speech was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 172 hours of Mandarin Chinese broadcast conversation speech collected in 2008 by LDC and Hong Kong University of Science and Tec...
Apr 18, 2016
Song, Zhiyi; Krug, Gary; Strassel, Stephanie, 2016, "GALE Phase 4 Arabic Broadcast Conversation Parallel Sentences", https://hdl.handle.net/11272.1/AB2/FGSLZN, Abacus Data Network, V1
Introduction GALE Phase 4 Arabic Broadcast Conversation Parallel Sentences was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phase 4 of the DARPA GALE (Global Autonomous Language Exploita...
Apr 18, 2016
Tracey, Jennifer; Strassel, Stephanie; Morris, Amanda; Li, Xuansong; Antonishek, Brian; Fiscus, Jonathan, 2016, "HAVIC Pilot Transcription", https://hdl.handle.net/11272.1/AB2/ODUSVC, Abacus Data Network, V1
Introduction HAVIC Pilot Transcription was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 72 hours of user-generated videos with transcripts based on the English speech audio extracted from the videos. This data set was created in collaboratio...
Apr 18, 2016
Berkling, Kay, 2016, "H1 Children's Writing", https://hdl.handle.net/11272.1/AB2/OJCHNV, Abacus Data Network, V1
Introduction H1 Children's Writing was developed by the Cooperative State University Baden-Württemberg, University of Education. It consists of 996 texts written over three months by 88 German school children age seven through eleven years. The data in this corpus was collected b...
Mar 15, 2016
Song, Zhiyi; Krug, Gary; Strassel, Stephanie, 2016, "GALE Phase 3 and 4 Chinese Broadcast Conversation Parallel Text", https://hdl.handle.net/11272.1/AB2/JVLMY4, Abacus Data Network, V1
Introduction GALE Phase 3 and 4 Chinese Broadcast Conversation Parallel Text was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phases 3 and 4 of the DARPA GALE (Global Autonomous Language...
Mar 15, 2016
Song, Zhiyi; Krug, Gary; Strassel, Stephanie, 2016, "GALE Phase 3 and 4 Arabic Web Parallel Text", https://hdl.handle.net/11272.1/AB2/MRTO1E, Abacus Data Network, V1
Introduction GALE Phase 3 and 4 Arabic Web Parallel Text was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phases 3 and 4 of the DARPA GALE (Global Autonomous Language Exploitation) Progr...
Mar 15, 2016
Glenn, Meghan; Tracey, Jennifer; Fore, Dana; Strassel, Stephanie, 2016, "DEFT Narrative Text", https://hdl.handle.net/11272.1/AB2/L4XDPK, Abacus Data Network, V1
Introduction DEFT Narrative Text was developed by the Linguistic Data Consortium (LDC) and contains proxy reports and their source newswire used to support DARPA's Deep Exploration and Filtering of Text (DEFT) program. Among the goals of the DEFT program is to develop technologie...
Feb 16, 2016
Glenn, Meghan; Lee, Haejoong; Strassel, Stephanie; Maeda, Kazuaki, 2016, "GALE Phase 3 Arabic Broadcast Conversation Transcripts Part 2", https://hdl.handle.net/11272.1/AB2/M1HQPS, Abacus Data Network, V1
Introduction GALE Phase 3 Arabic Broadcast Conversation Transcripts Part 2 was developed by the Linguistic Data Consortium (LDC) and contains transcriptions of approximately 129 hours of Arabic broadcast conversation speech collected in 2007 and 2008 by LDC, MediaNet, Tunis, Tuni...
Feb 15, 2016
Tracey, Jennifer; Lee, Haejoong; Strassel, Stephanie; Song, Zhiyi, 2016, "BOLT Chinese Discussion Forums", https://hdl.handle.net/11272.1/AB2/YW237I, Abacus Data Network, V1
Introduction BOLT Chinese Discussion Forums was developed by the Linguistic Data Consortium (LDC) and consists of 1,597,500 discussion forum threads in Chinese harvested from the Internet using a combination of manual and automatic processes. The DARPA BOLT (Broad Operational Lan...
Feb 15, 2016
Walker, Kevin; Caruso, Christopher; Maeda, Kazuaki; DiPersio, Denise; Strassel, Stephanie, 2016, "GALE Phase 3 Arabic Broadcast Conversation Speech Part 2", https://hdl.handle.net/11272.1/AB2/DB68HK, Abacus Data Network, V1
Introduction GALE Phase 3 Arabic Broadcast Conversation Speech Part 2 was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 129 hours of Arabic broadcast conversation speech collected in 2007 and 2008 by LDC, MediaNet, Tunis, Tunisia and MTC, Rab...
Jan 18, 2016
Song, Zhiyi; Krug, Gary; Strassel, Stephanie, 2016, "GALE Phase 4 Chinese Weblog Parallel Sentences", https://hdl.handle.net/11272.1/AB2/EHYDWM, Abacus Data Network, V1
Introduction GALE Phase 4 Chinese Weblog Parallel Sentences was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phase 4 of the DARPA GALE (Global Autonomous Language Exploitation) Program....
Jan 18, 2016
Sauri, Roser; Domingo, Judith; Badia, Toni, 2016, "NewSoMe Corpus of Opinion in Blogs", https://hdl.handle.net/11272.1/AB2/BVFQAJ, Abacus Data Network, V1
Introduction NewSoMe Corpus of Opinion in Blogs was compiled at Barcelona Media and consists of English and Spanish blogs annotated for opinions. It is part of the NewSoMe (News and Social Media) set of corpora presenting opinion annotations across several genres and covering mul...
Jan 18, 2016
Maamouri, Mohamed; Bies, Ann; Kulick, Seth; Krouna, Sondos; Tabassi, Dalila; Ciul, Michael, 2016, "Arabic Treebank - Weblog", https://hdl.handle.net/11272.1/AB2/MLNQA9, Abacus Data Network, V1
Introduction Arabic Treebank - Weblog was developed by the Linguistic Data Consortium (LDC) and consists of Arabic weblog data with part-of-speech, morphology, gloss and syntactic tree annotation. The ongoing Penn Arabic Treebank Project (PATB) supports research in Arabic-languag...
Jan 1, 2016
Song, Zhiyi; Krug, Gary; Strassel, Stephanie, 2016, "GALE Phase 4 Arabic Broadcast News Parallel Sentences", https://hdl.handle.net/11272.1/AB2/HWLA9Y, Abacus Data Network, V1
Introduction GALE Phase 4 Arabic Broadcast News Parallel Sentences was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phase 4 of the DARPA GALE (Global Autonomous Language Exploitation) Pr...
Jan 1, 2016
Flickinger, Dan; Hajič, Jan; Ivanova, Angelina; Kuhlmann, Marco; Miyao, Yusuke; Oepen, Stephan; Zeman, Daniel, 2016, "SDP 2014 & 2015: Broad Coverage Semantic Dependency Parsing", https://hdl.handle.net/11272.1/AB2/FDYP3O, Abacus Data Network, V1
Introduction SDP 2014 & 2015: Broad Coverage Semantic Dependency Parsing consists of data, tools, system results, and publications associated with the 2014 and 2015 tasks on Broad-Coverage Semantic Dependency Parsing (SDP) conducted in conjunction with the International Workshop...
Jan 1, 2016
Li, Xuansong; Peterson, Katherine; Grimes, Stephen; Strassel, Stephanie, 2016, "BOLT Chinese-English Word Alignment and Tagging -- Discussion Forum Training", https://hdl.handle.net/11272.1/AB2/CVWHSG, Abacus Data Network, V1
Introduction BOLT Chinese-English Word Alignment and Tagging -- Discussion Forum Training was developed by the Linguistic Data Consortium (LDC) and consists of 448,094 words of Chinese and English parallel text enhanced with linguistic tags to indicate word relations. The DARPA B...
Dec 15, 2015
Glenn, Meghan; Lee, Haejoong; Strassel, Stephanie; Maeda, Kazuaki, 2015, "GALE Phase 3 Chinese Broadcast News Transcripts", https://hdl.handle.net/11272.1/AB2/1FSF7R, Abacus Data Network, V1
Introduction GALE Phase 3 Chinese Broadcast News Transcripts was developed by the Linguistic Data Consortium (LDC) and contains transcriptions of approximately 150 hours of Chinese broadcast news speech collected in 2007 and 2008 by LDC and Hong Kong University of Science and Technolo...
Dec 15, 2015
Walker, Kevin; Caruso, Christopher; Maeda, Kazuaki; DiPersio, Denise; Strassel, Stephanie, 2015, "GALE Phase 3 Chinese Broadcast News Speech", https://hdl.handle.net/11272.1/AB2/S6DMZM, Abacus Data Network, V1
Introduction GALE Phase 3 Chinese Broadcast News Speech was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 150 hours of Mandarin Chinese broadcast news speech collected in 2007 and 2008 by LDC and Hong Kong University of Science and Technology (HKU...
Nov 16, 2015
Song, Zhiyi; Krug, Gary; Strassel, Stephanie, 2015, "GALE Phase 4 Chinese Newswire Parallel Sentences", https://hdl.handle.net/11272.1/AB2/WAB9QW, Abacus Data Network, V1
Introduction GALE Phase 4 Chinese Newswire Parallel Sentences was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phase 4 of the DARPA GALE (Global Autonomous Language Exploitation) Program...
Nov 16, 2015
Schatz, Thomas; Cao, Xuan-Nga; Kolesnikova, Anna; Bergvelt, Tomas; Wright, Jonathan; Dupoux, Emmanuel, 2015, "Articulation Index LSCP", https://hdl.handle.net/11272.1/AB2/8BVW8H, Abacus Data Network, V1
Introduction Articulation Index LSCP was developed by researchers at Laboratoire de Sciences Cognitives et Psycholinguistique (LSCP), Ecole Normale Supérieure. It revises and enhances a subset of Articulation Index (AIC) (LDC2005S22), a corpus of persons speaking English syllable...
Oct 15, 2015
Walker, Christopher; Song, Zhiyi; Maeda, Kazuaki, 2015, "ACE 2007 Spanish DevTest - Pilot Evaluation", https://hdl.handle.net/11272.1/AB2/GBDW4K, Abacus Data Network, V1
Introduction ACE 2007 Spanish DevTest was developed by the Linguistic Data Consortium (LDC). This publication contains the complete set of Spanish development and test data to support the 2007 Automatic Content Extraction (ACE) technology evaluation, namely, newswire data annotat...