101 to 150 of 1,855 Results
Plain Text - 348.0 KB - MD5: f6ecaed4930423f5a5423f3aa255a38c
Documentation
File manifest
Feb 3, 2025
Maamouri, Mohamed; Graff, David, 2025, "Iraqi Arabic - English Lexical Database", https://hdl.handle.net/11272.1/AB2/EUPXQD, Abacus Data Network, V1
Abstract: Iraqi Arabic - English Lexical Database was developed by the Linguistic Data Consortium (LDC). It contains six interrelated tables presenting over 67,000 Iraqi Arabic words as orthographic forms in Arabic script and pronunciation forms in International Phone...
Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea
Documentation
Working with ISO disc images
Optical Disc Image - 6.1 MB - MD5: 5047ba657ed838ab7ed28361cf99a52a
Data
ISO disc image containing all documentation and data
Plain Text - 568 B - MD5: 7ec975313a646a6c4df31a6bb250fe96
Documentation
File manifest
Jan 21, 2025
Tracey, Jennifer; Strassel, Stephanie; Graff, David; Wright, Jonathan; Chen, Song; Ryant, Neville; Kulick, Seth; Griffitt, Kira; Delgado, Dana; Arrigo, Michael, 2025, "LORELEI Yoruba Representative Language Pack", https://hdl.handle.net/11272.1/AB2/ATPB58, Abacus Data Network, V1
Abstract: LORELEI Yoruba Representative Language Pack (LDC2024T10) consists of Yoruba monolingual text, Yoruba-English parallel text, annotations, supplemental resources and related software tools developed by the Linguistic Data Consortium for the DARPA LORELEI progr...
Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea
Documentation
Working with ISO disc images
Optical Disc Image - 460.8 MB - MD5: fdf8c1f55ca588f02ff3959fb038479f
Data
ISO disc image containing all documentation and data
Plain Text - 1.1 MB - MD5: d43196767d4cbedec9f8e50b1db4d57d
Documentation
File manifest
Jan 21, 2025
Hennig, Leonhard; Thomas, Philippe; Möller, Sebastian, 2025, "MultiTACRED", https://hdl.handle.net/11272.1/AB2/GIEQ7J, Abacus Data Network, V1
Abstract: MultiTACRED was developed by the German Research Center for Artificial Intelligence (DFKI) Speech and Language Technology Lab and is a machine translation of TAC Relation Extraction Dataset (LDC2018T24) (TACRED) into twelve languages with projected entity an...
Jan 21, 2025 - MultiTACRED
Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea
Documentation
Working with ISO disc images
Jan 21, 2025 - MultiTACRED
Optical Disc Image - 756.2 MB - MD5: a0fe09e9df6275339122a385da5dfd16
Data
ISO disc image containing all documentation and data
Jan 21, 2025 - MultiTACRED
Plain Text - 2.9 KB - MD5: c116a5c547fa51b981286727f18d8ad1
Documentation
File manifest
Jan 21, 2025
Das, Debopam; Egg, Markus, 2025, "RST Continuity Corpus", https://hdl.handle.net/11272.1/AB2/YSIB2J, Abacus Data Network, V1
Abstract: RST Continuity Corpus was developed at Åbo Akademi University and Humboldt-Universität zu Berlin and contains annotations for continuity dimensions added to RST Discourse Treebank (LDC2002T07). RST Discourse Treebank is a collection of English news texts fro...
Jan 21, 2025 - RST Continuity Corpus
Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea
Documentation
Working with ISO disc images
Jan 21, 2025 - RST Continuity Corpus
Optical Disc Image - 12.4 MB - MD5: c0f5e35cf7c7b61b86c391835c97ce11
Data
ISO disc image containing all documentation and data
Jan 21, 2025 - RST Continuity Corpus
Plain Text - 82.7 KB - MD5: 79de47849d089724428899455e0270ee
Documentation
File manifest
Oct 25, 2024
Larson, Brian N., 2024, "First-Year Law Students' Court Memoranda", https://hdl.handle.net/11272.1/AB2/CC9MT6, Abacus Data Network, V1
Abstract: First-Year Law Students' Court Memoranda consists of 197 English law student writing samples of legal briefs annotated for certain characteristics along with accompanying survey responses by the student writers. The briefs were created in a law school writin...
Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea
Documentation
Working with ISO disc images
Optical Disc Image - 39.0 MB - MD5: cf404d0933694da975b133118172c6d5
Data
ISO disc image containing all documentation and data
Plain Text - 16.7 KB - MD5: a075e855c2ab6c4ff2ca490120beb7fb
Documentation
File manifest
Oct 25, 2024
Hedström, Staffan; Fong, Judy; Þórhallsdóttir, Ragnheiður; Mollberg, David; Guðmundsson, Smári Freyr; Jónsson, Ólafur Helgi; Þorsteinsdóttir, Sunneva; Magnusdottir, Eydis Huld; Gudnason, Jon, 2024, "Samrómur Queries Icelandic Speech 1.0", https://hdl.handle.net/11272.1/AB2/DGPHQR, Abacus Data Network, V1
Abstract: Samrómur Queries Icelandic Speech 1.0 was developed by the Language and Voice Lab, Reykjavik University in cooperation with Almannarómur, Center for Language Technology. The corpus contains 20 hours of Icelandic prompted queries from 3,809 speakers represent...
Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea
Documentation
Working with ISO disc images
Optical Disc Image - 1.2 GB - MD5: e5c487ea38f6865426739e8b19b88c91
Data
ISO disc image containing all documentation and data
Plain Text - 1.0 MB - MD5: 0351d1443f49852f1cb857055c8a67f1
Documentation
File manifest
Oct 25, 2024
Linguistic Data Consortium; ELDA, 2024, "TRAD Arabic-French Parallel Text -- Newswire", https://hdl.handle.net/11272.1/AB2/48BBWO, Abacus Data Network, V1
Abstract: TRAD Arabic-French Parallel Text -- Newswire was developed by ELDA as part of the PEA-TRAD project. It contains French translations of a subset of approximately 20,000 Arabic words from NIST 2008 Open Machine Translation (OpenMT) Evaluation (LDC2010T21). The...
Optical Disc Image - 2.4 MB - MD5: 4fa22f955df1d077a0730ff1da8e693c
Data
ISO disc image containing all documentation and data
Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea
Documentation
Working with ISO disc images
Plain Text - 506 B - MD5: 89af08785e64fb0622fddebf0890a914
Documentation
File manifest
Oct 25, 2024
Linguistic Data Consortium; ELDA, 2024, "TRAD Chinese-French Parallel Text -- Broadcast News", https://hdl.handle.net/11272.1/AB2/IZFPYW, Abacus Data Network, V1
Abstract: TRAD Chinese-French Parallel Text -- Broadcast News was developed by ELDA as part of the PEA-TRAD project. It contains French translations of a subset of approximately 30,000 Chinese characters from GALE Phase 1 Chinese Broadcast News Parallel Text - Part 3...
Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea
Documentation
Working with ISO disc images
Optical Disc Image - 2.2 MB - MD5: d9a889c6b92aa226fed868f768e615a9
Data
ISO disc image containing all documentation and data
Plain Text - 415 B - MD5: da75ec668abea80ffe5ec56d02567ba8
Documentation
File manifest
Oct 25, 2024
Dipartimento di Informatica of the University of Pisa; ILC-CNR; Institute for Language and Speech Processing; Institute of Informatics at the University of Szeged; Institute of Linguistics at the Hungarian Academy of Sciences; Morphologic Ltd., 2024, "2007 CoNLL Shared Task - Greek, Hungarian & Italian", https://hdl.handle.net/11272.1/AB2/JLYA64, Abacus Data Network, V1
Abstract: 2007 CoNLL Shared Task - Greek, Hungarian & Italian consists of dependency treebanks in three languages used as part of the CoNLL 2007 shared task on multi-lingual dependency parsing and domain adaptation. The languages covered in this release are: Greek, Hu...
Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea
Documentation
Working with ISO disc images
Optical Disc Image - 18.8 MB - MD5: 25de0001104276d17e813c97b887bdae
Data
ISO disc image containing all documentation and data
Plain Text - 2.0 KB - MD5: 193159706927caaf95881ba7391db29c
Documentation
File manifest
Oct 25, 2024
Britt, Erica, 2024, "Vehicle City Voices Corpus – Part I", https://hdl.handle.net/11272.1/AB2/8XVBZS, Abacus Data Network, V1
Abstract: Vehicle City Voices Corpus – Part I was developed at the University of Michigan-Flint, and is an ongoing oral history project and survey of English language variation in Flint, Michigan. It contains approximately 16 hours of speech with corresponding transcr...
Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea
Documentation
Working with ISO disc images
Optical Disc Image - 1.9 GB - MD5: 73b3dd2382d696aa67f9e29ca218ade4
Data
ISO disc image containing all documentation and data
Plain Text - 1.9 KB - MD5: 6028d6e554408649ae08556fa513e097
Documentation
File manifest
Oct 25, 2024
Hernández Mena, Carlos Daniel; Herrera, Abel, 2024, "CHM150", https://hdl.handle.net/11272.1/AB2/UWURFR, Abacus Data Network, V1
Abstract: CHM150 (Corpus Hecho en México 150) was developed by the Speech Processing Laboratory of the Faculty of Engineering at the National Autonomous University of Mexico (UNAM) and consists of approximately 1.63 hours of Mexican Spanish speech, associated transcri...
Oct 25, 2024 - CHM150
Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea
Documentation
Working with ISO disc images
Oct 25, 2024 - CHM150
Optical Disc Image - 140.4 MB - MD5: 108dccfcb86160817b075caa7cbd85d1
Data
ISO disc image containing all documentation and data
Oct 25, 2024 - CHM150
Plain Text - 179.9 KB - MD5: 6196bb77b89cedd6a3d8d288f7527639
Documentation
File manifest
Oct 25, 2024
Alfaifi, Abdullah; Atwell, Eric, 2024, "Arabic Learner Corpus", https://hdl.handle.net/11272.1/AB2/DPQWPU, Abacus Data Network, V1
Abstract: Arabic Learner Corpus was developed at the University of Leeds and consists of written essays and spoken recordings by Arabic learners collected in Saudi Arabia in 2012 and 2013. The corpus includes 282,732 words in 1,585 materials, produced by 942 students...
Oct 25, 2024 - Arabic Learner Corpus
Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea
Documentation
Working with ISO disc images
Oct 25, 2024 - Arabic Learner Corpus
Optical Disc Image - 882.8 MB - MD5: 69857ac831622a1ad69d4439764dd127
Data
ISO disc image containing all documentation and data
Oct 25, 2024 - Arabic Learner Corpus
Plain Text - 504.8 KB - MD5: 6047cbf1e28371172e70de64eb69f582
Documentation
File manifest
Oct 25, 2024
Slaney, Malcolm; McRoberts, Gerald; Scheirer, Jocelyn, 2024, "BabyEars Affective Vocalizations", https://hdl.handle.net/11272.1/AB2/VK52W9, Abacus Data Network, V1
Abstract: BabyEars Affective Vocalizations was developed by Malcolm Slaney, Gerald McRoberts, and Jocelyn Scheirer. It contains approximately 22 minutes of spontaneous English speech by 12 adults interacting with their infant children, for a total of 509 infant-direct...