Aug 14, 2025
Arrigo, Michael; Delgado, Dana; Strassel, Stephanie; Graff, David, 2025, "IWSLT 2022-2023 Shared Task Training, Development and Test Set", https://hdl.handle.net/11272.1/AB2/ONUJ54, Abacus Data Network, V1
Abstract: IWSLT 2022-2023 Shared Task Training, Development and Test Set was developed by the Linguistic Data Consortium (LDC). It contains 210 hours of Tunisian Arabic conversational telephone speech, transcripts and their English translations covering 175 hours of...
Plain Text - 218.8 KB
MD5: 0d75f120af3b88f1180df5f2dfe6346c
File manifest for disc 1
Optical Disc Image - 4.3 GB
MD5: ad7314c945f61b989c11b9ac6697b6ac
ISO disc image containing all documentation and data: disc 1
Plain Text - 204.8 KB
MD5: 4feb59006445543997ee5a3450f62a62
File manifest for disc 2
Optical Disc Image - 3.3 GB
MD5: 823f98962b928ebcff977225a0f33fc0
ISO disc image containing all documentation and data: disc 2
Plain Text - 1.3 KB
MD5: 4d4231d07ac669e105f71e602457efea
Working with ISO disc images
Aug 14, 2025
Cieri, Christopher; Fiumara, James; Walker, Kevin; Liberman, Mark; Ryant, Neville, 2025, "AnnoDIFP Session Audio and Transcripts", https://hdl.handle.net/11272.1/AB2/OGBCJ9, Abacus Data Network, V1
Abstract: AnnoDIFP (Annotated Data for the Investigation of Facets of Personality) Session Audio and Transcripts was developed by the Linguistic Data Consortium (LDC), the Florida Institute of Technology (FIT), and the University of New Haven (UNH) to support algorith...
Aug 14, 2025 - AnnoDIFP Session Audio and Transcripts
Plain Text - 159.7 KB
MD5: a0fb25c92d6550897ecf4973c9d2eabb
File manifest
Aug 14, 2025 - AnnoDIFP Session Audio and Transcripts
Markdown Text - 3.1 KB
MD5: 891064c78a8e46a2f9922b793aafa160
Instructions on how to access LDC data via UBC's Teamshare service (Markdown / ASCII text)
Aug 14, 2025 - AnnoDIFP Session Audio and Transcripts
Adobe PDF - 31.2 KB
MD5: 2a043207829f9ab259df770590941165
Instructions on how to access LDC data via UBC's Teamshare service
Aug 14, 2025
Tracey, Jennifer; Graff, David; Chen, Song; Strassel, Stephanie, 2025, "BOLT CTS CALLFRIEND CALLHOME Mainland Mandarin Chinese Audio", https://hdl.handle.net/11272.1/AB2/1BGPSO, Abacus Data Network, V1
Abstract: BOLT CTS CALLFRIEND CALLHOME Mainland Mandarin Chinese Audio was developed by the Linguistic Data Consortium (LDC) and consists of approximately 93 hours of speech from 236 unscripted telephone conversations between native speakers of the Mandarin Chinese di...
Aug 14, 2025 - BOLT CTS CALLFRIEND CALLHOME Mainland Mandarin Chinese Audio
Plain Text - 9.6 KB
MD5: e0fa130b05b8ef250a2acd001a272d26
File manifest
Aug 14, 2025 - BOLT CTS CALLFRIEND CALLHOME Mainland Mandarin Chinese Audio
Optical Disc Image - 3.9 GB
MD5: 6504896b88df7c7b8d1eaa09c8761f24
ISO disc image containing all documentation and data
Aug 14, 2025 - BOLT CTS CALLFRIEND CALLHOME Mainland Mandarin Chinese Audio
Plain Text - 1.3 KB
MD5: 4d4231d07ac669e105f71e602457efea
Working with ISO disc images
Jul 23, 2025
Kroch, Anthony; Santorini, Beatrice; Taylor, Ann; Diertani, Ariel, 2025, "Penn Parsed Corpora of Historical English Second Release", https://hdl.handle.net/11272.1/AB2/E4NMWX, Abacus Data Network, V1
Abstract: Penn Parsed Corpora of Historical English Second Release was developed at the University of Pennsylvania and consists of running texts and text samples of British English prose from the earliest Middle English documents (1100 CE) up to the period of the Firs...
Jul 23, 2025 - Penn Parsed Corpora of Historical English Second Release
Plain Text - 124.3 KB
MD5: df9c3a39a9ea7706a70e8bdacc7874ea
File manifest
Jul 23, 2025 - Penn Parsed Corpora of Historical English Second Release
Optical Disc Image - 232.2 MB
MD5: 73478f7463591442b50fab50e4d79cc6
ISO disc image containing all documentation and data
Jul 23, 2025 - Penn Parsed Corpora of Historical English Second Release
Plain Text - 1.3 KB
MD5: 4d4231d07ac669e105f71e602457efea
Working with ISO disc images
Jun 9, 2025
Bekkozhanova, Gulnar; Bills, Aric; Chouder, Sarra; Jaralve, Vanessa; Corey, Cassian; Dubinski, Eyal; Ellis, Corinna; Gibby, Paul; Kazi, Michael; Lam, Julie; Le, Hanh; Malyska, Nicolas; Marcucci, Giorgia; Marvi, Sarah; McConnell, Sara; Melot, Jennifer; Mensch, Alyssa; Morrison, Michelle; Paget, Shelley; Ramizo, Katerina; Richardson, Frederick; Roberts, Annette; Rubino, Carl; Sarseke, Gulnar; Taubayev, Zharas, 2025, "MATERIAL Kazakh-English Language Pack", https://hdl.handle.net/11272.1/AB2/5G61UB, Abacus Data Network, V1
Abstract: MATERIAL Kazakh-English Language Pack was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) MATERIAL (Machine Translation for English Retrieval of Information in Any Language) program. It contains approximately 57 hours of K...
Jun 9, 2025 - MATERIAL Kazakh-English Language Pack
Plain Text - 1.3 KB
MD5: 4d4231d07ac669e105f71e602457efea
Working with ISO disc images
Jun 9, 2025 - MATERIAL Kazakh-English Language Pack
Optical Disc Image - 9.0 GB
MD5: 368db6e6280771ca15d57d25f32b7c35
ISO disc image containing all documentation and data
Jun 9, 2025 - MATERIAL Kazakh-English Language Pack
Plain Text - 225.1 KB
MD5: 3b3588fc37a241f870756de3dcc14bcc
File manifest
Apr 29, 2025
Greenberg, Craig; Sadjadi, Omid; Graff, David; Walker, Kevin; Jones, Karen; Caruso, Christopher; Strassel, Stephanie; Wright, Jonathan, 2025, "2015 NIST Language Recognition Evaluation Test Set", https://hdl.handle.net/11272.1/AB2/TPVLOA, Abacus Data Network, V1
Abstract: 2015 NIST Language Recognition Evaluation Test Set was developed by the Linguistic Data Consortium (LDC) and the National Institute of Standards and Technology (NIST). It contains the evaluation test set for the 2015 NIST Language Recognition Evaluation, app...
Apr 29, 2025 - 2015 NIST Language Recognition Evaluation Test Set
Markdown Text - 3.1 KB
MD5: 891064c78a8e46a2f9922b793aafa160
Instructions on how to access LDC data via UBC's Teamshare service (Markdown / ASCII text)
Apr 29, 2025 - 2015 NIST Language Recognition Evaluation Test Set
Adobe PDF - 31.2 KB
MD5: 2a043207829f9ab259df770590941165
Instructions on how to access LDC data via UBC's Teamshare service
Apr 29, 2025
Chen, Song; Mott, Justin; Strassel, Stephanie, 2025, "DEFT Spanish Light and Rich ERE Annotation", https://hdl.handle.net/11272.1/AB2/WMSO8E, Abacus Data Network, V1
Abstract: DEFT Spanish Light and Rich ERE Annotation was developed by the Linguistic Data Consortium (LDC) and consists of 158 Spanish discussion forum and newswire documents annotated for entities, relations and events (ERE). DARPA's Deep Exploration and Filtering of...
Apr 29, 2025 - DEFT Spanish Light and Rich ERE Annotation
Optical Disc Image - 23.2 MB
MD5: 06ea6b3331938ae5191eb765a0a133e1
ISO disc image containing all documentation and data
Apr 29, 2025 - DEFT Spanish Light and Rich ERE Annotation
Plain Text - 26.8 KB
MD5: da4eb003789c09c742dde08c99ac5c28
File manifest
Apr 29, 2025 - DEFT Spanish Light and Rich ERE Annotation
Plain Text - 1.3 KB
MD5: 4d4231d07ac669e105f71e602457efea
Working with ISO disc images
Apr 29, 2025
Zhang, Xiao; Zhang, Ling; Dang, Tian; Feng, Yuanzhao; Ji, Yujing; Jiang, Xiaohui; Kang, Zhewen; Lu, Yan; Nie, Wen; Ren, Hanyu; Wang, Canjun; Wang, Jiayi; Wang, Yu; Wu, Chen; Wu, Mei; Xu, Tingting; Yang, Ruhai; Zhao, Kai; Zhao, Ran; Zhou, Quanjie; Zhu, Lei, 2025, "The Xi’an Multi-Language Learner Corpus", https://hdl.handle.net/11272.1/AB2/KEPEYK, Abacus Data Network, V1
Abstract: The Xi’an Multi-Language Learner Corpus was developed by Xi'an International Studies University (XISU). It is comprised of 526 argumentative essays in 15 languages by Chinese L1 university students studying second languages, along with student metadata and w...
Apr 29, 2025 - The Xi’an Multi-Language Learner Corpus
Plain Text - 1.3 KB
MD5: 4d4231d07ac669e105f71e602457efea
Working with ISO disc images
Apr 29, 2025 - The Xi’an Multi-Language Learner Corpus
Optical Disc Image - 4.0 MB
MD5: 1a62577f66c1a9312e4d3f0bd98dc9e2
ISO disc image containing all documentation and data
Apr 29, 2025 - The Xi’an Multi-Language Learner Corpus
Plain Text - 26.1 KB
MD5: d11477f4a0506d9b0434d12ff92e1669
File manifest
Apr 3, 2025
Tracey, Jennifer; Strassel, Stephanie; Graff, David; Wright, Jonathan; Chen, Song; Ryant, Neville; Kulick, Seth; Griffitt, Kira; Delgado, Dana; Arrigo, Michael, 2025, "LORELEI Hungarian Representative Language Pack", https://hdl.handle.net/11272.1/AB2/6G8DZZ, Abacus Data Network, V1
Abstract: LORELEI Hungarian Representative Language Pack consists of Hungarian monolingual text, Hungarian-English parallel text, annotations, supplemental resources and related software tools developed by the Linguistic Data Consortium for the DARPA LORELEI program....
Apr 3, 2025 - LORELEI Hungarian Representative Language Pack
Optical Disc Image - 3.3 GB
MD5: 1430ccce8b8fe03ee716ff6dbd5d0d9a
ISO disc image containing all documentation and data: disc 1
Apr 3, 2025 - LORELEI Hungarian Representative Language Pack
Optical Disc Image - 3.9 GB
MD5: b1a1c1e87b1ad8e4cc7e57bc05ccd10b
ISO disc image containing all documentation and data: disc 3
Apr 3, 2025 - LORELEI Hungarian Representative Language Pack
Optical Disc Image - 3.3 GB
MD5: a53f5725ab5bf1f8e69cfd499e53ee9a
ISO disc image containing all documentation and data: disc 4
Apr 3, 2025 - LORELEI Hungarian Representative Language Pack
Optical Disc Image - 4.2 GB
MD5: 557498e2f274d6f53853f434eb9018fe
ISO disc image containing all documentation and data: disc 2
Apr 3, 2025 - LORELEI Hungarian Representative Language Pack
Plain Text - 1.3 KB
MD5: 4d4231d07ac669e105f71e602457efea
Working with ISO disc images
Apr 3, 2025 - LORELEI Hungarian Representative Language Pack
Plain Text - 9.1 KB
MD5: 5a70d0b26b2ee9a4bf5358a49aab9618
File manifest for disc 4
Apr 3, 2025 - LORELEI Hungarian Representative Language Pack
Plain Text - 9.1 KB
MD5: 5a70d0b26b2ee9a4bf5358a49aab9618
File manifest for disc 1
Apr 3, 2025 - LORELEI Hungarian Representative Language Pack
Plain Text - 2.7 KB
MD5: ca6e249ba8435f664c237c7de1202f95
File manifest for disc 3
Apr 3, 2025 - LORELEI Hungarian Representative Language Pack
Plain Text - 931 B
MD5: 69ed20efabf74ddaa3a60c67b23aa9c2
File manifest for disc 2
Apr 3, 2025
Vanroy, Bram, 2025, "Abstract Meaning Representation 3.0 - Machine Translations", https://hdl.handle.net/11272.1/AB2/TKRDFD, Abacus Data Network, V1
Abstract: Abstract Meaning Representation 3.0 - Machine Translations was developed by the Center for Computational Linguistics at KU Leuven in the HORIZON2020 project SignON. It is an automatic translation of a subset of sentences from Abstract Meaning Representation...
Plain Text - 1.3 KB
MD5: 4d4231d07ac669e105f71e602457efea
Working with ISO disc images
Optical Disc Image - 138.1 MB
MD5: 808c0ea3fd032deddc567b7b5db8ce48
ISO disc image containing all documentation and data
Plain Text - 6.1 KB
MD5: 60655c1a9b133e7fe1e1a6ec07dc3116
File manifest
Apr 3, 2025
Tracey, Jennifer; Strassel, Stephanie; Getman, Jeremy; Bies, Ann; Griffitt, Kira; Graff, David; Caruso, Christopher, 2025, "AIDA Scenario 3 Practice Topic Source Data and Annotation", https://hdl.handle.net/11272.1/AB2/KAFV5Q, Abacus Data Network, V1
Abstract: AIDA Scenario 3 Practice Topic Source Data and Annotation was developed by the Linguistic Data Consortium (LDC) and is comprised of English, Russian and Spanish web documents (text, video, image) and annotations. The DARPA AIDA (Active Interpretation of Disp...
Plain Text - 1.3 KB
MD5: 4d4231d07ac669e105f71e602457efea
Working with ISO disc images
Optical Disc Image - 1.4 GB
MD5: f03f3a4db3433e4b7021abe6121eeeee
ISO disc image containing all documentation and data