201 to 250 of 1,819 Results
Oct 17, 2023 - CALLFRIEND Russian Text
Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea
Documentation
Working with ISO disc images
Oct 17, 2023 - CALLFRIEND Russian Text
Optical Disc Image - 5.0 MB - MD5: b21d44ff5c5033d97ef6bea10bde1fee
Data
ISO disc image containing all documentation and data
Oct 17, 2023 - CALLFRIEND Russian Text
Plain Text - 4.2 KB - MD5: 49b4f9795a65804ce4359346a341f00a
Documentation
File manifest
Oct 17, 2023
Delgado, Dana; Jones, Karen; Walker, Kevin; Strassel, Stephanie; Caruso, Christopher; Graff, David, 2023, "2019 OpenSAT Public Safety Communications Simulation", https://hdl.handle.net/11272.1/AB2/BOXO5O, Abacus Data Network, V1
Abstract Introduction 2019 OpenSAT Public Safety Communications Simulation was developed by the Linguistic Data Consortium (LDC) and contains approximately 141 hours of speech recordings and transcripts used in the National Institute of Standards and Technology (NIST)...
Plain Text - 3.1 KB - MD5: 1b8a8741370964dcfff1eeec66e4b151
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (Markdown/ASCII text)
Adobe PDF - 31.2 KB - MD5: 100c549ff1bb48ed76f05d01f6342eb3
Documentation
Instructions on how to access LDC data via UBC's Teamshare service
Plain Text - 66.2 KB - MD5: d873c909d2fc85b317f0628e073a5d36
Documentation
File manifest
Oct 16, 2023
Miller, David; Walker, Kevin; Graff, David; Canavan, Alexandra, 2023, "CALLFRIEND Russian Speech", https://hdl.handle.net/11272.1/AB2/NGRVVO, Abacus Data Network, V1
Abstract Introduction CALLFRIEND Russian Speech (LDC2023S08) was developed by the Linguistic Data Consortium (LDC) and consists of approximately 48 hours of telephone conversations (100 recordings) between native speakers of Russian. The calls were recorded in 1999 as part of the...
Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea
Documentation
Working with ISO disc images
Optical Disc Image - 2.2 GB - MD5: f2f1d3efb4da5b636930ab6c60bb9644
Data
ISO disc image containing all documentation and data
Plain Text - 4.1 KB - MD5: a1275e46f9f823c2ba27996ee26cc83f
Documentation
File manifest
Aug 29, 2023
Luqman, Hamzah; Mahmoud, Sabri; Awaida, Sameh, 2016, "KAFD: Arabic Font Database", https://hdl.handle.net/11272.1/AB2/A0JPYM, Abacus Data Network, V2
Introduction KAFD: Arabic Font Database was developed by King Fahd University of Petroleum & Minerals and Qassim University. It is comprised of approximately 2.5 million scanned Arabic printed pages in a variety of fonts, sizes and resolutions along with corresponding transcripts...
Plain Text - 3.1 KB - MD5: 1b8a8741370964dcfff1eeec66e4b151
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (Markdown/ASCII text)
Aug 29, 2023
Abdulaziz, Azhar; Kepuska, Veton, 2017, "Noisy TIMIT Speech", https://hdl.handle.net/11272.1/AB2/FFFXT2, Abacus Data Network, V2
Introduction Noisy TIMIT Speech was developed by the Florida Institute of Technology and contains approximately 322 hours of speech from the TIMIT Acoustic-Phonetic Continuous Speech Corpus (LDC93S1) modified with different additive noise levels. Only the audio has been modified;...
Aug 29, 2023 - Noisy TIMIT Speech
Plain Text - 3.1 KB - MD5: 1b8a8741370964dcfff1eeec66e4b151
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (Markdown/ASCII text)
Aug 29, 2023
Chen, Gang; Neubauer, Juergen; Garellek, Marc; Samlan, Robin; Gerratt, Bruce R.; Kreiman, Jody; Alwan, Abeer, 2017, "UCLA High-Speed Laryngeal Video and Audio", https://hdl.handle.net/11272.1/AB2/OWLHMG, Abacus Data Network, V2
UCLA High-Speed Laryngeal Video and Audio was developed by UCLA Speech Processing and Auditory Perception Laboratory and is comprised of high-speed laryngeal video recordings of the vocal folds and synchronized audio recordings from nine subjects collected between April 2012 and...
Plain Text - 3.1 KB - MD5: 1b8a8741370964dcfff1eeec66e4b151
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (Markdown/ASCII text)
Adobe PDF - 31.2 KB - MD5: 100c549ff1bb48ed76f05d01f6342eb3
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (PDF)
Aug 29, 2023
Vincent, Emmanuel; Barker, Jon; Watanabe, Shinji; Le Roux, Jonathan; Nesta, Francesco; Matassoni, Marco, 2017, "CHiME2 WSJ0", https://hdl.handle.net/11272.1/AB2/IUB8PD, Abacus Data Network, V2
CHiME2 WSJ0 was developed as part of The 2nd CHiME Speech Separation and Recognition Challenge and contains approximately 166 hours of English speech from a noisy living room environment. The CHiME Challenges focus on distant-microphone automatic speech recognition (ASR) in real-...
Aug 29, 2023 - CHiME2 WSJ0
Plain Text - 3.1 KB - MD5: 1b8a8741370964dcfff1eeec66e4b151
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (Markdown/ASCII text)
Aug 29, 2023 - CHiME2 WSJ0
Adobe PDF - 31.2 KB - MD5: 100c549ff1bb48ed76f05d01f6342eb3
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (PDF)
Aug 29, 2023
Tracey, Jennifer; Lee, Haejoong; Strassel, Stephanie, 2017, "BOLT English Discussion Forums", https://hdl.handle.net/11272.1/AB2/VDFID2, Abacus Data Network, V2
BOLT English Discussion Forums was developed by the Linguistic Data Consortium (LDC) and consists of 830,440 discussion forum threads in English harvested from the Internet using a combination of manual and automatic processes. The DARPA BOLT (Broad Operational Language Translati...
Plain Text - 3.1 KB - MD5: 1b8a8741370964dcfff1eeec66e4b151
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (Markdown/ASCII text)
Adobe PDF - 31.2 KB - MD5: 100c549ff1bb48ed76f05d01f6342eb3
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (PDF)
Aug 29, 2023
Tracey, Jennifer; Lee, Haejoong; Strassel, Stephanie; Ismael, Safa, 2018, "BOLT Arabic Discussion Forums", https://hdl.handle.net/11272.1/AB2/DP4INP, Abacus Data Network, V2
BOLT Arabic Discussion Forums was developed by the Linguistic Data Consortium (LDC) and consists of 813,080 discussion forum threads in Egyptian Arabic harvested from the Internet using a combination of manual and automatic processes. The DARPA BOLT (Broad Operational Language Tr...
Plain Text - 3.1 KB - MD5: 1b8a8741370964dcfff1eeec66e4b151
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (Markdown/ASCII text)
Adobe PDF - 31.2 KB - MD5: 100c549ff1bb48ed76f05d01f6342eb3
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (PDF)
Aug 29, 2023
Ferraro, Francis; Thomas, Max; Wolfe, Travis; Gormley, Matthew R.; Harman, Craig; Van Durme, Benjamin, 2018, "Concretely Annotated New York Times", https://hdl.handle.net/11272.1/AB2/VA98GM, Abacus Data Network, V2
Introduction Concretely Annotated New York Times was developed by Johns Hopkins University’s Human Language Technology Center of Excellence. It adds multiple kinds and instances of automatically-generated syntactic, semantic and coreference annotations to The New York Times Annot...
Plain Text - 3.1 KB - MD5: 1b8a8741370964dcfff1eeec66e4b151
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (Markdown/ASCII text)
Aug 29, 2023
Ferraro, Francis; Thomas, Max; Gormley, Matthew R.; Wolfe, Travis; Harman, Craig; Van Durme, Benjamin, 2018, "Concretely Annotated English Gigaword", https://hdl.handle.net/11272.1/AB2/NQCDFR, Abacus Data Network, V2
Concretely Annotated English Gigaword was developed by Johns Hopkins University’s Human Language Technology Center of Excellence (JHU). It adds multiple kinds and instances of automatically-generated syntactic, semantic and coreference annotations to English Gigaword Fifth Editio...
Plain Text - 3.1 KB - MD5: 1b8a8741370964dcfff1eeec66e4b151
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (Markdown/ASCII text)
Adobe PDF - 31.2 KB - MD5: 100c549ff1bb48ed76f05d01f6342eb3
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (PDF)
Aug 29, 2023
Morris, Amanda; Strassel, Stephanie; Li, Xuansong; Antonishek, Brian; Fiscus, Jonathan G., 2019, "HAVIC MED Progress Test -- Videos, Metadata and Annotation", https://hdl.handle.net/11272.1/AB2/QYTBMD, Abacus Data Network, V2
HAVIC MED Progress Test – Videos, Metadata and Annotation was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 3,650 hours of user-generated videos with annotation and metadata. To advance multimodal event detection and related technologies, LDC...
Plain Text - 3.1 KB - MD5: 1b8a8741370964dcfff1eeec66e4b151
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (Markdown/ASCII text)
Adobe PDF - 31.2 KB - MD5: 100c549ff1bb48ed76f05d01f6342eb3
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (PDF)
Aug 29, 2023
Greenberg, Craig; Martin, Alvin; Graff, David; Brandschain, Linda; Walker, Kevin, 2017, "2010 NIST Speaker Recognition Evaluation Test Set", https://hdl.handle.net/11272.1/AB2/2CPM3O, Abacus Data Network, V2
Introduction 2010 NIST Speaker Recognition Evaluation Test Set was developed by the Linguistic Data Consortium (LDC) and NIST (National Institute of Standards and Technology). It contains 2,255 hours of American English telephone speech and speech recorded over a microphone chann...
Plain Text - 3.1 KB - MD5: 1b8a8741370964dcfff1eeec66e4b151
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (Markdown/ASCII text)
Adobe PDF - 31.2 KB - MD5: 100c549ff1bb48ed76f05d01f6342eb3
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (PDF)
Aug 29, 2023
Barker, Jon; Marxer, Ricard; Vincent, Emmanuel; Watanabe, Shinji, 2017, "CHiME3", https://hdl.handle.net/11272.1/AB2/HGHM4U, Abacus Data Network, V2
Introduction CHiME3 was developed as part of The 3rd CHiME Speech Separation and Recognition Challenge and contains approximately 342 hours of English speech and transcripts from noisy environments and 50 hours of noisy environment audio. The CHiME Challenges focus on distant-mic...
Aug 29, 2023 - CHiME3
Plain Text - 3.1 KB - MD5: 1b8a8741370964dcfff1eeec66e4b151
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (Markdown/ASCII text)
Aug 29, 2023 - CHiME3
Adobe PDF - 31.2 KB - MD5: 100c549ff1bb48ed76f05d01f6342eb3
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (PDF)
Aug 29, 2023
Bu, Hui, 2018, "AISHELL-1", https://hdl.handle.net/11272.1/AB2/2WMDTT, Abacus Data Network, V2
AISHELL-1 was developed by Beijing Shell Shell Technology Co., Ltd. It contains approximately 520 hours of Chinese Mandarin speech from 400 speakers recorded simultaneously on three different devices with associated transcripts. The goal of the collection was to support speech re...
Aug 29, 2023 - AISHELL-1
Plain Text - 3.1 KB - MD5: 1b8a8741370964dcfff1eeec66e4b151
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (Markdown/ASCII text)
Aug 29, 2023 - AISHELL-1
Adobe PDF - 31.2 KB - MD5: 100c549ff1bb48ed76f05d01f6342eb3
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (PDF)
Aug 29, 2023
Brandschain, Linda; Walker, Kevin; Graff, David; Cieri, Christopher; Neely, Abby; Mirghafori, Nikki; Peskin, Barbara; Godfrey, Jack; Strassel, Stephanie; Goodman, Fred; Doddington, George R.; King, Mike, 2021, "Mixer 4 and 5 Speech", https://hdl.handle.net/11272.1/AB2/LU0TQ8, Abacus Data Network, V2
Abstract Introduction Mixer 4 and 5 Speech was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 14,185 hours of audio recordings of conversational telephone speech, interviews, elicitation exercises and transcript readings involving 616 distinct...
Aug 29, 2023 - Mixer 4 and 5 Speech
Adobe PDF - 31.2 KB - MD5: 100c549ff1bb48ed76f05d01f6342eb3
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (PDF)
Aug 29, 2023
Graff, David; Ma, Xiaoyi; Strassel, Stephanie; Walker, Kevin; Jones, Karen, 2021, "RATS Speaker Identification", https://hdl.handle.net/11272.1/AB2/BZYHPS, Abacus Data Network, V2
Abstract Introduction RATS Speaker Identification was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 1,900 hours of Levantine Arabic, Farsi, Dari, Pashto and Urdu conversational telephone speech with annotations of speech segments. The audio w...
Plain Text - 3.1 KB - MD5: 1b8a8741370964dcfff1eeec66e4b151
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (Markdown/ASCII text)
Adobe PDF - 31.2 KB - MD5: 100c549ff1bb48ed76f05d01f6342eb3
Documentation
Instructions on how to access LDC data via UBC's Teamshare service (PDF)
Aug 29, 2023
Morris, Amanda; Strassel, Stephanie; Li, Xuansong; Antonishek, Brian; Fiscus, Jonathan G., 2022, "HAVIC MED Training Data -- Videos, Metadata and Annotation", https://hdl.handle.net/11272.1/AB2/TQLGAR, Abacus Data Network, V2
Abstract Introduction HAVIC MED Training Data -- Videos, Metadata and Annotation was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 2,100 hours of user-generated videos with annotation and metadata. To advance multimodal event detection and re...