Linguistic Data Consortium


101 to 150 of 1,819 Results

Working_with_ISO_Images.txt Oct 25, 2024 - 2007 CoNLL Shared Task - Greek, Hungarian & Italian Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea Documentation Working with ISO disc images
Vehicle City Voices Corpus – Part I Oct 25, 2024 Britt, Erica, 2024, "Vehicle City Voices Corpus – Part I", https://hdl.handle.net/11272.1/AB2/8XVBZS, Abacus Data Network, V1 Abstract Introduction Vehicle City Voices Corpus – Part I was developed at the University of Michigan-Flint, and is an ongoing oral history project and survey of English language variation in Flint, Michigan. It contains approximately 16 hours of speech with corresponding transcr...
Working_with_ISO_Images.txt Oct 25, 2024 - Vehicle City Voices Corpus – Part I Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea Documentation Working with ISO disc images
LDC2017S17.iso Oct 25, 2024 - Vehicle City Voices Corpus – Part I Optical Disc Image - 1.9 GB - MD5: 73b3dd2382d696aa67f9e29ca218ade4 Data ISO disc image containing all documentation and data
LDC2017S17_File_Manifest.txt Oct 25, 2024 - Vehicle City Voices Corpus – Part I Plain Text - 1.9 KB - MD5: 6028d6e554408649ae08556fa513e097 Documentation File manifest
CHM150 Oct 25, 2024 Mena, Carlos Daniel Hernández; Herrera, Abel, 2024, "CHM150", https://hdl.handle.net/11272.1/AB2/UWURFR, Abacus Data Network, V1 Abstract Introduction CHM150 (Corpus Hecho en México 150) was developed by the Speech Processing Laboratory of the Faculty of Engineering at the National Autonomous University of Mexico (UNAM) and consists of approximately 1.63 hours of Mexican Spanish speech, associated transcri...
Working_with_ISO_Images.txt Oct 25, 2024 - CHM150 Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea Documentation Working with ISO disc images
LDC2016S04.iso Oct 25, 2024 - CHM150 Optical Disc Image - 140.4 MB - MD5: 108dccfcb86160817b075caa7cbd85d1 Data ISO disc image containing all documentation and data
LDC2016S04_File_Manifest.txt Oct 25, 2024 - CHM150 Plain Text - 179.9 KB - MD5: 6196bb77b89cedd6a3d8d288f7527639 Documentation File manifest
Arabic Learner Corpus Oct 25, 2024 Alfaifi, Abdullah; Atwell, Eric, 2024, "Arabic Learner Corpus", https://hdl.handle.net/11272.1/AB2/DPQWPU, Abacus Data Network, V1 Abstract Introduction Arabic Learner Corpus was developed at the University of Leeds and consists of written essays and spoken recordings by Arabic learners collected in Saudi Arabia in 2012 and 2013. The corpus includes 282,732 words in 1,585 materials, produced by 942 students...
LDC2015S10.iso Oct 25, 2024 - Arabic Learner Corpus Optical Disc Image - 882.8 MB - MD5: 69857ac831622a1ad69d4439764dd127 Data ISO disc image containing all documentation and data
LDC2015S10_File_Manifest.txt Oct 25, 2024 - Arabic Learner Corpus Plain Text - 504.8 KB - MD5: 6047cbf1e28371172e70de64eb69f582 Documentation File manifest
Working_with_ISO_Images.txt Oct 25, 2024 - Arabic Learner Corpus Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea Documentation Working with ISO disc images
BabyEars Affective Vocalizations Oct 25, 2024 Slaney, Malcolm; McRoberts, Gerald; Scheirer, Jocelyn, 2024, "BabyEars Affective Vocalizations", https://hdl.handle.net/11272.1/AB2/VK52W9, Abacus Data Network, V1 Abstract Introduction BabyEars Affective Vocalizations was developed by Malcolm Slaney, Gerald McRoberts, and Jocelyn Scheirer. It contains approximately 22 minutes of spontaneous English speech by 12 adults interacting with their infant children, for a total of 509 infant-direct...
Working_with_ISO_Images.txt Oct 25, 2024 - BabyEars Affective Vocalizations Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea Documentation Working with ISO disc images
LDC2024S04.iso Oct 25, 2024 - BabyEars Affective Vocalizations Optical Disc Image - 44.7 MB - MD5: 443f06f327c59c69ef48a614a25df8c1 Data ISO disc image containing all documentation and data
LDC2024S04_File_Manifest.txt Oct 25, 2024 - BabyEars Affective Vocalizations Plain Text - 32.2 KB - MD5: 37fd81a2ab192c3cfab5fdc7b3e65d08 Documentation File manifest
Second Language University Speech Intelligibility Corpus Oct 25, 2024 Kang, Okim; Hirschi, Kevin; Looney, Stephen D.; Hansen, John H. L., 2024, "Second Language University Speech Intelligibility Corpus", https://hdl.handle.net/11272.1/AB2/QHVV2O, Abacus Data Network, V1 Abstract Introduction Second Language University Speech Intelligibility Corpus was developed by Northern Arizona University, The Pennsylvania State University, and The University of Texas at Dallas. It contains 10.5 hours of English speech by 66 international faculty and universi...
Working_with_ISO_Images.txt Oct 25, 2024 - Second Language University Speech Intelligibility Corpus Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea Documentation Working with ISO disc images
LDC2024S02.iso Oct 25, 2024 - Second Language University Speech Intelligibility Corpus Optical Disc Image - 781.6 MB - MD5: 0a091f63d8b2c9d54d580b796f995571 Data ISO disc image containing all documentation and data
LDC2024S02_File_Manifest.txt Oct 25, 2024 - Second Language University Speech Intelligibility Corpus Plain Text - 17.7 KB - MD5: 111c4e9c65fa91efe5403558a88c192a Documentation File manifest
AIDA Scenario 2 Practice Topic Annotation Sep 17, 2024 Tracey, Jennifer; Strassel, Stephanie; Getman, Jeremy; Bies, Ann; Griffitt, Kira; Graff, David; Caruso, Christopher, 2024, "AIDA Scenario 2 Practice Topic Annotation", https://hdl.handle.net/11272.1/AB2/BFKQTZ, Abacus Data Network, V1 Abstract Introduction AIDA Scenario 2 Practice Topic Annotation was developed by the Linguistic Data Consortium (LDC) and is comprised of annotations for 29 English, Russian and Spanish web documents (text, image and video) from AIDA Scenario 2 Practice Topic Source Data (LDC2024...
Working_with_ISO_Images.txt Sep 17, 2024 - AIDA Scenario 2 Practice Topic Annotation Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea Documentation Working with ISO disc images
LDC2024T06.iso Sep 17, 2024 - AIDA Scenario 2 Practice Topic Annotation Optical Disc Image - 136.2 MB - MD5: d00448218ce7e823a49568f65e387272 Data ISO disc image containing all documentation and data
LDC2024T06_File_Manifest.txt Sep 17, 2024 - AIDA Scenario 2 Practice Topic Annotation Plain Text - 1.7 KB - MD5: c2498c5068b7fa8de6ceea9c15c85219 Documentation File manifest
Dialogs Re-Enacted Across Languages Sep 17, 2024 Ward, Nigel G.; Avila, Jonathan E.; Rivas, Emilia; Marco, Divette, 2024, "Dialogs Re-Enacted Across Languages", https://hdl.handle.net/11272.1/AB2/XRMWND, Abacus Data Network, V1 Abstract Introduction Dialogs Re-Enacted Across Languages was developed at the University of Texas at El Paso. It contains approximately 17 hours of conversational speech in English and Spanish by 129 unique bilingual speakers, specifically, short fragments extracted from spontan...
Working_with_ISO_Images.txt Sep 17, 2024 - Dialogs Re-Enacted Across Languages Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea Documentation Working with ISO disc images
LDC2024S08.iso Sep 17, 2024 - Dialogs Re-Enacted Across Languages Optical Disc Image - 895.1 MB - MD5: c32f94cada0b6a38a36ad32efecd91b3 Data ISO disc image containing all documentation and data
LDC2024S08_File_Manifest.txt Sep 17, 2024 - Dialogs Re-Enacted Across Languages Plain Text - 614.7 KB - MD5: ab54caf920f668e2399dd50de3bd8ff9 Documentation File manifest
Diaspora Tibetan Speech Sep 17, 2024 Geissler, Christopher; Babinski, Sarah; Shaw, Jason, 2024, "Diaspora Tibetan Speech", https://hdl.handle.net/11272.1/AB2/OPZ58Z, Abacus Data Network, V1 Abstract Introduction Diaspora Tibetan Speech was developed at Yale University. It contains approximately 28 hours of Tibetan elicited speech by 73 speakers from the diaspora Tibetan community in Kathmandu, Nepal, along with transcripts, elicitation materials and speaker demograp...
LDC2024S06.iso Sep 17, 2024 - Diaspora Tibetan Speech Optical Disc Image - 3.1 GB - MD5: 9263075d96b4f87ee88c8747e81149c2 Data ISO disc image containing all documentation and data
LDC2024S06_File_Manifest.txt Sep 17, 2024 - Diaspora Tibetan Speech Plain Text - 7.3 KB - MD5: 6143d630ae48377d98565c89a4112210 Documentation File manifest
Working_with_ISO_Images.txt Sep 17, 2024 - Diaspora Tibetan Speech Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea Documentation Working with ISO disc images
LORELEI Uyghur Incident Language Pack Sep 17, 2024 Tracey, Jennifer; Strassel, Stephanie; Arrigo, Michael; Wright, Jonathan; Graff, David; Bies, Ann, 2024, "LORELEI Uyghur Incident Language Pack", https://hdl.handle.net/11272.1/AB2/VRJN4A, Abacus Data Network, V1 Abstract Introduction LORELEI Uyghur Incident Language Pack (LDC2024T07) was developed by the Linguistic Data Consortium and consists of approximately 28 million words of Uyghur monolingual text, 500,000 words of English monolingual text, 3.3 million words of parallel and compara...
Working_with_ISO_Images.txt Sep 17, 2024 - LORELEI Uyghur Incident Language Pack Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea Documentation Working with ISO disc images
LDC2024T07.iso Sep 17, 2024 - LORELEI Uyghur Incident Language Pack Optical Disc Image - 2.7 GB - MD5: ce96de80394708a60a697aca340e32cf Data ISO disc image containing all documentation and data
LDC2024T07_File_Manifest.txt Sep 17, 2024 - LORELEI Uyghur Incident Language Pack Plain Text - 2.2 MB - MD5: acc026c3609ae2befb601b5ec3c73ba8 Documentation File manifest
Call My Net 1 Jul 30, 2024 Jones, Karen; Walker, Kevin; Graff, David; Wright, Jonathan; Strassel, Stephanie, 2024, "Call My Net 1", https://hdl.handle.net/11272.1/AB2/RJMIEI, Abacus Data Network, V1 Abstract Introduction Call My Net 1 was developed by the Linguistic Data Consortium and contains 364 hours of conversational telephone speech in four languages (Tagalog, Cebuano, Cantonese and Mandarin) collected in 2015 from 221 native speakers located in the Philippines and Chi...
LDC2024S05_File_Manifest_d1.txt Jul 30, 2024 - Call My Net 1 Plain Text - 36.2 KB - MD5: 16f8d17eaac90ac0436fc1deb01a2d28 Documentation File manifest for disc 1
LDC2024S05_d1.iso Jul 30, 2024 - Call My Net 1 Optical Disc Image - 2.8 GB - MD5: 84fa3e2bdbb6f7b029acbbab8fc3e502 Data ISO disc image containing all documentation and data: disc 1
LDC2024S05_d2.iso Jul 30, 2024 - Call My Net 1 Optical Disc Image - 3.9 GB - MD5: 6a535b1bd2bad4455e6c1baeb5d98787 Data ISO disc image containing all documentation and data: disc 2
LDC2024S05_d3.iso Jul 30, 2024 - Call My Net 1 Optical Disc Image - 3.8 GB - MD5: de066939de6880203ebb883fcb6517a3 Data ISO disc image containing all documentation and data: disc 3
Working_with_ISO_Images.txt Jul 30, 2024 - Call My Net 1 Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea Documentation Working with ISO disc images
LDC2024S05_File_Manifest_d2.txt Jul 30, 2024 - Call My Net 1 Plain Text - 47.4 KB - MD5: 604118281ba6d13d52f17f555d24343d Documentation File manifest for disc 2
LDC2024S05_File_Manifest_d3.txt Jul 30, 2024 - Call My Net 1 Plain Text - 39.8 KB - MD5: 0b6d13d3a54a9af71e97ae41ff79f8c9 Documentation File manifest for disc 3
Automatic Content Extraction for Portuguese Jul 30, 2024 Cunha, Luís Filipe; Silvano, Purificação; Campos, Ricardo; Jorge, Alípio, 2024, "Automatic Content Extraction for Portuguese", https://hdl.handle.net/11272.1/AB2/5VRIQB, Abacus Data Network, V1 Abstract Introduction Automatic Content Extraction for Portuguese (LDC2024T05) was developed at INESC TEC - Instituto de Engenharia de Sistemas e Computadores, Tecnologia e Ciência and consists of automatic Brazilian Portuguese and European Portuguese translations of the English...
LDC2034T05.iso Jul 30, 2024 - Automatic Content Extraction for Portuguese Optical Disc Image - 20.1 MB - MD5: b9e31b6c0f6c1cb8ac988de41f4bbde3 Data ISO disc image containing all documentation and data
LDC2024T05_File_Manifest.txt Jul 30, 2024 - Automatic Content Extraction for Portuguese Plain Text - 544 B - MD5: 0f4ab7819e10936ec715a550a0d96880 Documentation File manifest
Working_with_ISO_Images.txt Jul 30, 2024 - Automatic Content Extraction for Portuguese Plain Text - 1.3 KB - MD5: 4d4231d07ac669e105f71e602457efea Documentation Working with ISO disc images
LoReHLT Hausa Representative Language Pack May 13, 2024 Tracey, Jennifer; Strassel, Stephanie; Graff, David; Wright, Jonathan; Chen, Song; Griffitt, Kira; Ryant, Neville; Kulick, Seth; Delgado, Dana; Arrigo, Michael, 2024, "LoReHLT Hausa Representative Language Pack", https://hdl.handle.net/11272.1/AB2/7MWKZC, Abacus Data Network, V1 Abstract Introduction LoReHLT Hausa Representative Language Pack consists of Hausa monolingual text, Hausa-English parallel text, annotations, amateur web audio recordings, supplemental resources and related software tools developed by the Linguistic Data Consortium for LoReHLT,...
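Each file entry above lists an MD5 checksum, which can be used to confirm that a downloaded file arrived intact. The following is a minimal Python sketch of that check, not part of the repository itself; the function names `md5_of_file` and `matches_listing` and the local file path are hypothetical, and the file is assumed to have already been downloaded to disk.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so that multi-gigabyte ISO images
    do not need to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, expected_md5):
    """Compare a local file's MD5 with the value shown in the listing.

    `expected_md5` is the hex string copied from the entry's "MD5:" field;
    surrounding whitespace and letter case are ignored.
    """
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, assuming `LDC2024S04.iso` has been saved in the current directory, `matches_listing("LDC2024S04.iso", "443f06f327c59c69ef48a614a25df8c1")` should return `True` if the download is complete and uncorrupted.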
