The Abacus Data Network is a data repository collaboration involving Libraries at Simon Fraser University (SFU), the University of British Columbia (UBC), the University of Northern British Columbia (UNBC) and the University of Victoria (UVic).
401 to 450 of 2,582 Results
Mar 18, 2022 - Linguistic Data Consortium
Fung, Pascale; Huang, Shudong; Graff, David, 2022, "HKUST Mandarin Telephone Transcript Data, Part 1", https://hdl.handle.net/11272.1/AB2/UOHG3I, Abacus Data Network, V1
Abstract Introduction HKUST Mandarin Telephone Transcript Data Part 1 was developed by Hong Kong University of Science and Technology (HKUST) and contains transcripts for 897 telephone conversations in Mandarin Chinese. In 2004 HKUST was contracted to collect and transcribe 200 h...
Mar 18, 2022 - Linguistic Data Consortium
Fung, Pascale; Huang, Shudong; Graff, David, 2022, "HKUST Mandarin Telephone Speech, Part 1", https://hdl.handle.net/11272.1/AB2/TKM8OR, Abacus Data Network, V1
Abstract Introduction HKUST Mandarin Telephone Speech, Part 1 was developed by Hong Kong University of Science and Technology (HKUST) and contains approximately 149 hours of conversational telephone speech (CTS) in Mandarin. Given that Standard Mandarin is not the native dialect...
Mar 16, 2022 - Statistics Canada - DLI
Statistics Canada, 2022, "Postal Code Conversion File, March 2022 Postal Codes, 2022", https://hdl.handle.net/11272.1/AB2/UAGJKN, Abacus Data Network, V1
The Postal Code Project is responsible for linking the approximately 900,000 single postal codes in Canada to Statistics Canada’s Census dissemination geography (presently 2021 Census geography). This process is performed by using data provided by Canada Post Corporation and lin...
Mar 11, 2022 - Statistics Canada Open License
Statistics Canada, 2022, "Preliminary dataset on confirmed cases of COVID-19, Public Health Agency of Canada [custom extraction]", https://hdl.handle.net/11272.1/AB2/ME03PG, Abacus Data Network, V1, UNF:6:3Eliht9aisreZk+zSfJPtw== [fileUNF]
This dataset provides Canadians and researchers with preliminary data on the confirmed cases of coronavirus (COVID-19) in Canada. Given the rapidly-evolving nature of this situation, these data are considered preliminary. The dataset was downloaded from Statistics Canada as a CSV...
Mar 2, 2022 - Statistics Canada - DLI
Statistics Canada, 2021, "Postal Code Conversion File, August 2021 Postal Codes, 2021", https://hdl.handle.net/11272.1/AB2/HJPB6W, Abacus Data Network, V2
The Postal Code Project is responsible for linking the approximately 900,000 single postal codes in Canada to Statistics Canada’s Census dissemination geography (presently 2016 Census geography). This process is performed by using data provided by Canada Post Corporation and lin...
Feb 9, 2022 - Statistics Canada Open License
Statistics Canada, 2022, "National Travel Survey, 2020", https://hdl.handle.net/11272.1/AB2/VM9BXS, Abacus Data Network, V1, UNF:6:4stdrvj4lIxqeysb1XWD7A== [fileUNF]
The National Travel Survey (NTS) was developed to fully replace the Travel Survey of Residents of Canada (TSRC record number 3810) and replace the Canadian resident component of the International Travel Survey (ITS record number 3152). The National Travel Survey collects informat...
Feb 7, 2022 - Linguistic Data Consortium
Tracey, Jennifer; Graff, David; Strassel, Stephanie; Arrigo, Michael; Wright, Jonathan; Bies, Ann, 2022, "LORELEI Kinyarwanda Incident Language Pack", https://hdl.handle.net/11272.1/AB2/P1OIX0, Abacus Data Network, V1
Abstract Introduction LORELEI Kinyarwanda Incident Language Pack was developed by the Linguistic Data Consortium and is comprised of approximately 11.9 million words of Kinyarwanda monolingual text, 35,000 words of English monolingual text, 3.4 million words of parallel and compa...
Feb 7, 2022 - Linguistic Data Consortium
Byers, Frederick, 2022, "2017 NIST OpenSAT Pilot - SSSF", https://hdl.handle.net/11272.1/AB2/PTU0AQ, Abacus Data Network, V1
Abstract Introduction 2017 NIST OpenSAT Pilot - SSSF was developed by NIST (National Institute of Standards and Technology) and contains approximately one hour of operational speech data, transcripts and annotation files used in the speech activity detection, automatic speech rec...
Feb 7, 2022 - Linguistic Data Consortium
Bies, Ann; Mott, Justin; Warner, Colin; Kulick, Seth, 2022, "BOLT English Translation Treebank - Chinese SMS/Chat", https://hdl.handle.net/11272.1/AB2/JBOOKU, Abacus Data Network, V1
Abstract Introduction BOLT English Translation Treebank - Chinese SMS/Chat was developed by the Linguistic Data Consortium (LDC) and consists of SMS and chat text data translated from Chinese to English and annotated for part-of-speech and syntactic structure. The DARPA BOLT (Bro...
Feb 4, 2022 - Statistics Canada Open License
Statistics Canada, 2022, "Canadian Income Survey, 2018", https://hdl.handle.net/11272.1/AB2/G6T0LC, Abacus Data Network, V1, UNF:6:RlzI4LxHQ+ZRmY8Hn85cuw== [fileUNF]
The primary objective of the Canadian Income Survey (CIS) is to provide information on the income and income sources of Canadians, along with their individual and household characteristics. The data collected in the CIS is combined with Labour Force Survey (LFS, record number 370...
Jan 27, 2022 - Statistics Canada - DLI
Statistics Canada, 2022, "Social Policy Simulation Database and Model (SPSD/M), Version 29.0, database year 2017", https://hdl.handle.net/11272.1/AB2/1QO7LM, Abacus Data Network, V1
The SPSD/M is a static microsimulation model designed to analyse financial interactions between governments and individuals in Canada. It can compute taxes paid to and cash transfers received from government. It is comprised of a database, a series of tax/transfer algorithms and...
Jan 24, 2022 - Linguistic Data Consortium
Glenn, Meghan; Lee, Haejoong; Strassel, Stephanie; Maeda, Kazuaki, 2017, "GALE Phase 3 Arabic Broadcast News Transcripts Part 2", https://hdl.handle.net/11272.1/AB2/VM5MOD, Abacus Data Network, V2
Introduction GALE Phase 3 Arabic Broadcast News Transcripts Part 2 was developed by the Linguistic Data Consortium (LDC) and contains transcriptions of approximately 128 hours of Arabic broadcast news speech collected in 2007 by the Linguistic Data Consortium (LDC), MediaNet, Tun...
Jan 19, 2022 - Statistics Canada Open License
Statistics Canada, 2022, "Provincial Symmetric Input-Output Tables, 2018", https://hdl.handle.net/11272.1/AB2/WWLLUT, Abacus Data Network, V1
The Industry Accounts Division of Statistics Canada publishes annual provincial supply and use tables. While these industry by product tables closely reflect actual economic transactions, certain analytical and modeling purposes, however, require symmetric industry-by-industry in...
Jan 14, 2022 - Statistics Canada Open License
Statistics Canada, 2022, "Canadian Business Counts, 2021", https://hdl.handle.net/11272.1/AB2/6YYPZD, Abacus Data Network, V1
Canadian business counts—previously called Canadian business patterns—provide counts of active businesses by industry classification and employment-size categories for Canada and the provinces and territories. Canadian business counts are based on the same criteria that were used...
Dec 2, 2021 - Linguistic Data Consortium
Palmer, Martha; Hwang, Jena D.; Mansouri, Aous; Bonial, Claire; O'Gorman, Tim; Gung, James, 2021, "BOLT Egyptian Arabic PropBank and Sense -- Discussion Forum, SMS/Chat, and Conversational Telephone Speech", https://hdl.handle.net/11272.1/AB2/YS81IR, Abacus Data Network, V1
Abstract Introduction BOLT Egyptian Arabic PropBank and Sense -- Discussion Forum, SMS/Chat, and Conversational Telephone Speech was developed by the University of Colorado Boulder - CLEAR (Computational Language and Education Research) and consists of propbank annotation on Egyp...
Dec 2, 2021 - Linguistic Data Consortium
Ryant, Neville; Liberman, Mark; Fiumara, James; Cieri, Christopher, 2021, "Second DIHARD Challenge Development - Eleven Sources", https://hdl.handle.net/11272.1/AB2/CBFPZO, Abacus Data Network, V1
Abstract Introduction Second DIHARD Challenge Development - Eleven Sources was developed by LDC and contains approximately 22 hours of English and Chinese speech data along with corresponding annotations used in support of the Second DIHARD Challenge. The DIHARD Challenges are a...
Dec 1, 2021 - University of British Columbia Open Data
Campus and Community Planning
Open data produced and maintained by UBC Campus and Community Planning.
Dec 1, 2021
University of British Columbia Open Data (University of British Columbia)
Open data produced by the University of British Columbia (UBC) in the course of its operations and administration. For open data created as the output of UBC research, see the UBC collection in Scholars Portal Dataverse.
Nov 19, 2021 - Statistics Canada - DLI
Canadian Institute for Health Information, 2021, "Discharge Abstract Database, 2019-2020 and 2020-2021", https://hdl.handle.net/11272.1/AB2/RQKUYZ, Abacus Data Network, V1, UNF:6:g/k+/5S9AnzFOhXd58GNig== [fileUNF]
Originally developed in 1963, the Discharge Abstract Database (DAD) captures administrative, clinical and demographic information on hospital discharges (including deaths, sign-outs and transfers). Some provinces and territories also use the DAD to capture day surgery. Data extra...
Nov 18, 2021 - Linguistic Data Consortium
Maamouri, Mohamed; Bies, Ann; Kulick, Seth; Krouna, Sondos; Tabassi, Dalila; Ciul, Michael, 2021, "BOLT Egyptian Arabic Treebank - SMS/Chat", https://hdl.handle.net/11272.1/AB2/1DSLOX, Abacus Data Network, V1
Abstract Introduction BOLT Egyptian Arabic Treebank - SMS/Chat was developed by the Linguistic Data Consortium (LDC) and consists of Egyptian Arabic SMS/Chat data with part-of-speech annotation, morphology, and syntactic tree annotation. The DARPA BOLT (Broad Operational Language...
Nov 18, 2021 - Linguistic Data Consortium
Keating, Patricia; Kreiman, Jody; Alwan, Abeer; Chong, Adam; Lee, Yoonjeong, 2021, "UCLA Speaker Variability Database", https://hdl.handle.net/11272.1/AB2/CIIVXT, Abacus Data Network, V1
Abstract Introduction UCLA Speaker Variability Database was developed by UCLA Speech Processing and Auditory Perception Laboratory and is comprised of approximately 34 hours of English speech and orthographic transcripts. This corpus was designed to sample variability in speaking...
Nov 2, 2021 - DMTI Spatial
DMTI Spatial Inc., 2021, "CanMap Content Suite, v2021.3", https://hdl.handle.net/11272.1/AB2/ZGQDCJ, Abacus Data Network, V1
CanMap Content Suite contains over 100 unique and rich content layers. Each layer has a unique file and layer name with associated definitions, descriptions, attribution and metadata. All layers, with a few exceptions, are vector data consisting of polygon, polyline, or point geo...
Oct 28, 2021 - DMTI Spatial
DMTI Spatial Inc., 2021, "CanMap Postal Code Suite, v2021.3", https://hdl.handle.net/11272.1/AB2/LIKJJX, Abacus Data Network, V1
The CanMap Postal Code Suite is comprised of the following postal products: The CanMap Postal Code File - Multiple Enhanced Postal Code (MEP) product is a precision-based point file representing over 1 million postal codes across Canada. The Multiple Enhanced Postal Code product...
Oct 26, 2021 - Linguistic Data Consortium
Godfrey, John J.; Holliman, Edward, 2021, "Switchboard-1 Release 2", https://hdl.handle.net/11272.1/AB2/VTPSCK, Abacus Data Network, V1
Abstract Introduction The Switchboard-1 Telephone Speech Corpus (LDC97S62) consists of approximately 260 hours of speech and was originally collected by Texas Instruments in 1990-1, under DARPA sponsorship. The first release of the corpus was published by NIST and distributed by...
Oct 20, 2021 - DMTI Spatial
DMTI Spatial Inc., 2021, "CanMap Address Points, v2021.3", https://hdl.handle.net/11272.1/AB2/HOWGP8, Abacus Data Network, V1
CanMap Address Points are unique and discrete representations of civic address assignments across Canada. It is the ultimate in answering the question of “where” and an anchor for a single source of accuracy in your mission-critical data. When building your location intelligence...
Oct 20, 2021 - DMTI Spatial
DMTI Spatial Inc., 2019, "CanMap Address Points, v2019.2", https://hdl.handle.net/11272.1/AB2/RZ99DD, Abacus Data Network, V1
CanMap Address Points are unique and discrete representations of civic address assignments across Canada. It is the ultimate in answering the question of “where” and an anchor for a single source of accuracy in your mission-critical data. When building your location intelligence...
Oct 20, 2021 - DMTI Spatial
DMTI Spatial Inc., 2018, "CanMap Address Points, v2018.3", https://hdl.handle.net/11272.1/AB2/MDCZXG, Abacus Data Network, V1
CanMap Address Points are unique and discrete representations of civic address assignments across Canada. It is the ultimate in answering the question of “where” and an anchor for a single source of accuracy in your mission-critical data. When building your location intelligence...
Oct 20, 2021 - DMTI Spatial
DMTI Spatial Inc., 2017, "CanMap Address Points, v2017.4", https://hdl.handle.net/11272.1/AB2/GBIUJJ, Abacus Data Network, V1
CanMap Address Points are unique and discrete representations of civic address assignments across Canada. It is the ultimate in answering the question of “where” and an anchor for a single source of accuracy in your mission-critical data. When building your location intelligence...
Oct 19, 2021 - Campus and Community Planning
University of British Columbia. Campus and Community Planning, 2021, "[University of British Columbia Vancouver Campus Lidar], 2005", https://hdl.handle.net/11272.1/AB2/GTPDZF, Abacus Data Network, V1
University of British Columbia Vancouver (formerly called Point Grey) campus lidar survey. This survey does not cover the entire campus; it consists mostly of shoreline areas.
Oct 14, 2021 - Linguistic Data Consortium
Mena, Carlos Daniel Hernández; Ruiz, Iván Vladimir Meza, 2021, "Wikipedia Spanish Speech and Transcripts", https://hdl.handle.net/11272.1/AB2/L05NFF, Abacus Data Network, V1
Abstract Introduction Wikipedia Spanish Speech and Transcripts consists of approximately 25 hours of Spanish read speech and transcripts. The read text was taken from the Spanish version of WikiProject Spoken Wikipedia, referred to as Wikipedia Grabada. The transcripts were devel...
Oct 14, 2021 - Linguistic Data Consortium
Tracey, Jennifer; Delgado, Dana; Chen, Song; Strassel, Stephanie, 2021, "BOLT Egyptian Arabic SMS/Chat Parallel Training Data", https://hdl.handle.net/11272.1/AB2/WXML9A, Abacus Data Network, V1
Abstract Introduction BOLT Egyptian Arabic SMS/Chat Parallel Training Data was developed by the Linguistic Data Consortium (LDC) and consists of approximately 723,000 tokens of Egyptian Arabic SMS/Chat data collected for the DARPA BOLT program along with their corresponding Engli...
Oct 14, 2021 - Linguistic Data Consortium
Alsheddi, Abeer, 2021, "Classical Arabic Dictionary", https://hdl.handle.net/11272.1/AB2/FQ7PIS, Abacus Data Network, V1
Abstract Introduction Classical Arabic Dictionary consists of approximately one hundred million words of Arabic collected from texts dating between 431 and 1104 CE, principally books and essays, along with word occurrences, source documents and related metadata. Data The dictiona...
Oct 14, 2021 - DMTI Spatial
DMTI Spatial Inc., 2020, "CanMap Address Points, v2020.4", https://hdl.handle.net/11272.1/AB2/HL7BV7, Abacus Data Network, V1
CanMap Address Points are unique and discrete representations of civic address assignments across Canada. It is the ultimate in answering the question of “where” and an anchor for a single source of accuracy in your mission-critical data. When building your location intelligence...
Oct 14, 2021 - DMTI Spatial
DMTI Spatial Inc., 2021, "CanMap Postal Code Suite, v2020.3", https://hdl.handle.net/11272.1/AB2/MPQ1LE, Abacus Data Network, V1
The CanMap Postal Code Suite is comprised of the following postal products: The CanMap Postal Code File - Multiple Enhanced Postal Code (MEP) product is a precision-based point file representing over 1 million postal codes across Canada. The Multiple Enhanced Postal Code product...
Oct 8, 2021 - Statistics Canada - DLI
Statistics Canada, 2021, "Postal Codes by Federal Ridings File (PCFRF) 2013 Representation Order, August 2021 Postal Codes, 2021", https://hdl.handle.net/11272.1/AB2/GI4245, Abacus Data Network, V1
The Postal Code Project is responsible for linking the approximately 900,000 single postal codes in Canada to Statistics Canada’s Census dissemination geography (presently 2016 Census geography). This process is performed by using data provided by Canada Post Corporation and lin...
Oct 7, 2021 - Campus and Community Planning
University of British Columbia. Campus and Community Planning, 2021, "[Orthophotos, University of British Columbia Vancouver Campus], 2021", https://hdl.handle.net/11272.1/AB2/R731P3, Abacus Data Network, V1
Orthorectified aerial imagery of the UBC Vancouver campus, 2021
Oct 6, 2021 - Campus and Community Planning
University of British Columbia. Campus and Community Planning, 2021, "[University of British Columbia Vancouver Campus Lidar], 2021", https://hdl.handle.net/11272.1/AB2/Y5KQNB, Abacus Data Network, V1
University of British Columbia Vancouver (formerly called Point Grey) campus lidar survey. Includes Pacific Spirit Park and Musqueam Reserve No.2 (southeast of UBC Campus).
Oct 1, 2021 - Linguistic Data Consortium
Bills, Aric; Conners, Thomas; David, Anne; Dubinski, Eyal; Fiscus, Jonathan G.; Gann, Ketty; Harper, Mary; Kazi, Michael; Lim, Lynn-Li; Malyska, Nicolas; Melot, Jennifer; Ray, Jessica; Rytting, Anton; Shen, Sinney; Smith, Rosanna, 2021, "IARPA Babel Mongolian Language Pack IARPA-babel401b-v2.0b", https://hdl.handle.net/11272.1/AB2/IFBL6A, Abacus Data Network, V1
Abstract Introduction IARPA Babel Mongolian Language Pack IARPA-babel401b-v2.0b was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 204 hours of Halh Mongolian conversational and scripted telephone speec...
Sep 29, 2021 - Linguistic Data Consortium
Andresen, Jess; Bills, Aric; Conners, Thomas; Dubinski, Eyal; Fiscus, Jonathan G.; Harper, Mary; Kozlov, Kirill; Malyska, Nicolas; Melot, Jennifer; Morrison, Michelle; Phillips, Josh; Ray, Jessica; Rytting, Anton; Shen, Wade; Silber, Ronnie; Tzoukermann, Evelyne; Wong, Jamie, 2021, "IARPA Babel Swahili Language Pack IARPA-babel202b-v1.0d", https://hdl.handle.net/11272.1/AB2/TNSSDU, Abacus Data Network, V2
Abstract Introduction IARPA Babel Swahili Language Pack IARPA-babel202b-v1.0d was developed by Appen for the IARPA (Intelligence Advanced Research Projects Activity) Babel program. It contains approximately 350 hours of Swahili conversational and scripted telephone speech collect...
Sep 29, 2021 - Linguistic Data Consortium
Tracey, Jennifer; Graff, David; Strassel, Stephanie; Arrigo, Michael; Wright, Jonathan; Bies, Ann, 2021, "LORELEI Oromo Incident Language Pack", https://hdl.handle.net/11272.1/AB2/EH7NXF, Abacus Data Network, V1
Abstract Introduction LORELEI Oromo Incident Language Pack was developed by the Linguistic Data Consortium and is comprised of approximately 3.9 million words of Oromo monolingual text, 25,000 words of English monolingual text, 135,000 words of parallel and comparable Oromo-Engli...
Sep 24, 2021 - Statistics Canada Open License
Statistics Canada, 2021, "Survey of Financial Security, 2019", https://hdl.handle.net/11272.1/AB2/B8A8ZH, Abacus Data Network, V1, UNF:6:+jkZpvTireJsb9/nPMFK0A== [fileUNF]
The purpose of the survey is to collect information from a sample of Canadian households on their assets, debts, employment, income and education. The SFS provides a comprehensive picture of the financial health of Canadians. Information is collected on the value of all major fin...
Sep 3, 2021 - Linguistic Data Consortium
Neergaard, Karl David; Xu, Hongzhi; Huang, Chu-Ren, 2021, "Database of Word Level Statistics - Mandarin", https://hdl.handle.net/11272.1/AB2/VJDPA0, Abacus Data Network, V1
Abstract Introduction Database of Word Level Statistics - Mandarin was developed by The Hong Kong Polytechnic University. It provides lexical characteristics of a descriptive and statistical nature for words and nonwords of Mandarin Chinese. It is designed for researchers particu...
Sep 3, 2021 - Linguistic Data Consortium
Knight, Kevin; Badarau, Bianca; Baranescu, Laura; Bonial, Claire; Bardocz, Madalina; Griffitt, Kira; Hermjakob, Ulf; Marcu, Daniel; Palmer, Martha; O'Gorman, Tim; Schneider, Nathan, 2021, "Abstract Meaning Representation (AMR) Annotation Release 3.0", https://hdl.handle.net/11272.1/AB2/82CVJF, Abacus Data Network, V1
Abstract Introduction Abstract Meaning Representation (AMR) Annotation Release 3.0 was developed by the Linguistic Data Consortium (LDC), SDL/Language Weaver, Inc., the University of Colorado's Computational Language and Educational Research group and the Information Sciences Ins...
Sep 3, 2021 - Linguistic Data Consortium
Sluyter-Gaethje, Henny; Bourgonje, Peter; Stede, Manfred, 2021, "Penn Discourse Treebank Version 2.0 - German Translation", https://hdl.handle.net/11272.1/AB2/1AXWBN, Abacus Data Network, V1
Abstract Introduction Penn Discourse Treebank Version 2.0 - German Translation was developed at the University of Potsdam's Applied Computational Linguistics group and consists of approximately one million tokens derived from Penn Discourse Treebank Version 2.0 (LDC2008T05). This...
Sep 3, 2021 - Linguistic Data Consortium
Ellis, Joe; Getman, Jeremy; Strassel, Stephanie, 2021, "TAC KBP English Surprise Slot Filling -- Comprehensive Training and Evaluation Data 2010", https://hdl.handle.net/11272.1/AB2/VAZOSD, Abacus Data Network, V1
Abstract Introduction TAC KBP English Surprise Slot Filling -- Comprehensive Training and Evaluation Data 2010 was developed by the Linguistic Data Consortium and contains training and evaluation data produced in support of the 2010 TAC KBP Surprise Slot Filling track, the only y...
Sep 3, 2021 - Linguistic Data Consortium
Ellis, Joe; Getman, Jeremy; Strassel, Stephanie, 2021, "TAC KBP English Sentiment Slot Filling -- Comprehensive Training and Evaluation Data 2013-2014", https://hdl.handle.net/11272.1/AB2/MRZALN, Abacus Data Network, V1
Abstract Introduction TAC KBP English Sentiment Slot Filling -- Comprehensive Training and Evaluation Data 2013-2014 was developed by the Linguistic Data Consortium and contains training and evaluation data produced in support of the 2013 and 2014 TAC KBP Sentiment Slot Filling tracks....
Sep 3, 2021 - Linguistic Data Consortium
Daza, Angel; Frank, Anette, 2021, "X-SRL: Parallel Cross-lingual Semantic Role Labeling", https://hdl.handle.net/11272.1/AB2/DNOJP9, Abacus Data Network, V1
Abstract Introduction X-SRL: Parallel Cross-lingual Semantic Role Labeling was developed by Heidelberg University, Department of Computational Linguistics and the Leibniz Institute for the German Language (IDS). It consists of approximately three million words of German, French a...
Sep 3, 2021 - Linguistic Data Consortium
Arase, Yuki; Tsujii, Junichi, 2021, "ESPADA", https://hdl.handle.net/11272.1/AB2/ANSK9Z, Abacus Data Network, V1
Abstract Introduction ESPADA (Extended Syntactic Phrase Alignment DAtaset) consists of annotated parse trees and alignment on English sentential paraphrases extracted from machine translation evaluation corpora. It extends SPADE (LDC2018T09) by adding new annotated data for train...
Sep 3, 2021 - Linguistic Data Consortium
Tracey, Jennifer; Delgado, Dana; Chen, Song; Strassel, Stephanie, 2021, "BOLT Chinese SMS/Chat Parallel Training Data", https://hdl.handle.net/11272.1/AB2/O3JTA9, Abacus Data Network, V1
Abstract Introduction BOLT Chinese SMS/Chat Parallel Training Data was developed by the Linguistic Data Consortium and consists of approximately 1.8 million tokens of Chinese SMS/Chat data collected for the DARPA BOLT program along with their corresponding English translations. Th...
Sep 3, 2021 - Linguistic Data Consortium
Li, Bin; Xiao, Liming; Liu, Yihuan; Wen, Yuan; Song, Li; Chun, Jayeol; Feng, Minxuan; Zhou, Junsheng; Qu, Weiguang; Xue, Nianwen, 2021, "Chinese Abstract Meaning Representation 2.0", https://hdl.handle.net/11272.1/AB2/LVQEZJ, Abacus Data Network, V1
Abstract Introduction Chinese Abstract Meaning Representation (CAMR) 2.0 was developed by Brandeis University and Nanjing Normal University and is comprised of semantic representations of a set of approximately 20,000 Chinese sentences from Chinese Treebank (CTB) 8.0 (LDC2013T21)...