Aug 29, 2023 - Linguistic Data Consortium
Ferraro, Francis; Thomas, Max; Gormley, Matthew R.; Wolfe, Travis; Harman, Craig; Van Durme, Benjamin, 2018, "Concretely Annotated English Gigaword", https://hdl.handle.net/11272.1/AB2/NQCDFR, Abacus Data Network, V2
Concretely Annotated English Gigaword was developed by Johns Hopkins University’s Human Language Technology Center of Excellence (JHU). It adds multiple kinds and instances of automatically-generated syntactic, semantic and coreference annotations to English Gigaword Fifth Editio... |
Aug 29, 2023 - Linguistic Data Consortium
Bu, Hui, 2018, "AISHELL-1", https://hdl.handle.net/11272.1/AB2/2WMDTT, Abacus Data Network, V2
AISHELL-1 was developed by Beijing Shell Shell Technology Co., Ltd. It contains approximately 520 hours of Chinese Mandarin speech from 400 speakers recorded simultaneously on three different devices with associated transcripts. The goal of the collection was to support speech re... |
Jun 9, 2021 - Linguistic Data Consortium
Li, Bin; Yin, Siqi; Xu, Jie; Song, Li; Feng, Minxuan, 2020, "Chinese CogBank", https://hdl.handle.net/11272.1/AB2/XQKHRG, Abacus Data Network, V1
Introduction Chinese CogBank is a database of cognitive properties of Chinese words intended for use in metaphor understanding and generation. It consists of 232,497 "word-property" pairs, which are comprised of 83,104 words and 100,195 properties. Each "word-property" t... |
Nov 15, 2019 - Linguistic Data Consortium
Ellis, Joe; Getman, Jeremy; Strassel, Stephanie, 2019, "TAC KBP Cold Start - Comprehensive Evaluation Data 2012-2017", https://hdl.handle.net/11272.1/AB2/KQWRTL, Abacus Data Network, V1
TAC KBP Cold Start - Comprehensive Evaluation Data 2012-2017 was developed by the Linguistic Data Consortium (LDC) and contains Chinese, English and Spanish data produced in support of the TAC KBP Cold Start evaluation track conducted from 2012 to 2017. This includes source docum... |
Aug 15, 2019 - Linguistic Data Consortium
Ellis, Joe; Getman, Jeremy; Strassel, Stephanie, 2019, "TAC KBP Evaluation Source Corpora 2016-2017", https://hdl.handle.net/11272.1/AB2/JDNLHX, Abacus Data Network, V1
TAC KBP Evaluation Source Corpora 2016-2017 was developed by the Linguistic Data Consortium (LDC) and contains the 180,003 Chinese, English and Spanish source documents used in support of all TAC KBP evaluation tracks conducted in 2016 and 2017. Text Analysis Conference (TAC) is... |
Jul 15, 2019 - Linguistic Data Consortium
Qin, Xiaoyi; Liu, Xinzhong; Cai, Zexin; Li, Ming, 2019, "The DKU-JNU-EMA Electromagnetic Articulography Database", https://hdl.handle.net/11272.1/AB2/D9PQFH, Abacus Data Network, V1
The DKU-JNU-EMA Electromagnetic Articulography Database was developed by Duke Kunshan University and Jinan University and contains approximately 10 hours of articulography and speech data in Mandarin, Cantonese, Hakka, and Teochew Chinese from two to seven native speakers for eac... |
May 15, 2019 - Linguistic Data Consortium
Ellis, Joe; Getman, Jeremy; Strassel, Stephanie, 2019, "TAC KBP Chinese Regular Slot Filling - Comprehensive Training and Evaluation Data 2014", https://hdl.handle.net/11272.1/AB2/ZZMOPP, Abacus Data Network, V1
TAC KBP Chinese Regular Slot Filling - Comprehensive Training and Evaluation Data 2014 was developed by the Linguistic Data Consortium (LDC) and contains training and evaluation data produced in support of the TAC KBP Chinese Regular Slot Filling evaluation track conducted in 201... |
Apr 15, 2019 - Linguistic Data Consortium
Li, Bin; Wen, Yuan; Song, Li; Dai, Rubing; Qu, Weiguang; Xue, Nianwen, 2019, "Chinese Abstract Meaning Representation 1.0", https://hdl.handle.net/11272.1/AB2/TT5KRI, Abacus Data Network, V1
Chinese Abstract Meaning Representation was developed by Brandeis University and Nanjing Normal University and is comprised of semantic representations of a set of Chinese sentences from Chinese Treebank 8.0 (LDC2013T21). Abstract Meaning Representation (AMR) captures "who is doi... |
Jan 15, 2019 - Linguistic Data Consortium
Ellis, Joe; Getman, Jeremy; Strassel, Stephanie, 2019, "TAC KBP Entity Discovery and Linking - Comprehensive Training and Evaluation Data 2014-2015", https://hdl.handle.net/11272.1/AB2/LCPM63, Abacus Data Network, V1
TAC KBP Entity Discovery and Linking - Comprehensive Training and Evaluation Data 2014-2015 was developed by the Linguistic Data Consortium (LDC) and contains training and evaluation data produced in support of the TAC KBP Entity Discovery and Linking (EDL) tasks in 2014 and 2015... |
Dec 17, 2018 - Linguistic Data Consortium
Linguistic Data Consortium, 2018, "HUB5 Mandarin Telephone Speech and Transcripts Second Edition", https://hdl.handle.net/11272.1/AB2/2JAJJE, Abacus Data Network, V1
HUB5 Mandarin Telephone Speech and Transcripts Second Edition was developed by the Linguistic Data Consortium (LDC) in support of US government projects for language recognition and Large Vocabulary Conversational Speech Recognition (LVCSR). The first edition was released by LDC... |
Dec 15, 2018 - Linguistic Data Consortium
Zhong, Victor; Zhang, Yuhao; Chen, Danqi; Angeli, Gabor; Manning, Christopher, 2018, "TAC Relation Extraction Dataset", https://hdl.handle.net/11272.1/AB2/SOYGGB, Abacus Data Network, V1
TAC Relation Extraction Dataset (TACRED) was developed by The Stanford NLP Group and is a large-scale relation extraction dataset with 106,264 examples built over English newswire and web text used in the NIST TAC KBP English slot filling evaluations during the period 2009-2014.... |
Oct 15, 2018 - Linguistic Data Consortium
Ellis, Joe; Getman, Jeremy; Strassel, Stephanie, 2018, "TAC KBP English Regular Slot Filling - Comprehensive Training and Evaluation Data 2009-2014", https://hdl.handle.net/11272.1/AB2/B3R0J4, Abacus Data Network, V1
TAC KBP English Regular Slot Filling - Comprehensive Training and Evaluation Data 2009-2014 was developed by the Linguistic Data Consortium (LDC) and contains training and evaluation data produced in support of the TAC KBP Slot Filling evaluation track conducted from 2009 to 2014... |
Jul 16, 2018 - Linguistic Data Consortium
Linguistic Data Consortium, 2018, "CALLFRIEND Mandarin Chinese-Mainland Dialect Second Edition", https://hdl.handle.net/11272.1/AB2/88OSWL, Abacus Data Network, V1
CALLFRIEND Mandarin Chinese-Mainland Dialect Second Edition was developed by the Linguistic Data Consortium (LDC) and consists of approximately 24 hours of unscripted telephone conversations between native speakers of the Mandarin Chinese dialect spoken in mainland China. This se... |
Jun 15, 2018 - Linguistic Data Consortium
Ellis, Joe; Getman, Jeremy; Strassel, Stephanie, 2018, "TAC KBP English Entity Linking - Comprehensive Training and Evaluation Data 2009-2013", https://hdl.handle.net/11272.1/AB2/SRPNPS, Abacus Data Network, V1
TAC KBP English Entity Linking - Comprehensive Training and Evaluation Data 2009-2013 was developed by the Linguistic Data Consortium (LDC) and contains training and evaluation data produced in support of the TAC KBP English Entity Linking tasks in 2009, 2010, 2011, 2012, and 201... |
Mar 15, 2018 - Linguistic Data Consortium
Tracey, Jennifer; Graff, David; Strassel, Stephanie; Ma, Xiaoyi; Wright, Jonathan, 2018, "LORELEI Somali Representative Language Pack - Monolingual and Parallel Text", https://hdl.handle.net/11272.1/AB2/75GGBX, Abacus Data Network, V1
LORELEI Somali Representative Language Pack - Monolingual and Parallel Text was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 13 million words of monolingual Somali text, approximately 800,000 of which are translated into English. Another 100... |
Feb 16, 2018 - Linguistic Data Consortium
Tracey, Jennifer; Graff, David; Strassel, Stephanie; Ma, Xiaoyi; Wright, Jonathan, 2018, "LORELEI Amharic Representative Language Pack - Monolingual and Parallel Text", https://hdl.handle.net/11272.1/AB2/5TNZPX, Abacus Data Network, V1
Introduction LORELEI Amharic Representative Language Pack - Monolingual and Parallel Text was developed by the Linguistic Data Consortium and is comprised of approximately 25 million words of monolingual Amharic text, approximately 600,000 of which are translated into English. An... |
Dec 15, 2017 - Linguistic Data Consortium
Walker, Kevin; Caruso, Christopher; Maeda, Kazuaki; DiPersio, Denise; Strassel, Stephanie, 2017, "GALE Phase 4 Chinese Broadcast News Speech", https://hdl.handle.net/11272.1/AB2/4ADDAM, Abacus Data Network, V1
Introduction GALE Phase 4 Chinese Broadcast News Speech was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 134 hours of Mandarin Chinese broadcast news speech collected in 2008 by LDC and Hong Kong University of Science and Technology (HKUST), Hong... |
Nov 17, 2017 - Linguistic Data Consortium
Ellis, Joe; Getman, Jeremy; Strassel, Stephanie, 2017, "TAC KBP Chinese Cross-lingual Entity Linking - Comprehensive Training and Evaluation Data 2011-2014", https://hdl.handle.net/11272.1/AB2/XOE0NF, Abacus Data Network, V1
TAC KBP Chinese Cross-lingual Entity Linking - Comprehensive Training and Evaluation Data 2011-2014 was developed by the Linguistic Data Consortium and contains training and evaluation data produced in support of the TAC KBP Chinese Cross-lingual Entity Linking tasks in 2011, 201... |
Oct 18, 2017 - Linguistic Data Consortium
Chen, Xiaohe; Li, Bin; Feng, Minxuan; Xu, Chao; Xu, Runhua; Shi, Min; Yu, Lili; Xiao, Lei; Wang, Qingqing, 2017, "Ancient Chinese Corpus", https://hdl.handle.net/11272.1/AB2/4HYBFE, Abacus Data Network, V1
Ancient Chinese Corpus was developed at Nanjing Normal University. It contains word-segmented and part-of-speech tagged text from Zuozhuan, an ancient Chinese work believed to date from the Warring States Period (475-221 BC). Zuozhuan is a commentary on the Chunqiu, a history of... |
Sep 14, 2017 - Linguistic Data Consortium
Xue, Nianwen; Ng, Hwee Tou; Pradhan, Sameer; Rutherford, Attapol T.; Webber, Bonnie; Wang, Chuan; Wang, Hong Min; Prasad, Rashmi, 2017, "2015-2016 CoNLL Shared Task", https://hdl.handle.net/11272.1/AB2/TSNLNO, Abacus Data Network, V1
2015-2016 CoNLL Shared Task, LDC Catalog Number LDC2017T13 and ISBN 1-58563-812-9, contains the Chinese and English training, development and test data for the 2015 and 2016 CoNLL (Conference on Computational Natural Language Learning) Shared Task Evaluation which focused on shal... |
Jun 15, 2017 - Linguistic Data Consortium
Knight, Kevin; Badarau, Bianca; Baranescu, Laura; Bonial, Claire; Bardocz, Madalina; Griffitt, Kira; Hermjakob, Ulf; Marcu, Daniel; Palmer, Martha; O'Gorman, Tim; Schneider, Nathan, 2017, "Abstract Meaning Representation (AMR) Annotation Release 2.0", https://hdl.handle.net/11272.1/AB2/8MN4GE, Abacus Data Network, V1
Abstract Meaning Representation (AMR) Annotation Release 2.0 was developed by the Linguistic Data Consortium (LDC), SDL/Language Weaver, Inc., the University of Colorado’s Computational Language and Educational Research group and the Information Sciences Institute at the Universi... |
May 15, 2017 - Linguistic Data Consortium
Huang, Ruihong; Jurafsky, Daniel; Riloff, Ellen, 2017, "The EventStatus Corpus", https://hdl.handle.net/11272.1/AB2/EGUSOP, Abacus Data Network, V1
Introduction The EventStatus Corpus was developed by researchers at Texas A&M University, Stanford University and The University of Utah. It consists of approximately 3,000 English and 1,500 Spanish news articles about civil unrest events annotated with temporal tags. This corpus... |
Mar 17, 2017 - Linguistic Data Consortium
Li, Xuansong; Grimes, Stephen; Strassel, Stephanie; Ma, Xiaoyi; Xue, Nianwen; Marcus, Mitch; Taylor, Ann, 2017, "GALE English-Chinese Parallel Aligned Treebank -- Training", https://hdl.handle.net/11272.1/AB2/QROJQB, Abacus Data Network, V1
Introduction GALE English-Chinese Parallel Aligned Treebank – Training was developed by the Linguistic Data Consortium (LDC) and contains 196,123 tokens of word aligned English and Chinese parallel text with treebank annotations. This material was used as training data in the DAR... |
Mar 17, 2017 - Linguistic Data Consortium
Song, Zhiyi; Garland, Jennifer; Walker, Christopher; Strassel, Stephanie, 2017, "BOLT Chinese Discussion Forum Parallel Training Data", https://hdl.handle.net/11272.1/AB2/EWIO27, Abacus Data Network, V1
Introduction BOLT Chinese Discussion Forum Parallel Training Data was developed by the Linguistic Data Consortium (LDC) and consists of 1,876,799 tokens of Chinese discussion forum data collected for the DARPA BOLT program along with their corresponding English translations. The... |
Dec 15, 2016 - Linguistic Data Consortium
Ellis, Joe; Getman, Jeremy; Strassel, Stephanie, 2016, "TAC KBP Spanish Cross-lingual Entity Linking - Comprehensive Training and Evaluation Data 2012-2014", https://hdl.handle.net/11272.1/AB2/HL83QO, Abacus Data Network, V1
Introduction TAC KBP Spanish Cross-Lingual Entity Linking - Comprehensive Training and Evaluation Data 2012-2014 was developed by the Linguistic Data Consortium (LDC) and contains training and evaluation data produced in support of the TAC KBP Spanish Cross-lingual Entity Linking... |
Nov 15, 2016 - Linguistic Data Consortium
Song, Zhiyi; Krug, Gary; Strassel, Stephanie, 2016, "GALE Phase 3 and 4 Chinese Newswire Parallel Text", https://hdl.handle.net/11272.1/AB2/KYZUJ0, Abacus Data Network, V1
Introduction GALE Phase 3 and 4 Chinese Newswire Parallel Text was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phases 3 and 4 of the DARPA GALE (Global Autonomous Language Exploitation)... |
Oct 19, 2016 - Linguistic Data Consortium
O'Gorman, Tim; Palmer, Martha, 2016, "Richer Event Description", https://hdl.handle.net/11272.1/AB2/H5RQJH, Abacus Data Network, V1
Introduction Richer Event Description was developed by the University of Colorado Boulder-CLEAR (Computational Language and Education Research), Carnegie Mellon University and LDC. It consists of coreference, bridging and event-event relations (temporal, causal, subevent and repor... |
Jul 15, 2016 - Linguistic Data Consortium
Song, Zhiyi; Krug, Gary; Jiang, Zixin; Strassel, Stephanie, 2016, "GALE Phase 3 and 4 Chinese Broadcast News Parallel Text", https://hdl.handle.net/11272.1/AB2/CE2DP3, Abacus Data Network, V1
Introduction GALE Phase 3 and 4 Chinese Broadcast News Parallel Text was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phases 3 and 4 of the DARPA GALE (Global Autonomous Language Exploit... |
Jun 15, 2016 - Linguistic Data Consortium
Xue, Nianwen; Zhang, Xiuhong; Jiang, Zixin; Palmer, Martha; Xia, Fei; Chiou, Fu-Dong; Chang, Meiyu, 2016, "Chinese Treebank 9.0", https://hdl.handle.net/11272.1/AB2/YYY4FY, Abacus Data Network, V1
Introduction Chinese Treebank 9.0 consists of approximately two million words of annotated and parsed text from Chinese newswire, government documents, magazine articles, various broadcast news and broadcast conversation programs, web newsgroups, weblogs, discussion forums, chat... |
May 16, 2016 - Linguistic Data Consortium
Glenn, Meghan; Lee, Haejoong; Strassel, Stephanie; Maeda, Kazuaki, 2016, "GALE Phase 4 Chinese Broadcast Conversation Transcripts", https://hdl.handle.net/11272.1/AB2/QOKU34, Abacus Data Network, V1
Introduction GALE Phase 4 Chinese Broadcast Conversation Transcripts was developed by the Linguistic Data Consortium (LDC) and contains transcriptions of approximately 172 hours of Chinese broadcast conversation speech collected in 2008 by LDC and Hong Kong University of Science... |
May 16, 2016 - Linguistic Data Consortium
Walker, Kevin; Caruso, Christopher; Maeda, Kazuaki; DiPersio, Denise; Strassel, Stephanie, 2016, "GALE Phase 4 Chinese Broadcast Conversation Speech", https://hdl.handle.net/11272.1/AB2/Y6ZKMX, Abacus Data Network, V1
Introduction GALE Phase 4 Chinese Broadcast Conversation Speech was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 172 hours of Mandarin Chinese broadcast conversation speech collected in 2008 by LDC and Hong Kong University of Science and Tec... |
Mar 15, 2016 - Linguistic Data Consortium
Song, Zhiyi; Krug, Gary; Strassel, Stephanie, 2016, "GALE Phase 3 and 4 Chinese Broadcast Conversation Parallel Text", https://hdl.handle.net/11272.1/AB2/JVLMY4, Abacus Data Network, V1
Introduction GALE Phase 3 and 4 Chinese Broadcast Conversation Parallel Text was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phases 3 and 4 of the DARPA GALE (Global Autonomous Language... |
Mar 15, 2016 - Linguistic Data Consortium
Glenn, Meghan; Tracey, Jennifer; Fore, Dana; Strassel, Stephanie, 2016, "DEFT Narrative Text", https://hdl.handle.net/11272.1/AB2/L4XDPK, Abacus Data Network, V1
Introduction DEFT Narrative Text was developed by the Linguistic Data Consortium (LDC) and contains proxy reports and their source newswire used to support DARPA's Deep Exploration and Filtering of Text (DEFT) program. Among the goals of the DEFT program is to develop technologie... |
Feb 15, 2016 - Linguistic Data Consortium
Tracey, Jennifer; Lee, Haejoong; Strassel, Stephanie; Song, Zhiyi, 2016, "BOLT Chinese Discussion Forums", https://hdl.handle.net/11272.1/AB2/YW237I, Abacus Data Network, V1
Introduction BOLT Chinese Discussion Forums was developed by the Linguistic Data Consortium (LDC) and consists of 1,597,500 discussion forum threads in Chinese harvested from the Internet using a combination of manual and automatic processes. The DARPA BOLT (Broad Operational Lan... |
Jan 18, 2016 - Linguistic Data Consortium
Song, Zhiyi; Krug, Gary; Strassel, Stephanie, 2016, "GALE Phase 4 Chinese Weblog Parallel Sentences", https://hdl.handle.net/11272.1/AB2/EHYDWM, Abacus Data Network, V1
Introduction GALE Phase 4 Chinese Weblog Parallel Sentences was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phase 4 of the DARPA GALE (Global Autonomous Language Exploitation) Program.... |
Dec 15, 2015 - Linguistic Data Consortium
Glenn, Meghan; Lee, Haejoong; Strassel, Stephanie; Maeda, Kazuaki, 2015, "GALE Phase 3 Chinese Broadcast News Transcripts", https://hdl.handle.net/11272.1/AB2/1FSF7R, Abacus Data Network, V1
Introduction GALE Phase 3 Chinese Broadcast News Transcripts was developed by the Linguistic Data Consortium (LDC) and contains transcriptions of approximately 150 hours of Chinese broadcast news speech collected in 2007 and 2008 by LDC and Hong Kong University of Science and Technolo... |
Dec 15, 2015 - Linguistic Data Consortium
Walker, Kevin; Caruso, Christopher; Maeda, Kazuaki; DiPersio, Denise; Strassel, Stephanie, 2015, "GALE Phase 3 Chinese Broadcast News Speech", https://hdl.handle.net/11272.1/AB2/S6DMZM, Abacus Data Network, V1
Introduction GALE Phase 3 Chinese Broadcast News Speech was developed by the Linguistic Data Consortium (LDC) and is comprised of approximately 150 hours of Mandarin Chinese broadcast news speech collected in 2007 and 2008 by LDC and Hong Kong University of Science and Technology (HKU... |
Nov 16, 2015 - Linguistic Data Consortium
Song, Zhiyi; Krug, Gary; Strassel, Stephanie, 2015, "GALE Phase 4 Chinese Newswire Parallel Sentences", https://hdl.handle.net/11272.1/AB2/WAB9QW, Abacus Data Network, V1
Introduction GALE Phase 4 Chinese Newswire Parallel Sentences was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phase 4 of the DARPA GALE (Global Autonomous Language Exploitation) Program... |
Oct 15, 2015 - Linguistic Data Consortium
Walker, Christopher; Song, Zhiyi; Maeda, Kazuaki, 2015, "ACE 2007 Spanish DevTest - Pilot Evaluation", https://hdl.handle.net/11272.1/AB2/GBDW4K, Abacus Data Network, V1
Introduction ACE 2007 Spanish DevTest was developed by the Linguistic Data Consortium (LDC). This publication contains the complete set of Spanish development and test data to support the 2007 Automatic Content Extraction (ACE) technology evaluation, namely, newswire data annotat... |
Oct 15, 2015 - Linguistic Data Consortium
Song, Zhiyi; Krug, Gary; Strassel, Stephanie, 2015, "GALE Phase 4 Chinese Broadcast News Parallel Sentences", https://hdl.handle.net/11272.1/AB2/RDRUFI, Abacus Data Network, V1
Introduction GALE Phase 4 Chinese Broadcast News Parallel Sentences was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phase 4 of the DARPA GALE (Global Autonomous Language Exploitation) P... |
Sep 15, 2015 - Linguistic Data Consortium
Li, Xuansong; Grimes, Stephen; Strassel, Stephanie, 2015, "GALE Chinese-English Word Alignment and Tagging -- Broadcast Training Part 4", https://hdl.handle.net/11272.1/AB2/DFHQD1, Abacus Data Network, V1
Introduction GALE Chinese-English Word Alignment and Tagging -- Broadcast Training Part 4 was developed by the Linguistic Data Consortium (LDC) and contains 243,038 tokens of word aligned Chinese and English parallel text enriched with linguistic tags. This material was used as t... |
Jun 15, 2015 - Linguistic Data Consortium
Song, Zhiyi; Krug, Gary; Strassel, Stephanie, 2015, "GALE Phase 4 Chinese Broadcast Conversation Parallel Sentences", https://hdl.handle.net/11272.1/AB2/X1AKBI, Abacus Data Network, V1
Introduction GALE Phase 4 Chinese Broadcast Conversation Parallel Sentences was developed by the Linguistic Data Consortium (LDC). Along with other corpora, the parallel text in this release comprised training data for Phase 4 of the DARPA GALE (Global Autonomous Language Exploit... |
May 15, 2015 - Linguistic Data Consortium
Glenn, Meghan; Lee, Haejoong; Strassel, Stephanie; Maeda, Kazuaki, 2015, "GALE Phase 3 Chinese Broadcast Conversation Transcripts Part 2", https://hdl.handle.net/11272.1/AB2/BS5SXB, Abacus Data Network, V1
Introduction GALE Phase 3 Chinese Broadcast Conversation Transcripts Part 2 was developed by the Linguistic Data Consortium (LDC) and contains transcriptions of approximately 112 hours of Chinese broadcast conversation speech collected in 2007 and 2008 by LDC and Hong Kong University... |
Apr 15, 2015 - Linguistic Data Consortium
Yuan, Jiahong; Ryant, Neville; Liberman, Mark, 2015, "Mandarin Chinese Phonetic Segmentation and Tone", https://hdl.handle.net/11272.1/AB2/HW9PE3, Abacus Data Network, V1
Mandarin Chinese Phonetic Segmentation and Tone was developed by the Linguistic Data Consortium (LDC) and contains 7,849 Mandarin Chinese "utterances" and their phonetic segmentation and tone labels separated into training and test sets. The utterances were derived from 1997 Mand... |
Mar 16, 2015 - Linguistic Data Consortium
Li, Xuansong; Grimes, Stephen; Strassel, Stephanie; Ma, Xiaoyi; Xue, Nianwen; Marcus, Mitch; Taylor, Ann, 2015, "GALE Chinese-English Parallel Aligned Treebank -- Training", https://hdl.handle.net/11272.1/AB2/R6YEEW, Abacus Data Network, V1
GALE Chinese-English Parallel Aligned Treebank -- Training was developed by the Linguistic Data Consortium (LDC) and contains 229,249 tokens of word aligned Chinese and English parallel text with treebank annotations. This material was used as training data in the DARPA GALE (Glo... |
Feb 16, 2015 - Linguistic Data Consortium
Li, Xuansong; Grimes, Stephen; Strassel, Stephanie, 2015, "GALE Chinese-English Word Alignment and Tagging -- Broadcast Training Part 3", https://hdl.handle.net/11272.1/AB2/NWG5BA, Abacus Data Network, V1
Introduction GALE Chinese-English Word Alignment and Tagging -- Broadcast Training Part 3 was developed by the Linguistic Data Consortium (LDC) and contains 242,020 tokens of word aligned Chinese and English parallel text enriched with linguistic tags. This material was used as t... |
Oct 16, 2013 - Linguistic Data Consortium
Weischedel, Ralph; Palmer, Martha; Marcus, Mitchell; Hovy, Eduard; Pradhan, Sameer; Ramshaw, Lance; Xue, Nianwen; Taylor, Ann; Kaufman, Jeff; Franchini, Michelle; El-Bachouti, Mohammed; Belvin, Robert; Houston, Ann, 2013, "OntoNotes Release 5.0", https://hdl.handle.net/11272.1/AB2/MKJJ2R, Abacus Data Network, V1
OntoNotes Release 5.0 is the final release of the OntoNotes project, a collaborative effort between BBN Technologies, the University of Colorado, the University of Pennsylvania and the University of Southern California's Information Sciences Institute. The goal of the project was... |
Sep 24, 2013 - UBC Library licensed data
International Monetary Fund, 2013, "Balance of Payments Statistics, 1967- [v6, August 2012 - ]", https://hdl.handle.net/11272.1/AB2/QEA9MT, Abacus Data Network, V1
The BOPS database provides users with over 180,000 time series of quarterly and annual balance of payments data for more than 180 countries, jurisdictions, or other reporting entities. These data are also published in the Balance of Payments Statistics Yearbook (BOPSY), but the n... |
Sep 24, 2013 - UBC Library licensed data
International Monetary Fund, 2013, "Direction of Trade Statistics, 1980-", https://hdl.handle.net/11272.1/AB2/YFV8QM, Abacus Data Network, V1
Direction of Trade Statistics (DOTS) present, for most member countries of the International Monetary Fund (the Fund), current figures on the value of merchandise exports and imports disaggregated according to their most important trading partners. Direction of trade statistics (... |
Jul 31, 2012 - UBC Library licensed data
International Monetary Fund, 2012, "Balance of Payments Statistics, 1967-, [v5, ? - July 2012]", https://hdl.handle.net/11272.1/AB2/FUONVO, Abacus Data Network, V1
The BOPS database provides users with over 180,000 time series of quarterly and annual balance of payments data for more than 180 countries, jurisdictions, or other reporting entities. These data are also published in the Balance of Payments Statistics Yearbook (BOPSY), but the n... |