大规模语料 (translation results)
Related phrases
  large-scale corpus
     New word identification based on large-scale corpus
     基于大规模语料的新词语识别方法
     An experimental study of Chinese word clustering on a large-scale corpus
     基于大规模语料的中文词聚类研究与实现
     Based on statistics over a large-scale corpus, this paper built a Chinese name identification knowledge base and presented a method of person name recognition combining statistics and rules.
     本文在对大规模语料统计的基础上,建立了一个人名识别的知识库,提出了一种统计和规则相结合的人名识别方法。
     It constructs a trigram statistical language model based on a large-scale corpus, and builds the corresponding binary tree for the sentence to be processed;
     基于大规模语料,利用三元模型Trigram,建立统计语言模型; 基于SLM为待处理句子生成相应的二叉树;
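The example above describes building a trigram statistical language model from a large-scale corpus. A minimal sketch of the counting step is below; the add-alpha smoothing and the sentence-boundary markers are my assumptions (the abstract does not specify them), and the binary-tree construction it mentions is omitted.

```python
from collections import defaultdict

def train_trigram(sentences):
    """Count trigram and bigram frequencies from tokenized sentences."""
    tri = defaultdict(int)
    bi = defaultdict(int)
    for sent in sentences:
        toks = ["<s>", "<s>"] + sent + ["</s>"]  # assumed boundary markers
        for i in range(len(toks) - 2):
            bi[(toks[i], toks[i + 1])] += 1
            tri[(toks[i], toks[i + 1], toks[i + 2])] += 1
    return tri, bi

def trigram_prob(tri, bi, w1, w2, w3, vocab_size, alpha=1.0):
    """P(w3 | w1, w2) with add-alpha smoothing (an assumed smoothing choice)."""
    return (tri[(w1, w2, w3)] + alpha) / (bi[(w1, w2)] + alpha * vocab_size)
```

On a real large-scale corpus the counts would come from millions of sentences; smoothing matters because most trigrams never occur even in a large corpus.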
     This paper builds a knowledge base for person name recognition based on a large-scale corpus and presents a method of person name recognition combining statistics and rules. The knowledge base includes Chinese surnames and their statistical probabilities, Chinese name characters and their usage probabilities, and name prefixes, suffixes, and leading words.
     本文在对大规模语料统计的基础上,建立了一个包含人名姓氏及其统计概率、人名用字及其统计概率、人名前后缀和人名引导词等资源的人名识别的知识库,提出了一种统计和规则相结合的人名识别方法。
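The statistics-plus-rules idea in the abstract above can be sketched as follows. All the table values, words, and the threshold here are hypothetical toy numbers; a real system would estimate the probabilities from a large-scale corpus, as the abstract describes.

```python
# Hypothetical toy tables; a real system estimates these from a corpus.
SURNAME_PROB = {"王": 0.07, "李": 0.08, "张": 0.07}
NAME_CHAR_PROB = {"伟": 0.02, "芳": 0.015, "军": 0.01}
LEAD_WORDS = {"记者", "教授", "先生"}  # words that often precede a person name

def name_score(cand, prev_word=None):
    """Score a 2-3 character string as surname + given-name characters."""
    if len(cand) < 2 or cand[0] not in SURNAME_PROB:
        return 0.0
    score = SURNAME_PROB[cand[0]]
    for ch in cand[1:]:
        score *= NAME_CHAR_PROB.get(ch, 1e-3)  # back-off for unseen characters
    if prev_word in LEAD_WORDS:  # rule: a leading word boosts confidence
        score *= 10
    return score

def is_name(cand, prev_word=None, threshold=1e-4):
    return name_score(cand, prev_word) >= threshold
```

The rule component (the leading-word boost) lets a low-probability candidate like an unseen given-name character still be accepted in a supportive context, which is the combination the abstract argues for.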
  large scale corpus
     The Algorithm Design and Realization to Calculate The Mutual Information of Four-Word-String in Large Scale Corpus
     计算大规模语料中四字词串互信息的算法设计
     The former must be based on an adequate formal research of the language, while the latter is based on the analysis and statistics of large scale corpus.
     前者必须建立在对自然语言充分的形式化研究的基础之上 ,后者则以对大规模语料的分析和统计为基础。
     Based on a large-scale corpus, this paper extracts and analyzes the usage frequencies of characters in Chinese surnames and given names, studies an evaluation function for Chinese name recognition, and dynamically builds the statistical parameter table and recognition threshold.
     本文在大规模语料基础上提取和分析了中文姓氏和名字用字的使用频率,研究了中文姓名识别的评价函数,动态地建立了姓名识别统计数据表和姓名阈值。
     It uses the ratio of relative word rank (RRWR) to pre-filter potential collocations, applies linguistic knowledge to constrain the part of speech of candidate collocations, and uses statistical models such as mutual information to extract collocations from a large-scale corpus. The results show an average accuracy of 84.73%, 4.7% higher than Xtract and 50.79% higher than comparable work on Chinese collocation extraction.
     引入相对词序比(RRWR)的方法对候选搭配词语进行筛选,应用语言学中词语搭配组合规律对候选搭配的词性进行限定,利用互信息等统计学模型在大规模语料中进行词语搭配的自动抽取,抽取的搭配平均准确率为84.73%,较Xtract系统高4.7%,较国内同类工作结果高50.79%。
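The abstract above combines RRWR filtering, part-of-speech constraints, and mutual information. The sketch below covers only the mutual-information step, under the simplifying assumption that candidates are adjacent word pairs; the RRWR and POS stages are omitted, and `min_count` is a hypothetical frequency cutoff.

```python
import math
from collections import Counter

def pmi_collocations(tokenized_sents, min_count=2):
    """Rank adjacent word pairs by pointwise mutual information."""
    uni = Counter()
    pairs = Counter()
    total = 0
    for sent in tokenized_sents:
        uni.update(sent)
        total += len(sent)
        pairs.update(zip(sent, sent[1:]))  # adjacent-pair candidates only
    n_pairs = sum(pairs.values())
    scores = {}
    for (w1, w2), c in pairs.items():
        if c < min_count:  # drop rare pairs: PMI is unreliable for them
            continue
        p_pair = c / n_pairs
        p1, p2 = uni[w1] / total, uni[w2] / total
        scores[(w1, w2)] = math.log2(p_pair / (p1 * p2))
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

High PMI means the two words co-occur far more often than their individual frequencies would predict, which is the statistical signal of a collocation.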
  Bilingual examples in which "大规模语料" is translated by an undetermined term
     Computing statistics over a large-scale language corpus, discovering linguistic phenomena, and building statistical language models are common methods in linguistic and computational-linguistic research.
     对大规模语料进行统计,发现一些语言现象和建立统计语言模型,是语言学和计算语言学研究常用的方法。
     In recent years, advances in computational technology have made it possible to study collocations on the basis of large corpora and in broader areas.
     现代语料库技术和计算机技术大大推动了搭配研究,尤其是在大规模语料中开展词语搭配研究。
     Different estimation methods of the probabilities of sparse events for the computation of the entropy in large scale modern Chinese text are applied in this paper.
     本文在大规模语料的基础上，利用语言模型中稀疏事件的概率估计方法对汉语的熵进行计算，并讨论了语料规模等因素对熵的影响。
     In recent years, the mainstream of the Chinese information processing community has depended heavily on corpus-based methods, making full use of the statistical relationships among words.
     到目前为止 ,中文信息处理主要依赖于对大规模语料的统计 ,根据概率 ,对词与词的关系作出界定。
     The more investigators study the Chinese language, the more they realize the importance of meaning constraints in the language; a new trend is to attack these difficulties by means of statistics over large-scale corpora.
     新的发展趋势是研究者越来越注重语义在语言结构和语言表达上的制约作用,试图用统计大规模语料为手段来攻克难关。
To illustrate the actual usage of the query term and its translations in idiomatic English, the following example sentences are taken from original English texts.
  large-scale corpus
Our large-scale corpus investigation reveals that PP-internal focus particles are a genuine possibility, not only in English, but also in Dutch and, to a lesser extent, German.
      
Using data from a large-scale corpus, this paper establishes the claim that in Japanese rap rhymes, the degree of similarity of two consonants positively correlates with their likelihood of making a rhyme pair.
      
To our knowledge, it was the first attempt to build a large-scale corpus of German text at that time.
      
These results provide primary verification of the initial theoretical claims, a necessary foundation for a large-scale corpus study.
      



Corpus linguistics requires a very large corpus. With the rapid development of electronic publishing in recent years, huge amounts of computer-readable text for building very large corpora have become available. However, the collected raw texts are miscellaneous and must be classified by domain before further processing. Classifying texts by hand is a tedious job, so this paper presents a new approach for classifying raw texts automatically. The correlation coefficient of texts, defined in this paper, can be applied to classify texts: the accuracy rate of classifying texts into 21 classes is 93%.

语料库语言学的发展要求语料库的规模越来越大。随着电子出版业的迅速发展,获取大量机读文本建立大规模语料库已成为可能。但是收集来的粗语料是杂乱无章的,在作加工整理前必须分类。若用手工分类则工作量很大。本文介绍了一种语料自动分类办法。它采用文中提出的语料相关系数的概念,并利用不同类语料相关系数不同的特点进行分类,取得了93%的大类分类正确率。
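The abstract above classifies raw texts by a "correlation coefficient of texts" but does not give its definition here. One plausible reading, sketched below under that assumption, is the Pearson correlation between word-frequency vectors over a shared vocabulary; the vocabulary, profiles, and class names are all hypothetical.

```python
from collections import Counter

def freq_vector(tokens, vocab):
    """Word-frequency vector of a document over a fixed vocabulary."""
    c = Counter(tokens)
    return [c[w] for w in vocab]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x) ** 0.5
    vy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (vx * vy) if vx and vy else 0.0

def classify(doc_tokens, class_profiles, vocab):
    """Assign the document to the class whose profile correlates most with it."""
    v = freq_vector(doc_tokens, vocab)
    return max(class_profiles, key=lambda c: pearson(v, class_profiles[c]))
```

The idea matches the abstract's observation that texts of different domains have different correlation patterns, which is what makes automatic classification possible.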

Different estimation methods of the probabilities of sparse events for the computation of the entropy in large-scale modern Chinese text are applied in this paper. Experiments based on the corpus of four years' People's Daily show that the 0-order, 1-order and 2-order entropies are 9.62, 6.18 and 4.89 bits respectively. In addition, the influence of such factors as the scale of the corpus is also discussed.

本文在大规模语料的基础上，利用语言模型中稀疏事件的概率估计方法对汉语的熵进行计算，并讨论了语料规模等因素对熵的影响。在4年的人民日报的语料规模下，所求得的零阶熵、一阶熵、二阶熵分别为9.62、6.18和4.89比特。
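The 0-order and 1-order entropies reported above (9.62 and 6.18 bits for Chinese) can be sketched as follows. This is a minimal maximum-likelihood version; the sparse-event probability estimation methods the paper actually applies are not reproduced here.

```python
import math
from collections import Counter

def zero_order_entropy(text):
    """H0 = -sum p(c) * log2 p(c) over character unigram frequencies."""
    counts = Counter(text)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def first_order_entropy(text):
    """Conditional entropy H(c_i | c_{i-1}) estimated from bigram counts."""
    bigrams = Counter(zip(text, text[1:]))
    uni = Counter(text[:-1])
    n = sum(bigrams.values())
    h = 0.0
    for (a, b), c in bigrams.items():
        p_ab = c / n            # joint probability of the bigram
        p_b_given_a = c / uni[a]  # conditional probability
        h -= p_ab * math.log2(p_b_given_a)
    return h
```

Higher-order entropies fall because context constrains the next character, which is exactly the 9.62 → 6.18 → 4.89 bit decrease the abstract reports; on small texts the maximum-likelihood estimate badly underestimates entropy, which is why the paper needs sparse-event estimation on a large-scale corpus.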

Rationalism and empiricism are the two main research methods in the field of Natural Language Processing by Computers. The former must be based on an adequate formal research of the language, while the latter is based on the analysis and statistics of large scale corpus. Now, it is time to combine the empirical and rational methods to establish a Chinese Vocabulary-Grammar Computer Checking System.

在自然语言的计算机处理中有理性主义和经验主义两种主要的研究方法。前者必须建立在对自然语言充分的形式化研究的基础之上 ,后者则以对大规模语料的分析和统计为基础。现在有必要把经验主义与理性主义的研究方法结合起来 ,为早日建立一个汉语的词汇—语法计算机自动检索系统而努力

 
© 2008 CNKI-中国知网