重复网页 (duplicated web pages; repeated pages)
Related phrases
  repeated pages
     The Application of MD5 to Remove the Repeated Pages
     MD5算法在消除重复网页算法中的应用
     Search engines often return massive amounts of repeated page information to Internet users, resulting in low search efficiency.
     Internet用户通过常用搜索引擎获取Web信息时，往往得到了大量的重复网页信息，从而导致搜索效率不高。
     Given the maturity and portability of MD5, an algorithm based on MD5 is proposed to remove repeated pages. Experiments indicate that the algorithm is effective and that its time and space complexity is low, which shows the approach to be practicable and valid.
     本文利用MD5算法成熟及可移植性好的特点，提出了一种基于MD5的消除重复网页的算法，实验证明该算法能有效地去除重复网页，时间和空间的复杂度不高，具有较强的实用价值。
     The experiment shows that it can effectively remove the repeated pages returned by common search engines and can improve the information-retrieval efficiency of Internet users. It has good application prospects.
     实验证明该算法能有效地去除常用搜索引擎返回的重复网页，从而为Internet用户提高信息检索效率，具有较强的实用价值。
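The excerpts above describe using MD5 fingerprints to drop exact-duplicate pages from search results. Below is a minimal sketch of that general idea, assuming the page text has already been extracted; the whitespace normalization and all function names are illustrative choices of this sketch, not details from the cited paper.

```python
import hashlib

def md5_fingerprint(page_text: str) -> str:
    """Return the MD5 digest of the page text after light normalization.

    Collapsing whitespace is an assumption of this sketch, not part of the
    cited paper; any consistent normalization works as long as identical
    pages map to identical digests.
    """
    normalized = " ".join(page_text.split())
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()

def remove_repeated_pages(pages: list[str]) -> list[str]:
    """Keep only the first occurrence of each distinct page text."""
    seen: set[str] = set()
    unique_pages = []
    for text in pages:
        digest = md5_fingerprint(text)
        if digest not in seen:
            seen.add(digest)
            unique_pages.append(text)
    return unique_pages

if __name__ == "__main__":
    results = ["page A body", "page B body", "page  A   body"]  # last entry duplicates the first
    print(remove_repeated_pages(results))  # -> ['page A body', 'page B body']
```

Because an MD5 digest only matches byte-identical (post-normalization) content, this catches exact duplicates; the feature-based methods quoted further below are what handle near-duplicates.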
  duplicated web pages
     The recall rate of duplicated web pages reaches 97.3%, and the precision rate of removing duplicated web pages reaches 99.5% in large-scale testing.
     在大规模开放测试中重复网页召回率达97.3%,去重正确率达99.5%。
     The experiment results show that the algorithm is effective. The recall rate of duplicated web pages reaches 97.3%,and the precision rate of the duplication removal reaches 99.5% in large scale testing.
     实验结果表明该算法是有效的，大规模开放测试的重复网页召回率达97.3%，去重正确率达99.5%。
     Deletion of duplicated web pages has been one of the problems that need to be solved in information retrieval.
     去除重复网页一直是信息检索领域的一个待解决的问题。
     We apply this approach to the recognition of cross-language duplicated web pages.
     并将其运用到了跨语言的重复网页的识别上。
     One of them selects features based on word frequency, transforms the text into a feature string, and then uses the feature string to recognize duplicated web pages.
     基于高频词的网页查重算法根据特征的频率选择特征，组成特征串，来判别重复网页。
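The last excerpt above sketches a dedup algorithm that picks high-frequency words as features, joins them into a feature string, and compares feature strings to judge whether two pages are duplicates. A rough illustration of that idea follows, assuming whitespace tokenization and an arbitrary choice of the top 10 words; the cited paper's actual tokenizer, feature count, and ordering rules are not specified here.

```python
from collections import Counter

def feature_string(text: str, k: int = 10) -> str:
    """Build a fingerprint from the k most frequent tokens of the text.

    Sorting by (count, token) makes the result deterministic; the real
    algorithm's ordering and tokenizer may differ.
    """
    counts = Counter(text.lower().split())
    top = sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:k]
    return "|".join(token for token, _ in top)

def is_duplicate(text_a: str, text_b: str, k: int = 10) -> bool:
    """Treat two pages as duplicates when their feature strings match."""
    return feature_string(text_a, k) == feature_string(text_b, k)
```

Reprinted copies of an article tend to share the same dominant vocabulary even when surrounding navigation text differs, so their feature strings coincide, while unrelated pages rarely collide.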
  Bilingual examples in which 重复网页 is rendered by other terms
     Multi-threading, detection of duplicate content, spider traps, and the text repository are discussed in the context of page retrieval.
     在页面采集中分析了多线程、重复网页、采集器陷阱和网页的存储。
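The pair above only names the crawler-side concerns (multi-threading, duplicate content, spider traps, the text repository) without detail. As a loose, hypothetical illustration of two of them, the sketch below hashes page content before storing it in the repository and rejects implausibly deep URLs as a crude spider-trap guard; the helper names, the depth limit, and the heuristic itself are assumptions of this sketch, not the original work's design.

```python
import hashlib
from urllib.parse import urlparse

MAX_PATH_DEPTH = 8  # arbitrary guard against endlessly deep generated URLs

def looks_like_spider_trap(url: str) -> bool:
    """Very crude heuristic: reject URLs with an implausibly deep path."""
    path = urlparse(url).path
    return len([seg for seg in path.split("/") if seg]) > MAX_PATH_DEPTH

def store_if_new(page_text: str, seen_hashes: set[str], repository: list[str]) -> bool:
    """Append the page to the repository only if its content hash is unseen."""
    digest = hashlib.md5(page_text.encode("utf-8")).hexdigest()
    if digest in seen_hashes:
        return False
    seen_hashes.add(digest)
    repository.append(page_text)
    return True
```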

Many documents are being replicated across the World Wide Web. How to efficiently and accurately find the near-replicas of web pages has become an important topic in search engine research, and it can be used to improve the quality of the search service. In this paper, we propose five near-replica detection algorithms for search engines that rely on keyword matching, and evaluate them using the WebGather search engine system. In addition, we also compare our method with one of the most popular copy detection mechanisms. Our method has been successfully adopted to remove the near-replicas of web pages in WebGather, and it can also be widely used to build digital libraries.

当前在WWW上有众多的近似镜像web页面，如何快速准确地发现这些内容上相似的网页已经成为提高搜索引擎服务质量的关键技术之一。为基于关键词匹配的搜索引擎系统提出了5种近似镜像网页检测算法，并利用“天网”系统对这5种算法进行了实际评测。另外还将它们与现有的方法进行了对比分析。本文所论述的近似镜像检测算法已成功地被用于消除“天网”系统的重复网页，同时也可广泛应用于数字化图书馆的搭建。
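The abstract above does not spell out the five keyword-matching near-replica detection algorithms evaluated on WebGather (天网). As a generic stand-in for that family of methods, the following sketch represents each page by a set of keywords and flags two pages as near-replicas when their Jaccard overlap exceeds a threshold; the keyword extractor and the 0.8 threshold are assumptions, not values from the paper.

```python
def keyword_set(text: str, min_len: int = 2) -> set[str]:
    """Crude keyword extraction: lowercase tokens above a minimum length."""
    return {tok for tok in text.lower().split() if len(tok) >= min_len}

def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two keyword sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def near_replicas(text_a: str, text_b: str, threshold: float = 0.8) -> bool:
    """Flag two pages as near-replicas when their keyword overlap is high."""
    return jaccard(keyword_set(text_a), keyword_set(text_b)) >= threshold
```

Set overlap tolerates small edits such as dates, site names, or navigation text, which is why near-replica detection is usually preferred over exact hashing for reprinted pages.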


Reprinting of information between websites produces a great deal of redundant web pages, which not only waste storage resources but also place many burdens on users during retrieval and reading. In this paper, a feature-code-string based algorithm is developed to remove duplicated web pages after analyzing the characteristics of redundant web pages. The idea of fuzzy matching and information about the content and structure of the web page text are introduced into the algorithm, and the efficiency of the algorithm is optimized. The experimental results show that the algorithm is effective. The recall rate of duplicated web pages reaches 97.3%, and the precision rate of the duplication removal reaches 99.5% in large-scale testing.

网页检索结果中，用户经常会得到内容相同的冗余页面，其中大量是由于网站之间的转载造成。它们不但浪费了存储资源，并给用户的检索带来诸多不便。本文依据冗余网页的特点引入模糊匹配的思想，利用网页文本的内容、结构信息，提出了基于特征串的中文网页的快速去重算法，同时对算法进行了优化处理。实验结果表明该算法是有效的，大规模开放测试的重复网页召回率达97.3%，去重正确率达99.5%。

Search engines often return massive amounts of repeated page information to Internet users, resulting in low search efficiency. Considering the maturity and portability of MD5, an algorithm based on MD5 is proposed to remove the repeated pages. The experiment indicates that this algorithm is effective and that its time and space complexity is low. This shows that the approach is practicable and valid.

Internet用户通过常用搜索引擎获取Web信息时，往往得到了大量的重复网页信息，从而导致搜索效率不高。本文利用MD5算法成熟及可移植性好的特点，提出了一种基于MD5的消除重复网页的算法，实验证明该算法能有效地去除重复网页，时间和空间的复杂度不高，具有较强的实用价值。

 