Highly Efficient Architecture for Scalable Focused Crawling Using Incremental Parallel Web Crawler
Journal information |
thescipub.com |
Publication year |
2015 |
File format |
PDF |
Article code |
16708 |
Abstract:
With its growing industrial impact in recent years, data mining has
established itself as one of the most important disciplines in computer
science. On the fast-growing Web, locating precise and relevant
resources within a reasonable amount of time is a huge challenge for
all-purpose single-process crawlers, which creates demand for enhanced
and more convincing algorithms. Large-scale search engines update their
index frequently, yet they are often unable to present such information
in a timely manner. This study proposes scalable focused crawling with
an incremental parallel Web crawler, in which Web pages relevant to
multiple pre-defined topics can be crawled concurrently. Furthermore, to
solve the URL distribution issue, a compound decision model based on a
multi-objective decision-making method is introduced, which considers
multiple factors, such as load balance and relevance, together. The
update frequency issue is addressed by the local repository decision.
The results show that the proposed system efficiently produces
high-quality, relevant and fresh results with a significantly low
memory requirement.
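The abstract does not spell out the compound decision model, so the following is only a rough sketch of the general idea of distributing URLs by weighing relevance against load balance. It uses a plain weighted sum, which is one simple multi-objective technique and not necessarily the one used in the paper. The names `Crawler`, `relevance`, `assign_url`, the keyword-overlap relevance measure and the weights 0.7 and 0.3 are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Crawler:
    """Hypothetical crawler process dedicated to one pre-defined topic."""
    name: str
    topic_keywords: set
    capacity: int                      # assumed maximum queue length
    queue: list = field(default_factory=list)

    def load(self) -> float:
        # Fraction of the queue that is already in use.
        return len(self.queue) / self.capacity


def relevance(anchor_text: str, crawler: Crawler) -> float:
    """Assumed relevance measure: share of topic keywords found in the text."""
    if not crawler.topic_keywords:
        return 0.0
    words = set(anchor_text.lower().split())
    return len(words & crawler.topic_keywords) / len(crawler.topic_keywords)


def assign_url(url, anchor_text, crawlers, w_relevance=0.7, w_load=0.3):
    """Send the URL to the crawler with the best weighted-sum score.

    The score rewards topical relevance and spare capacity; the weights
    are illustrative and would need tuning in a real system.
    """
    def score(c: Crawler) -> float:
        spare = 1.0 - min(c.load(), 1.0)
        return w_relevance * relevance(anchor_text, c) + w_load * spare

    best = max(crawlers, key=score)
    best.queue.append(url)
    return best
```

With these assumed weights, a URL whose anchor text matches a crawler's topic goes to that crawler, while a URL that matches no topic goes to whichever crawler has the most spare capacity.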
Keywords:
Focused Crawler, Incremental Web Crawler, URL Distribution Issue, Load Balance, Relevance