Bot-trap

Common techniques used are:

There is no algorithm to detect all spider traps. Some classes of traps can be detected automatically, but new, unrecognized traps arise quickly.
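
As a rough illustration of detecting one class of trap automatically, the sketch below flags URLs that are unusually long, unusually deep, or that repeat the same path segment many times. The function name `looks_like_trap` and all three thresholds are assumptions chosen for this example, not values from any particular crawler, and a heuristic like this will miss traps that do not follow these patterns.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical thresholds, chosen only for illustration.
MAX_URL_LENGTH = 2000
MAX_PATH_DEPTH = 12
MAX_SEGMENT_REPEATS = 3


def looks_like_trap(url: str) -> bool:
    """Return True if the URL matches a simple trap heuristic."""
    # Very long URLs often come from endlessly appended parameters.
    if len(url) > MAX_URL_LENGTH:
        return True

    # Split the path into non-empty segments: /a/b/c -> ["a", "b", "c"].
    segments = [s for s in urlsplit(url).path.split("/") if s]

    # Deeply nested paths can indicate an endlessly growing directory tree.
    if len(segments) > MAX_PATH_DEPTH:
        return True

    # The same segment repeated many times (e.g. /foo/bar/foo/bar/...)
    # suggests a self-referencing link structure.
    counts = Counter(segments)
    return any(n > MAX_SEGMENT_REPEATS for n in counts.values())
```

A crawler could call such a check before queueing a newly discovered link and skip, or deprioritize, the links it flags.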

A spider trap causes a web crawler to enter something like an infinite loop, which wastes the spider's resources, lowers its productivity, and, in the case of a poorly written crawler, can crash the program. Polite spiders alternate requests between different hosts and do not request documents from the same server more than once every several seconds, so a "polite" web crawler is affected to a much lesser degree than an "impolite" one.
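
The politeness rule described above can be sketched as a small per-host scheduler. This is a minimal sketch under stated assumptions: the class name `PoliteScheduler`, the 5-second default delay, and the per-host page budget are all hypothetical. The page budget in particular is an addition not mentioned in the text; it is included only to show one way of bounding the time spent on a single host.

```python
import time
from urllib.parse import urlsplit


class PoliteScheduler:
    """Tracks when each host was last contacted (illustrative only)."""

    def __init__(self, min_delay=5.0, max_pages_per_host=1000,
                 clock=time.monotonic):
        self.min_delay = min_delay                    # seconds between hits on one host
        self.max_pages_per_host = max_pages_per_host  # assumed per-host budget
        self.clock = clock                            # injectable for testing
        self._last_request = {}                       # host -> time of last fetch
        self._pages_fetched = {}                      # host -> number of fetches

    def wait_time(self, url):
        """Seconds to wait before fetching url, or None if the host's budget is spent."""
        host = urlsplit(url).netloc
        if self._pages_fetched.get(host, 0) >= self.max_pages_per_host:
            return None
        last = self._last_request.get(host)
        if last is None:
            return 0.0  # host never contacted, fetch immediately
        return max(0.0, self.min_delay - (self.clock() - last))

    def record_fetch(self, url):
        """Note that url was just fetched."""
        host = urlsplit(url).netloc
        self._last_request[host] = self.clock()
        self._pages_fetched[host] = self._pages_fetched.get(host, 0) + 1
```

While one host is still inside its delay window, `wait_time` returns `0.0` for other hosts, so the crawler can alternate between hosts. A trap on one server then consumes at most one request every `min_delay` seconds rather than the crawler's whole capacity.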

In addition, sites with spider traps usually have a robots.txt file telling bots not to enter the trap, so a legitimate "polite" bot would not fall into it, whereas an "impolite" bot that disregards the robots.txt settings would be affected.
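
A polite bot's robots.txt check can be sketched with Python's standard `urllib.robotparser` module. The robots.txt content, the `/trap/` path, the bot name `ExampleBot`, and the helper `make_parser` are all made up for this example. The rules are parsed from a string here so that the sketch runs without network access; a real crawler would normally fetch the file from the site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that keeps all bots out of the trap directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /trap/
"""


def make_parser(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text already in memory."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = make_parser(ROBOTS_TXT)

# An ordinary page is allowed; anything under /trap/ is not.
allowed = parser.can_fetch("ExampleBot", "http://example.com/articles/1")
blocked = parser.can_fetch("ExampleBot", "http://example.com/trap/page?id=1")
```

A bot that calls `can_fetch` before every request never enters `/trap/`, while a bot that skips the check follows the trap's links like any others.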


