# /robots.txt file for http://www.waag.ch/

# have to allow Googlebot here since it will index
# but not visit the page. I don't want it to be
# indexed!
User-agent: Googlebot
User-agent: Ask Jeeves/Teoma
Disallow: /analog/
Disallow: /bots/
Disallow: /botsv/
Disallow: /cssfiles/
Disallow: /fpdb/
Disallow: /graphics/
Disallow: /htmlarea/
Disallow: /include/
Disallow: /intern/
Disallow: /scripts/
Disallow: /style/
Disallow: /webmin/
# Do not exclude legacy links kept for a search engine friendly transition
# Bot must walk the link in order to find out it's out-dated

User-agent: *
Disallow: /analog/
Disallow: /bots/
Disallow: /botsv/
Disallow: /cssfiles/
Disallow: /fpdb/
Disallow: /graphics/
Disallow: /galerie/
Disallow: /htmlarea/
Disallow: /include/
Disallow: /intern/
Disallow: /scripts/
Disallow: /style/
Disallow: /webmin/
# Exclude legacy links kept for a search engine friendly transition
Disallow: /aktuell/
Disallow: /cd/

# common excludes for bad bots
# They're banned via .htaccess, if they obey robots.txt they won't trigger an alarm
User-agent: Art-Online
Disallow: /

User-agent: CherryPicker
Disallow: /

User-agent: Crescent Internet ToolPak
Disallow: /

User-agent: EmailCollector
Disallow: /

User-agent: EmailSiphon
Disallow: /

User-agent: EmailWolf
Disallow: /

User-agent: ExtractorPro
Disallow: /

User-agent: HTTPClient
Disallow: /

User-agent: Indy Library
Disallow: /

User-agent: Java
Disallow: /

User-agent: Microsoft URL Control
Disallow: /

User-agent: NexaBot
Disallow: /

User-agent: Nutch
Disallow: /

User-agent: VoilaBot
Disallow: /

User-agent: wavepluz
Disallow: /