3 days and crawlers (?) don't leave my site
FrancescoV
07-31-2005, 04:53 AM
It all started about 3 days ago.
My VPS was almost shut down by a horde of search engine crawlers (I suppose).
load average: 2.64, 7.02, 21.01
The good thing is that my PR bumped from 4 to 5!
I have only one site on this VPS, www.netwargamingitalia.net. On the right side of the homepage I can see how many users are browsing my site. Usually this number is no higher than 7-8.
Well, in these last 3 days I always see a high number. In the last 2 days I always see 15-20 users online, but I know they are not really users.
Are they still search engine crawlers? 3 days of exploring seems too much!
Or should I worry about the integrity of my VPS?
Have you installed eAccelerator? That may help reduce the load as the pages will be cached.
FrancescoV
07-31-2005, 09:38 AM
No, I'm still planning which scripts to install. I'm wondering what the best set of scripts is (general optimizer, PHP cache script, MySQL cache script).
Anyway, the issue is not the load but the persistent presence of 12-20 users online on my site. In the past, search engine spiders finished their work on my site in just 1 day :(
You should really install eAccelerator, since it takes about 2 seconds to load a page. It seems you are using a premade script, so I don't know if you want to recode it with a MySQL cache. Adding JPcache will just mess up your user account scripts. On my site we did in-house programming and were able to use ADOdb as the database abstraction layer, which has built-in caching of SQL query results. With that and the help of eAccelerator, we have sub-0.00x generation times and load averages lower than 0.50 at peak hours.
FrancescoV
07-31-2005, 02:32 PM
Errr... well... my worry is not the load. I'm worried about this: are these 15-20 users really search engine crawlers? Or are they something malicious and dangerous?
@HVU: yes, I'm using a premade CMS. I know there are MySQL caching scripts that act on entire databases and don't need to be integrated into a particular script. I mean something like eAccelerator, but for MySQL caching.
If the load goes down... what exactly is there to worry about? If these are malicious users, the load would probably stay up.
FrancescoV
07-31-2005, 02:51 PM
I don't know, maybe this high number of users is evidence of malicious activity.
I usually have 6-7 users online, not 15-20.
What I asked is whether it is normal for crawlers to search my site for almost 4 days. If the answer is "no", then I have to think that these "users" are not crawlers but other "things".
A friend of mine mentioned a DoS attack.
ndndixie
07-31-2005, 03:10 PM
You have about 420 links on MSN. It could be multiple bot sessions or just curious users. Your stats should show who is there and when. AWStats lists the individual bots, when they were there, etc.
www.netwargamingitalia.net: Total 8,193 (Google 44, HotBot 44, MSN 545, Yahoo 7,560)
I don't know, maybe this high number of users is evidence of malicious activity.
I usually have 6-7 users online, not 15-20.
What I asked is whether it is normal for crawlers to search my site for almost 4 days. If the answer is "no", then I have to think that these "users" are not crawlers but other "things".
A friend of mine mentioned a DoS attack.
:rolleyes: Maybe you should try installing eAccelerator and see if the load goes down. Then, if the load is still high, you can look into it; but if the load isn't high, there should be no problem... and no DoS attack...
charles
07-31-2005, 08:39 PM
15-20 doesn't sound normal. But don't ask us. Look at your httpd logs and see who they are and what they are doing!
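As a rough sketch of what checking the logs could look like: the Python below counts requests per user agent and, for one chosen agent, per requested path. It assumes the server writes Apache's "combined" log format; the sample lines, the bot name ExampleBot and the paths are made up for illustration, and on a real server you would feed it the lines of your own access log instead.

```python
import re
from collections import Counter

# Tail of an Apache "combined" log line (assumed format):
#   "METHOD /path PROTO" status bytes "referer" "user-agent"
LINE_RE = re.compile(
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"\s*$'
)


def parse(lines):
    """Yield (path, user_agent) for every line that matches the pattern."""
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            yield match.group("path"), match.group("agent")


def count_agents(lines):
    """Count requests per user-agent string."""
    return Counter(agent for _, agent in parse(lines))


def count_paths(lines, agent_substring):
    """Count requested paths for agents whose string contains agent_substring."""
    return Counter(
        path for path, agent in parse(lines) if agent_substring in agent
    )


# Invented sample lines; replace with the lines of a real access log.
SAMPLE = [
    '192.0.2.10 - - [01/Aug/2005:04:00:01 +0200] "GET /members.php?page=1 HTTP/1.0" 200 5120 "-" "ExampleBot/1.0"',
    '192.0.2.10 - - [01/Aug/2005:04:00:03 +0200] "GET /members.php?page=1 HTTP/1.0" 200 5120 "-" "ExampleBot/1.0"',
    '192.0.2.10 - - [01/Aug/2005:04:00:05 +0200] "GET /members.php?page=2 HTTP/1.0" 200 5120 "-" "ExampleBot/1.0"',
    '198.51.100.7 - - [01/Aug/2005:04:00:06 +0200] "GET /index.php HTTP/1.1" 200 20480 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]

if __name__ == "__main__":
    # Busiest agents first, then the pages the suspicious one keeps hitting.
    for agent, hits in count_agents(SAMPLE).most_common():
        print(hits, agent)
    for path, hits in count_paths(SAMPLE, "ExampleBot").most_common():
        print(hits, path)
```

An agent that dominates the first list and keeps requesting the same few paths in the second is a likely candidate for a crawler stuck in a loop.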
mikelbeck
07-31-2005, 08:52 PM
Do you have AdSense on your pages? I've seen that once a user goes to a page that has AdSense on it, a Google AdSense spider will show up soon after.
FrancescoV
08-01-2005, 04:33 AM
Thanks a lot for your help.
I checked the website log, and it seems that some search engine spiders named Gigabot/2.0 are stuck on one section of my site (the "Member List" module). They continuously request pages related to the member list, apparently the same page over and over.
I added them to the ignore list in my robots.txt 20 minutes ago.
Edit: after 30 minutes, all OK... no more crawlers.
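For anyone wanting to check a rule like this before waiting for the bot to come back, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt text is an assumption (the exact lines added were not posted); it simply blocks Gigabot from the whole site and leaves every other agent unrestricted.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content, not the file actually used on the site:
# block Gigabot everywhere, allow every other agent everywhere.
ROBOTS_TXT = """\
User-agent: Gigabot
Disallow: /

User-agent: *
Disallow:
"""


def allowed(robots_text, agent, url):
    """Return True if robots_text lets the given agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    site = "http://www.netwargamingitalia.net/"
    for agent in ("Gigabot/2.0", "Googlebot"):
        print(agent, allowed(ROBOTS_TXT, agent, site))
```

This only shows what a well-behaved crawler should conclude from the file; it cannot tell you whether a particular bot actually obeys it.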
ndndixie
08-01-2005, 06:18 AM
I had to block that bot from my football forums. I sat and watched it one day; it would follow every link, which sent it around in a constant circle.
A lot of bots don't bother to check robots.txt... or they check it but don't respect it... some of them even visit only the "banned" sections listed in robots.txt ;)
Glad that adding the line to robots.txt solved your trouble...