timothyv: dev and doc are not loading
irc.tw.org as well
cool it's all clear now
***: timothyv1 has joined #tikiwiki-monitor
timothyv2 has joined #tikiwiki-monitor
timothyv2 has left
timothyv2 has joined #tikiwiki-monitor
timothyv has quit IRC (Read error: 110 (Connection timed out))
timothyv1 has quit IRC (Read error: 113 (No route to host))
timothyv has joined #tikiwiki-monitor
deepaks has joined #tikiwiki-monitor
timothyv2 has quit IRC (Read error: 110 (Connection timed out))
deepaks has quit IRC ("Leaving.")
mose: timothyv: here ?
I'm trying to diagnose why noc is getting down so often recently
***: timothyv has quit IRC (Read error: 113 (No route to host))
mose: and I was curious about why seine.avonsys.com gets 3.1Gb of traffic on dev.two and 5.7Gb on doc.two
it's the first host on the list
far beyond any other
***: srishti has joined #tikiwiki-monitor
mose: srishti: hi
srishti: hi
mose: I'm trying to diagnose why noc is getting down so often recently
and I was curious about why seine.avonsys.com gets 3.1Gb of traffic on dev.two and 5.7Gb on doc.two
far beyond any other host
so I go see apache logs
but I don't see special thing
many nagios checks
srishti: ok
mose: but maybe you could make check on something else than tiki-index.php and /features ?
srishti: yea ok
mose: I also see, but that's not related
a lot of 404 to archives.tikiwiki.org
from a browser
at regular times
archives.tikiwiki.org 113.20.89.106 - - [29/Jan/2010:00:11:32 +0100] "GET /tiki-index.php HTTP/1.1" 404 212 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)" "-"
every minute
srishti: this is strange
mose: yes that's also what I think
archives.tikiwiki.org 113.20.89.106 - - [29/Jan/2010:00:11:55 +0100] "GET /features HTTP/1.1" 404 206 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)" "-"
looks like nagios
but without nagios signature
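A minimal sketch of how the "looks like nagios, but without nagios signature" observation could be checked against the log. It assumes the field layout of the two lines quoted above (vhost, client IP, ident, user, time, request, status, size, referer, user agent) and assumes that genuine Nagios HTTP checks identify themselves with "nagios" or "check_http" in the user agent. The names `parse_line`, `classify` and `count_by_kind` are hypothetical helpers, not part of any tool mentioned in the channel.

```python
import re
from collections import Counter

# Assumed layout, taken from the two log lines quoted in the conversation:
# vhost ip ident user [time] "request" status size "referer" "user-agent" ...
LOG_RE = re.compile(
    r'^(?P<vhost>\S+) (?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return the named fields of one log line, or None if it does not match."""
    match = LOG_RE.match(line)
    return match.groupdict() if match else None


def classify(entry):
    """Label a parsed entry by its user agent (the 'signature')."""
    agent = entry["agent"].lower()
    # Assumption: real Nagios checks announce themselves in the user agent.
    if "nagios" in agent or "check_http" in agent:
        return "nagios"
    if "msie" in agent:
        return "msie"
    return "other"


def count_by_kind(lines):
    """Count parsed log lines per user-agent category."""
    counts = Counter()
    for line in lines:
        entry = parse_line(line)
        if entry is not None:
            counts[classify(entry)] += 1
    return counts


# The two lines quoted in the channel.
SAMPLE = [
    'archives.tikiwiki.org 113.20.89.106 - - [29/Jan/2010:00:11:32 +0100] '
    '"GET /tiki-index.php HTTP/1.1" 404 212 "-" "Mozilla/4.0 (compatible; '
    'MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)" "-"',
    'archives.tikiwiki.org 113.20.89.106 - - [29/Jan/2010:00:11:55 +0100] '
    '"GET /features HTTP/1.1" 404 206 "-" "Mozilla/4.0 (compatible; '
    'MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)" "-"',
]

if __name__ == "__main__":
    print(count_by_kind(SAMPLE))
```

Run over a full day's log file, a count like this would separate requests carrying a Nagios user agent from the MSIE ones that hit the same two pages.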
have a look on http://noc.tikiwiki.org/stats
srishti: looks like a false nagios check
mose: real nagios checks are every 3 minutes visibly
on 2 pages
on every host
the thing is, actually, as now we have monit + munin, http/apache checks could be removed from nagios
nagios check is still required to see if host pings, though
as well, on old server you used snmp, didn't you ?
srishti: yes
mose: what was it for ?
srishti: we didn't actually use snmp, just ping, and host_alive
mose: oh, snmp was setup on server, actually
so I guess that was just not used
srishti: yeah
mose: on the day of yesterday we had 1542 nagios checks, and 3796 MSIE false nagios checks
quite odd
srishti: really weird
***: timothyv1 has joined #tikiwiki-monitor
mose: btw what time is it for you right now ?
srishti: it's 7.40pm PDT
mose: oh, nice
for me it's 2:41 pm
and 7:41 am for changi
good coverage :)
srishti: :-)
mose: we could use a guy from canada or us, to fill it up
srishti: hehe
mose: anyway, I guess it could be wise to remove the apache monitor from nagios, now
that's probably not related to our current problems but that traffic is useless
srishti: ok sure
mose: do you want to get emails from monit/munin ?
srishti: yeah sure
mose: do you have an address for the monitoring staff ?
srishti: you may use noc@avonsys.com
mose: great
I add you
srishti: i am also in the above group
ok nagios is no longer doing http_check
mose: superb, thanks
I will also add noc@avonsys as recipient of noc@tw.o if you don't mind
srishti: yea sure
mose: so you actually get other alerts
like diffmon messages
srishti: ok
mose: and when people send alerts by mail
we also use that email to discuss sysadmin issues
srishti: great
mose: we really need to fix the current condition of the server, it falls often for unknown reason
we have an armada of tools already but they are failing to auto-fix everything
srishti: so what other plans do you have to get that fixed?
mose: we'll make plan when analysis will reveal the cause
we plan to dig out all the logs we have :)
well, logical move
I explored a bit but didn't see anything relevant yet
if you get time, you are welcome to join the exploration :)
srishti: ok sure
mose: we still get those requests from MSIE
have a look on tail -f /var/log/apache2/access.vcombined-2010.01.29 | grep 113.20.89.106
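For an offline look at the same thing the `tail -f ... | grep` pipeline shows live, here is a rough sketch that keeps only the lines from one client IP and measures the gap between consecutive hits. It assumes the client IP is the second space-separated field and that the timestamp sits in square brackets, as in the log lines quoted earlier. `hits_from` and `gaps_in_seconds` are hypothetical helper names. `%b` parsing of "Jan" assumes an English or C locale.

```python
from datetime import datetime

# Timestamp layout as it appears in the quoted lines: 29/Jan/2010:00:11:32 +0100
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"


def hits_from(lines, ip):
    """Yield the timestamp of every line whose client IP (2nd field) equals ip."""
    for line in lines:
        fields = line.split(" ", 2)
        if len(fields) < 3 or fields[1] != ip:
            continue
        start = line.find("[")
        end = line.find("]", start)
        if start == -1 or end == -1:
            continue  # no bracketed timestamp on this line
        yield datetime.strptime(line[start + 1:end], TIME_FORMAT)


def gaps_in_seconds(timestamps):
    """Return the seconds elapsed between consecutive timestamps, in time order."""
    ordered = sorted(timestamps)
    return [(later - earlier).total_seconds()
            for earlier, later in zip(ordered, ordered[1:])]


# The two lines quoted in the channel.
SAMPLE = [
    'archives.tikiwiki.org 113.20.89.106 - - [29/Jan/2010:00:11:32 +0100] '
    '"GET /tiki-index.php HTTP/1.1" 404 212 "-" "Mozilla/4.0 (compatible; '
    'MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)" "-"',
    'archives.tikiwiki.org 113.20.89.106 - - [29/Jan/2010:00:11:55 +0100] '
    '"GET /features HTTP/1.1" 404 206 "-" "Mozilla/4.0 (compatible; '
    'MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)" "-"',
]

if __name__ == "__main__":
    print(gaps_in_seconds(hits_from(SAMPLE, "113.20.89.106")))
```

Fed the whole access log instead of the two sample lines, the list of gaps would show whether the requests really arrive at regular one-minute intervals.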
there is a ghost somewhere ! :)
srishti: on the server yeah
mose: oh we still have nagios check on doc.tikiwiki.org (only)
srishti: ok lemme have a look
ok done
i mean you shouldn't be getting checks on doc now
changi: polom
hi tailers boy :)
srishti: hi changi
mose: heya changi :)
I'm a fanatic tailer
changi: mose: nice job with semaphore :)
mose: that was an easy one actually
error message was explicit
changi: mose: i'll have more time this WE, will inspect this damned apache log
mose: but that was actually the first time I bumped into such error
the thing is that sometimes apache gets ghosted
impossible to restart by normal way
changi: that's why i create the script in /usr/local/sbin called by monit
mose: maybe that's mod_bw side effect
combined with some other oddity
changi: don't think so, we have this problem before
mose: oh
then maybe apc
changi: it's a tiki problem
mose: well, whatever tiki problem apache should live or die
not get ghosted
changi: when it ghosted, it use 100% of one cpu
try to fetch information from mysql
mose: so that's the famous tracker curse
changi: on the old server, the problem was on the mysql server that couldn't answer more query
i think so
mose: damn this thing
it was a hack
it became a beast
changi: it's an infinite loop
and as we put mysql on socket connection, it's apache that crash :)
mose: poor indian
changi: ?
mose: apaches are indian
changi: lol
mose: linux proposes quite a wide range of imagination pretexts ;)
changi: will try some fcgi tuning
to avoid this problem
and try to find in log what query is done when apache crashed
maybe put php5-cgi in debug mode
have to work
see ya
***: rigieta has joined #tikiwiki-monitor
timothyv1 has left
changi|home has joined #tikiwiki-monitor
changi|home has left
changi|home has joined #tikiwiki-monitor
changi|home has left
srishti has quit IRC ("Leaving.")
rigieta has quit IRC ("Leaving.")
rupeni has joined #tikiwiki-monitor