
Who What When
timothyvhi changi
dev and doc are not loading
[01:07]
irc.tw.org as well [01:19]
........... (idle for 51mn)
cool its all clear now [02:10]
***timothyv1 has joined #tikiwiki-monitor [02:14]
timothyv2 has joined #tikiwiki-monitor
timothyv2 has left
timothyv2 has joined #tikiwiki-monitor
[02:22]
timothyv has quit IRC (Read error: 110 (Connection timed out)) [02:33]
timothyv1 has quit IRC (Read error: 113 (No route to host)) [02:42]
.................... (idle for 1h39mn)
timothyv has joined #tikiwiki-monitor [04:21]
deepaks has joined #tikiwiki-monitor [04:29]
timothyv2 has quit IRC (Read error: 110 (Connection timed out)) [04:40]
............. (idle for 1h0mn)
deepaks has quit IRC ("Leaving.") [05:40]
........ (idle for 37mn)
mosetimothyv: here ?
I'm trying to diagnose why noc is going down so often recently
[06:17]
***timothyv has quit IRC (Read error: 113 (No route to host)) [06:18]
moseand I was curious about why seine.avonsys.com gets 3.1Gb of traffic on dev.two and 5.7Gb on doc.two
it's the first host on the list
far beyond any other
[06:19]
***srishti has joined #tikiwiki-monitor [06:24]
mosesrishti: hi [06:24]
srishtihi [06:24]
moseI'm trying to diagnose why noc is going down so often recently
and I was curious about why seine.avonsys.com gets 3.1Gb of traffic on dev.two and 5.7Gb on doc.two
far beyond any other host
so I went to look at the apache logs
but I don't see anything special
many nagios checks
[06:25]
srishtiok [06:27]
mosebut maybe you could run the checks on something other than tiki-index.php and /features ? [06:27]
srishtiyea ok [06:28]
moseI also see, but that's not related
a lot of 404 to archives.tikiwiki.org
from a browser
at regular times
archives.tikiwiki.org 113.20.89.106 - - [29/Jan/2010:00:11:32 +0100] "GET /tiki-index.php HTTP/1.1" 404 212 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)" "-"
every minute
[06:28]
srishtithis is strange [06:29]
moseyes that's also what I think
archives.tikiwiki.org 113.20.89.106 - - [29/Jan/2010:00:11:55 +0100] "GET /features HTTP/1.1" 404 206 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)" "-"
looks like nagios
but without nagios signature
have a look at http://noc.tikiwiki.org/stats
[06:29]
srishtilooks like a false nagios check [06:31]
mosereal nagios checks apparently run every 3 minutes
on 2 pages
on every host
the thing is, actually, as now we have monit + munin, http/apache checks could be removed from nagios
nagios check is still required to see if host pings, though
as well, on old server you used snmp, didn't you ?
[06:32]
srishtiyes [06:34]
mosewhat was it for ? [06:34]
srishtiwe didnt actually use snmp, just ping, and host_alive [06:36]
moseoh, snmp was setup on server, actually
so I guess that was just not used
[06:36]
srishtiyeah [06:37]
moseyesterday we had 1542 nagios checks, and 3796 MSIE false nagios checks
quite odd
[06:39]
srishtireally weird [06:39]
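
The two counts mose quotes (1542 Nagios checks against 3796 MSIE look-alikes) are the kind of tally that can be produced from the access log with a few lines of code. What follows is a minimal Python sketch, not the command mose actually ran. It assumes the line layout of the two sample log lines quoted earlier, and it assumes that genuine Nagios checks carry the string "nagios" in their user agent, which the log does not confirm. `LINE_RE`, `CHECKED_PATHS`, `classify` and `count_checks` are hypothetical names.

```python
import re
from collections import Counter

# Assumed layout, inferred from the sample lines in the log:
# vhost, client IP, two "-" fields, [timestamp], "request", status, size,
# "referer", "user agent".
LINE_RE = re.compile(
    r'^(?P<vhost>\S+) (?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# The two pages the checks are said to hit.
CHECKED_PATHS = {"/tiki-index.php", "/features"}


def classify(line):
    """Label one log line as 'nagios', 'msie_lookalike' or 'other'."""
    match = LINE_RE.match(line)
    if match is None or match.group("path") not in CHECKED_PATHS:
        return "other"
    agent = match.group("agent").lower()
    if "nagios" in agent:  # assumption: real checks identify themselves
        return "nagios"
    if "msie" in agent:    # browser-like agent hitting the same two pages
        return "msie_lookalike"
    return "other"


def count_checks(lines):
    """Tally the labels over an iterable of log lines."""
    return Counter(classify(line) for line in lines)
```

Passing an open log file to `count_checks` would give one number per label, which can then be compared per day or per virtual host.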
***timothyv1 has joined #tikiwiki-monitor [06:40]
mosebtw what time is it for you right now ? [06:40]
srishtiits 7.40pm PDT [06:41]
moseoh, nice
for me it's 2:41 pm
and 7:41 am for changi
good coverage :)
[06:41]
srishti:-) [06:42]
mosewe could use a guy from canada or us, to fill it up [06:42]
srishtihehe [06:42]
moseanyway, I guess it could be wise to remove the apache monitor from nagios, now
that's probably not related to our current problems but that traffic is useless
[06:43]
srishtiok sure [06:44]
mosedo you want to get emails from monit/munin ? [06:49]
srishtiyeah sure [06:49]
mosedo you have an address for the monitoring staff ? [06:49]
srishtiyou may use noc@avonsys.com [06:49]
mosegreat
I add you
[06:49]
srishtii am also in the above group
ok nagios is no longer doing http_check
[06:50]
mosesuperb, thanks
I will also add noc@avonsys as recipient of noc@tw.o if you don't mind
[06:52]
srishtiyea sure [06:55]
moseso you actually get other alerts
like diffmon messages
[06:55]
srishtiok [06:55]
moseand when people send alerts by mail
we also use that email to discuss sysadmin issues
[06:55]
srishtigreat [06:57]
mosewe really need to fix the current condition of the server, it falls often for unknown reason
we have an armada of tools already but they are failing to auto-fix everything
[07:01]
srishtiso what other plans do you have to get that fixed? [07:04]
mosewe'll make a plan once the analysis reveals the cause
we plan to dig out all the logs we have :)
well, logical move
I explored a bit but didn't see anything relevant yet
if you get time, you are welcome to join the exploration :)
[07:04]
srishtiok sure [07:07]
mosewe still get those requests from MSIE
have a look at tail -f /var/log/apache2/access.vcombined-2010.01.29 | grep 113.20.89.106
there is a ghost somewhere ! :)
[07:16]
srishtion the server yeah [07:18]
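
The `tail -f ... | grep 113.20.89.106` command above shows the requests as they arrive. To check the earlier impression that they come "at regular times", the gaps between hits can be measured offline. This is a rough Python sketch under the same assumption about the log layout (client IP as the second field, timestamp in square brackets); `hit_times` and `gaps_in_seconds` are hypothetical helpers, and month names are assumed to be parsed under an English or C locale.

```python
import re
from datetime import datetime

# The timestamp is the first bracketed field, e.g. [29/Jan/2010:00:11:32 +0100]
TIME_RE = re.compile(r'\[([^\]]+)\]')


def hit_times(lines, ip):
    """Return the timestamps of every log line whose second field is `ip`."""
    times = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2 or fields[1] != ip:
            continue
        match = TIME_RE.search(line)
        if match:
            times.append(
                datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
            )
    return times


def gaps_in_seconds(times):
    """Return the delay in seconds between consecutive hits, oldest first."""
    ordered = sorted(times)
    return [(later - earlier).total_seconds()
            for earlier, later in zip(ordered, ordered[1:])]
```

A near-constant list of gaps would point to a scheduled job or a monitoring tool rather than a person with a browser.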
moseoh we still have nagios check on doc.tikiwiki.org (only) [07:25]
srishtiok lemme have a look [07:25]
ok done
i mean you shouldn't be getting checks on doc now
[07:32]
.............. (idle for 1h8mn)
changipolom
hi tailers boy :)
[08:42]
srishtihi changi [08:44]
..... (idle for 21mn)
moseheya changi :)
I'm a fanatic tailer
[09:05]
changimose: nice job with semaphore :) [09:06]
mosethat was an easy one actually
error message was explicit
[09:07]
changimose: i'll have more time this WE, will inspect this damned apache log [09:07]
mosebut that was actually the first time I bumped into such error
the thing is that sometimes apache gets ghosted
impossible to restart the normal way
[09:07]
changithat's why i created the script in /usr/local/sbin called by monit [09:08]
mosemaybe that's mod_bw side effect
combined with some other oddity
[09:08]
changidon't think so, we had this problem before [09:09]
moseoh
then maybe apc
[09:09]
changiit's a tiki problem [09:09]
mosewell, whatever the tiki problem, apache should live or die
not get ghosted
[09:09]
changiwhen it's ghosted, it uses 100% of one cpu
trying to fetch information from mysql
[09:09]
moseso that's the famous tracker curse [09:10]
changion the old server, the problem was on the mysql server that couldn't answer any more queries
i think so
[09:10]
mosedamn this thing
it was a hack
it became a beast
[09:10]
changiit's an infinite loop
and as we put mysql on socket connection, it's apache that crashes :)
[09:11]
mosepoor indian [09:12]
changi? [09:12]
moseapaches are indian [09:12]
changilol [09:12]
moselinux offers quite a wide range of pretexts for imagination ;) [09:12]
changiwill try some fcgi tuning
to avoid this problem
and try to find in the log what query is done when apache crashed
maybe put php5-cgi in debug mode
have to work
see ya
[09:13]
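
changi's description of a ghosted Apache (one process stuck at 100% of one CPU) suggests a simple detection rule that a watchdog could apply. The Python sketch below illustrates such a rule only. It is not the script in /usr/local/sbin that monit calls, whose contents the log does not show. The `threshold` and `min_consecutive` values are hypothetical, and collecting the per-process CPU readings is left out.

```python
def find_ghosted(samples, threshold=95.0, min_consecutive=3):
    """Return the sorted pids whose latest readings all stay at or above threshold.

    samples maps a pid to its CPU percentage readings, oldest first.
    A pid is reported only if it has at least `min_consecutive` readings
    and every one of the most recent `min_consecutive` is >= `threshold`.
    """
    ghosted = []
    for pid, readings in samples.items():
        recent = readings[-min_consecutive:]
        if len(recent) == min_consecutive and all(
            value >= threshold for value in recent
        ):
            ghosted.append(pid)
    return sorted(ghosted)
```

Requiring several consecutive high readings avoids flagging a process that is merely busy for a moment, which matters if the reaction is to kill and restart it.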
..................... (idle for 1h43mn)
***rigieta has joined #tikiwiki-monitor [10:58]
timothyv1 has left [11:08]
............................................................................................ (idle for 7h37mn)
changi|home has joined #tikiwiki-monitor
changi|home has left
changi|home has joined #tikiwiki-monitor
changi|home has left
[18:45]
................................ (idle for 2h38mn)
srishti has quit IRC ("Leaving.") [21:24]
............................ (idle for 2h18mn)
rigieta has quit IRC ("Leaving.") [23:42]
rupeni has joined #tikiwiki-monitor [23:47]
