Tricks for getting through system operations monitoring on my own

This is @kei75, in charge of day 3 of the Aucfan Advent Calendar 2018.

Until now I have only written memos, scribbles, and notes to self, so this is my first time writing this kind of article.

I would appreciate it if you read it in a relaxed mood, as if idly eating mandarin oranges at the kotatsu.

By the way, we have no kotatsu at home, so we eat our oranges wrapped up in a futon instead.

Self-introduction

Since joining Netprice in 2007, I have dabbled in many things: corporate IT, infrastructure, and development.

In October 2018 I was transferred to Aucfan, but what I do is the same as ever (laughs)

I consider myself an engineer who is good at the relatively low layers, but since I am a "broad but shallow" engineer, I am no match for those who specialize.

What is system operations monitoring?

One of my main duties is "system operations monitoring".

A system is not finished once it is built; what matters is keeping it running.

Servers and networks basically keep running without incident, but sometimes trouble does occur.

Detecting that trouble quickly and resolving it is what "system operations monitoring" is about.

Our operations monitoring environment

First, let me introduce the operations monitoring environment we use at our company.

  • Nagios
    https://www.nagios.org/

    Needless to say, the classic among classic monitoring tools.

    A decade ago it was the standard monitoring tool, but by now it may have slipped into the legacy category....

    We have used Nagios for more than 10 years, and because it is such a mature technology, its stability is outstanding.

    One small drawback is that the configuration files are hard to write.

  • ZABBIX
    https://www.zabbix.com/

    A network and operations monitoring tool that has become the standard in relatively recent years.

    There are Japanese documents and user meetups, so information sources are plentiful.

    The GUI is solidly built, so I think it is easy to use (though that depends on the person).

    It offers a wide range of features so that it can also be applied to large-scale environments.

  • Cacti
    https://www.cacti.net/

    Wikipedia describes it as "open-source, web-based software for network monitoring and graph generation, designed as a front-end application for RRDtool."

    Rather than a monitoring tool, it is more of a tool for graphing the status of resources.

    We use it to graph the traffic volume of network equipment.

  • SmokePing
    https://oss.oetiker.ch/smokeping/

    As the name suggests, it is a tool that measures network latency with ping and graphs it.

    The name is said to come from the fact that the graph looks like rising smoke.

    Monitoring latency is quite important in network monitoring, so it is an unglamorous but essential tool.

Using many tools for operations monitoring is not recommended, because it adds effort not only during normal operation but also every time a setting changes.

I have tried many tools over the years, and this lineup is the result of putting each one in the place it fits best....

Once you get used to it, you can operate it all in your own way (laughs)

Monitoring notifications

These monitoring tools check the configured items at regular intervals. When a result exceeds the configured range, or the expected response does not come back, they treat it as "abnormal" and can notify the administrator by the configured method.
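The check logic described here can be sketched roughly as follows. This is only an illustration, not how Nagios or ZABBIX implement it; the function name `classifyCheck` and its parameters are hypothetical.

```php
<?php
// Hypothetical helper: classify a single check result.
// $value is null when no response came back (for example, a timeout).
function classifyCheck(?float $value, float $min, float $max): string
{
    if ($value === null) {
        return 'abnormal'; // the expected response did not come back
    }
    if ($value < $min || $value > $max) {
        return 'abnormal'; // the result is outside the configured range
    }
    return 'normal';
}

// Usage example: a CPU usage check with an allowed range of 0 to 90 percent.
echo classifyCheck(42.0, 0.0, 90.0), PHP_EOL;
echo classifyCheck(95.5, 0.0, 90.0), PHP_EOL;
echo classifyCheck(null, 0.0, 90.0), PHP_EOL;
```

A real monitoring tool adds retries, check intervals, and notification settings on top of this basic decision.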

E-mail is probably the most commonly used notification method for monitoring tools.

Our company is no exception and uses e-mail notification as well.

Notification e-mails go not only to a PC address but also to a mobile address, so that they can be received even when I am not in front of the PC.

In recent years we have used Slack (https://slack.com/) as the company-wide communication tool, so we send the same notifications to a Slack channel as well.

Slack has a feature called Incoming Webhooks that lets you post to a channel by hitting a URL with curl or the like, which is very convenient.
https://api.slack.com/incoming-webhooks
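As a rough sketch, posting an alert to an Incoming Webhook from PHP could look like the following. The function names, the message format, and the webhook URL are assumptions for illustration, not our actual notification script. The sketch relies on PHP's curl extension.

```php
<?php
// Hypothetical helper: build the JSON body for an Incoming Webhook post.
function buildSlackPayload(string $host, string $service, string $state): string
{
    $text = sprintf('[%s] %s / %s', $state, $host, $service);
    return json_encode(['text' => $text], JSON_UNESCAPED_UNICODE | JSON_UNESCAPED_SLASHES);
}

// Hypothetical helper: POST the payload to the webhook URL.
// Returns true when Slack answers with HTTP 200.
function postToSlack(string $webhookUrl, string $payload): bool
{
    $ch = curl_init($webhookUrl);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_HTTPHEADER     => ['Content-Type: application/json'],
        CURLOPT_POSTFIELDS     => $payload,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => 5,
    ]);
    curl_exec($ch);
    return curl_getinfo($ch, CURLINFO_HTTP_CODE) === 200;
}

// Usage (the URL is a placeholder; use the one Slack issues for your channel):
// $payload = buildSlackPayload('web01', 'HTTP', 'CRITICAL');
// postToSlack('https://hooks.slack.com/services/XXXX/YYYY/ZZZZ', $payload);
```

Keeping the payload builder separate from the HTTP call makes the message format easy to check without touching the network.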

Getting used to monitoring notifications

When I first got involved in this work, I would notice and respond immediately when a notification e-mail arrived on my mobile phone (a feature phone), even while asleep. But "getting used to things" is scary: I stopped waking up for the odd notification....

Team members have dwindled and I have been doing operations monitoring alone for the past few years, so when trouble occurred in the middle of the night, the situation "notification not noticed = no response" began to arise.

Thinking that this could not go on, I looked for a notification mechanism other than e-mail that would reliably wake me up.

Introducing phone notifications

What I came up with was to introduce a mechanism that notifies by phone call rather than by e-mail.

Convenient services already exist for this, and using one of them is an option.

Reactio
https://reactio.jp/

However, our monitoring environment was already solidly in operation and all we wanted was phone notification, so Reactio was somewhat over-spec for us (sweat)

So we decided to use Twilio, a service I had had my eye on for some time.

Twilio
https://twilio.kddi-web.com/

Twilio is a service that provides communications APIs centered on phone calls and SMS. In Japan, the service is provided by KDDI Web Communications Inc.

I will skip an introduction to Twilio itself, since there are plenty of other articles on Qiita and elsewhere. With Twilio, you can easily place phone calls and send SMS from a system. (Usage fees apply, of course...)

Twilio phone notification script

Twilio provides SDKs for various languages, so you can place a phone call easily just by installing the SDK.

We prepared a script like this one and integrated it into Nagios.

(It is heavily abridged for Qiita, lol)

<?php
require __DIR__ . '/Twilio/autoload.php';
use Twilio\Rest\Client;

// Twilio settings
$AccountSid = '(your Account SID)';
$AuthToken  = '(your Auth Token)';
$MyNumber   = '+815099999999';

// Generate the TwiML
// ---------------------------------------------------------------
$twiml = <<<_XML
<Response>
<Say language="ja-JP" loop="2">critical alert detection</Say>
</Response>
_XML;
// ---------------------------------------------------------------

// URL-encode the TwiML and append it to the Twimlet echo URL
$twimlet = 'http://twimlets.com/echo?Twiml=' . urlencode(str_replace("\n", '', $twiml));

// Create a client instance
$client = new Client($AccountSid, $AuthToken);

// Destination phone number
$CallNumber = '+819099999999';

// Place the call through the Calls API
$call = $client->calls->create(
    $CallNumber, $MyNumber,
    array(
        'url' => $twimlet,
        'timeout' => 8
    )
);

For fine-grained control with Twilio, you need to set up your own web server to receive callbacks from Twilio, but if you only need outbound calls, you can implement it simply by using Twimlets, a service Twilio provides.

With these, we run this PHP script when specific monitoring alerts occur, so that the phone rings.
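One way to limit the call to specific alerts is a small gate at the top of the script. The sketch below is an assumption for illustration: the rule (call only for CRITICAL problem notifications), the function name `shouldCall`, and the use of command-line arguments are all hypothetical. The values are assumed to be filled in from the Nagios macros $NOTIFICATIONTYPE$ and $SERVICESTATE$ in the command definition.

```php
<?php
// Hypothetical gate: ring the phone only for CRITICAL problem notifications.
// The rule is an assumption; adjust it to the alerts that should wake you up.
function shouldCall(string $notificationType, string $serviceState): bool
{
    return $notificationType === 'PROBLEM' && $serviceState === 'CRITICAL';
}

// Assumed invocation: php call.php "$NOTIFICATIONTYPE$" "$SERVICESTATE$"
$type  = $argv[1] ?? '';
$state = $argv[2] ?? '';

if (shouldCall($type, $state)) {
    // Place the Twilio call here, as in the script above.
    echo "calling\n";
} else {
    echo "skipped\n";
}
```

Filtering inside the script keeps the Nagios side simple, at the cost of starting PHP for every notification.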

We have kept this up for more than a year, and this phone call has saved us many times.

It is now an indispensable tool (laughs)

In closing

Much of what we do is homegrown like this, but if you are the spirited type who thinks "there must be a better way!", come and work with us!

https://aucfan.co.jp/recruit/

Some of the information there is a little old, though....

Thank you!