v1.0.0 release

add GPU table page. add timestamp format. update readme for v1.0.0. fix minor bug.
a-maumau · Mar 24, 2019 · 7b37002
2 parents aef2065 + a6e01d4
commit 7b37002
Showing 18 changed files with 543 additions and 258 deletions.
diff --git a/README.md b/README.md
@@ -6,9 +6,15 @@ This script is depending on `Python3`, and `nvidia-smi`, `awk`, `ps` commands.
 pip install -r requirements.txt
 ```  
 If there is a missing package, please install by yourself using pip.  
-also you need setup vesta/settings.py for your environment.   
+You might also need to set up some configuration for your own environment.  
+
+# Configuration
+Both `gpu_status_server.py` and `gpu_info_sender.py` accept many command-line options, and you can also override the arguments with a .yaml file.  
+Use `-h` to see all arguments, and pass the `--local_settings_yaml_path` option to override them from a .yaml file.  
+An example is in `examples/local_settings.yaml`.  
 
 # Usage
+You can use simple wrapper,  
 for Server  
 ```
 python gpu_status_server.py
@@ -19,17 +25,27 @@ for Nodes
 python gpu_info_sender.py
 ```  
 
-For automatical process, using systemd and crontab will do the works.  
+For automation, using systemd and crontab will do the work.  
 
 ## from Terminal
 To get GPU information from terminal app, use curl and access `http://<server_address>/?term=true`.  
 You will get like  
 ```
 $ curl "http://0.0.0.0:8080/?term=true"
++------------------------------------------------------------------------------+
+| vesta ver. 1.0.0                                                   gpu info. |
 +------------------+------------------------+-----------------+--------+-------+
 | host             | gpu                    | memory usage    | volat. | temp. |
 +------------------+------------------------+-----------------+--------+-------+
-|host1             | 0:GeForce GTX 1080 Ti  |    235 /  11169 |     0 %|  36 °C|
+|mau_local         | 0:GeForce GTX 1080 Ti  |   8018 /  11169 |     0 %|  80 °C|
+|                  | 1:GeForce GTX 1080 Ti  |      2 /  11172 |     0 %|  38 °C|
++------------------+------------------------+-----------------+--------+-------+
+|mau_local_11a7c5eb| 0:GeForce GTX 1080 Ti  |   8018 /  11169 |    92 %|  80 °C|
+|                  | 1:GeForce GTX 1080 Ti  |      2 /  11172 |     0 %|  38 °C|
++------------------+------------------------+-----------------+--------+-------+
+|mau_local_ac993634| 0:GeForce GTX 1080 Ti  |   8018 /  11169 |    92 %|  80 °C|
+|                  | 1:GeForce GTX 1080 Ti  |      2 /  11172 |     0 %|  38 °C|
+|                  | 1:GeForce GTX 1080 Ti  |      2 /  11172 |     0 %|  38 °C|
 |                  | 1:GeForce GTX 1080 Ti  |      2 /  11172 |     0 %|  38 °C|
 +------------------+------------------------+-----------------+--------+-------+
 ```
@@ -38,25 +54,25 @@ If you want to see detail information you can use `detail` option like `http://<
 You will get like  
 ```
 $ curl "http://0.0.0.0:8080/?term=true&detail=true"
+vesta ver. 1.0.0
 
-### host1 :: 127.0.0.1 #########################################################
-  last update: 2018/12/03 23:16:59
+#### mau_local_19e5d26c :: 127.0.0.1 ###########################################
+  last update: 24/03/2019 20:27:10
 --------------------------------------------------------------------------------
-  ┌[ gpu:0 GeForce GTX 1080 Ti 2018/12/01 14:32:37.140 ]─────────────────────┐
+  ┌[ gpu:0 GeForce GTX 1080 Ti 2019/03/24 20:00:00.000 ]─────────────────────┐
   │      memory used  memory available  gpu volatile  temperature            │
-  │   235 / 11169MiB          10934MiB            0%         36°C            │
+  │  8018 / 11169MiB           3151MiB           92%         80°C            │
   │                                                                          │
-  │ mem [/                                                            ]   2% │
-  │  ├── /usr/bin/X                   148MiB                                 │
-  │  └── compiz                        84MiB                                 │
+  │ mem [///////////////////////////////////////////                  ]  71% │
+  │  ├── train1                      6400MiB root                            │
+  │  └── train2                      1618MiB user1                           │
   └──────────────────────────────────────────────────────────────────────────┘
 
-  ┌[ gpu:1 GeForce GTX 1080 Ti 2018/12/01 14:32:37.141 ]─────────────────────┐
+  ┌[ gpu:1 GeForce GTX 1080 Ti 2019/03/24 20:00:00.000 ]─────────────────────┐
   │      memory used  memory available  gpu volatile  temperature            │
   │     2 / 11172MiB          11170MiB            0%         38°C            │
   │                                                                          │
   │ mem [                                                             ]   0% │
-  │  └── /usr/bin/X                   148MiB                                 │
   └──────────────────────────────────────────────────────────────────────────┘
 
 ________________________________________________________________________________
@@ -72,7 +88,7 @@ Just access `http://<server_address>/`
 You will get like  
 ![sample web broser image](imgs/browser_sample_resized.png "sample")
 
-# Response
+# API Response 
 User can get the information of GPU by accessing `http://<server_address>/states/`.  
 Json response is like
 ```
@@ -85,28 +101,29 @@ Json response is like
                 {   # each GPU will be denote by "gpu:<device_num>"
                     'gpu_data':{
                         'gpu:0':{'available_memory': '10934',
-                        'device_num': '0',
+                            'device_num': 0,
                             'gpu_name': 'GeForce GTX 1080 Ti',
-                            'gpu_volatile': '0',
+                            'gpu_volatile': 92,
                             'processes': [
-                                {
-                                    'name': '/usr/bin/X',
-                                    'pid': '1963',
-                                    'used_memory': '148',
+                                  {
+                                    'name': 'train1',
+                                    'pid': "31415",
+                                    'used_memory': 6400,
                                     'user': 'root'
-                                },
-                                {
-                                    'name': 'compiz',
-                                    'pid': '3437',
-                                    'used_memory': '84',
+                                  },
+                                  {
+                                    'name': 'train2',
+                                    'pid': "27182",
+                                    'used_memory': 1618,
                                     'user': 'user1'
-                                }
+                                  }
                             ],
-                            'temperature': '36',
-                            'timestamp': '2018/11/30 23:29:47.115',
-                            'total_memory': '11169',
-                            'used_memory': '235',
-                            'uuid': 'GPU-...'},
+                            'temperature': 80,
+                            'timestamp': '2019/03/24 20:00:00.000',
+                            'total_memory': 11169,
+                            'used_memory': 8018,
+                            'uuid': 'GPU-...'
+                        },
                         'gpu:1':{
                             'available_memory': '11170',
                             'device_num': '1',
@@ -128,6 +145,17 @@ Json response is like
 }
 ```
 
+# Slack Notification
+If you configure a Slack webhook and bot token, you can receive notifications via Slack.  
+## up and down
+![sample notification image](imgs/noti_up_down_sample_resized.png "notificate_up_and_down")  
+
+## interact with bot
+![sample interact image](imgs/bot_interact_sample_resized.png "bot_interact")  
+
+To specify the Slack settings, pass `--slack_webhook`, `--slack_bot_token`, and `--slack_bot_post_channel` to `gpu_status_server.py`.  
+Or you can use a .yaml file; see `examples/local_settings.yaml`.  
+
 # Topology
 Topology is very simple, Master (server) and Slave (each local machine) style, but it is ad hoc.  
 Server is only waiting the slaves to post the gpu information.  
@@ -161,4 +189,4 @@ Table field is
 | timestamp_n | data_n |
 
 `timestamp` is based on server time zone and the style is "YYYYMMDDhhmmss".  
-`data` is a Python dict object while it is serialized and compressed by Python bz2.  
+`data` is a Python dict object that is serialized with Python pickle and then compressed with bz2.  
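
The README change above says the stored `data` field is pickled and bz2-compressed. A minimal sketch of that round trip, assuming the order is pickle first and bz2 second; the helper names `pack_record` and `unpack_record` are hypothetical and do not come from the vesta source:

```python
import bz2
import pickle


def pack_record(data: dict) -> bytes:
    """Serialize a dict with pickle, then compress the resulting bytes with bz2."""
    return bz2.compress(pickle.dumps(data))


def unpack_record(blob: bytes) -> dict:
    """Reverse of pack_record: decompress with bz2, then unpickle."""
    # Only unpickle blobs you wrote yourself; pickle can run code while loading.
    return pickle.loads(bz2.decompress(blob))


# Sample values borrowed from the README example in this commit.
sample = {"gpu:0": {"used_memory": 8018, "total_memory": 11169, "temperature": 80}}
blob = pack_record(sample)
restored = unpack_record(blob)
```

Because the blob is a pickle, reading the database outside the server would need the same two steps in reverse, and only on data from a trusted source.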
diff --git a/examples/local_settings.yaml b/examples/local_settings.yaml
@@ -1,15 +1,4 @@
-# in sec.
-WS_RECEIVE_TIMEOUT: 1
-
-# in sec.
-SLACK_BOT_SLEEP_TIME: 1
-
-# at least interval time for saving data (in sec.)
-# if you want to save all data, set this to 0
-SAVE_INTERVAL: 60
-
-# sort type, "ip" or "name"
-SORT_BY: "ip"
+# every key must be in capital letters
 
 # this ip address is the server address for the client which send the gpu information
 IP: "127.0.0.1"
@@ -23,8 +12,12 @@ TOKEN: '0000'
 # how many information to read in each page
 PAGE_PER_HOST_NUM: 8
 
-PAGE_TITLE: "AWSOME GPUs"
-PAGE_DESCRIPTION: "awsome description"
+MAIN_PAGE_TITLE: "AWSOME GPUs"
+MAIN_PAGE_DESCRIPTION: "awsome description"
+TABLE_PAGE_TITLE: "AWSOME Table"
+TABLE_PAGE_DESCRIPTION: "awsome description"
+
+TIMESTAMP_FORMAT: "DMY"
 
 # you can notificate at slack
 SLACK_WEBHOOK: "https://hooks.slack.com/services/<your web hook>"
@@ -37,7 +30,7 @@ SLACK_BOT_POST_CHANNEL: "your channel"
 # it will be used in re.search, so you can use regular expression
 VALID_NETWORK: "127.0.0.1"
 
-# you can use python schedule module to schedule the announcement of somethin
+# you can use python schedule module to schedule the announcement of something
 # in this case, it will use the function self.send_hosts_statuses for every day at00:00 
 # must be a array
 SCHEDULE_FUNCTION:
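
The new `TIMESTAMP_FORMAT` key takes one of `YMD`, `MDY`, or `DMY`. How vesta turns that value into a string is not shown in this diff, so the following is only a sketch: the `DATE_PATTERNS` table and `format_timestamp` helper are hypothetical, and the patterns are inferred from the README sample (`24/03/2019 20:27:10` for `DMY`) and the `--timestamp_format` help text (`MM/DD/YYYY` for `MDY`).

```python
from datetime import datetime

# Hypothetical mapping from the TIMESTAMP_FORMAT value to a strftime date pattern.
DATE_PATTERNS = {
    "YMD": "%Y/%m/%d",
    "MDY": "%m/%d/%Y",
    "DMY": "%d/%m/%Y",
}


def format_timestamp(moment: datetime, timestamp_format: str = "MDY") -> str:
    """Format `moment` as '<date> hh:mm:ss' using the chosen date order."""
    try:
        date_pattern = DATE_PATTERNS[timestamp_format]
    except KeyError:
        raise ValueError(f"unknown timestamp format: {timestamp_format!r}") from None
    return moment.strftime(date_pattern + " %H:%M:%S")


last_update = datetime(2019, 3, 24, 20, 27, 10)
print(format_timestamp(last_update, "DMY"))  # 24/03/2019 20:27:10
```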

diff --git a/gpu_info_sender.py b/gpu_info_sender.py
@@ -25,7 +25,7 @@
     if settings.local_settings_yaml_path is not None:
         try:
             with open(settings.local_settings_yaml_path, "r") as yaml_file:
-                yaml_data = yaml.load(yaml_file)
+                yaml_data = yaml.safe_load(yaml_file)
         except Exception as e:
             print(e)
             yaml_data = []

diff --git a/gpu_status_server.py b/gpu_status_server.py
@@ -13,17 +13,30 @@
                         help='yaml file path which overwrite the contents args.')
 
     # args for server
-    parser.add_argument('--db_name', dest='DB_NAME', type=str, default="gpu_states.db", help='database name.')
-    parser.add_argument('--db_dir', dest='DB_DIR', type=str, default="data", help='dir of database.')
+    parser.add_argument('--ip', dest='IP', type=str, default="127.0.0.1",
+                        help='this ip address is the server address for the client which send the gpu information.\nit is mainly for a machine which is sending the data to server.')
+    parser.add_argument('--port_num', dest='PORT_NUM', type=int, default=8080, help="server's open port.")
+    parser.add_argument('--token', dest='TOKEN', type=str, default="0000",
+                        help="url parameter token for posting data.\nwhatever you want, actually it's doing nothing now. it is only for preventing accidental posting.")
     parser.add_argument('--server_name', dest='SERVER_NAME', type=str, default="gpu_monitor", help='')
     parser.add_argument('--bind_host', dest='BIND_HOST', type=str, default="0.0.0.0",
                         help='bind host IP address.\nthis should be 0.0.0.0.\nif you want to filter IP addresses, use `--valid_network`.')
 
-    # ssl settings certfile, keyfile=None, password
-    parser.add_argument('--ssl_cert', dest='SSL_CERT', type=str, default=None, help='path of ssl certificate file.')
-    parser.add_argument('--ssl_key', dest='SSL_KEY', type=str, default=None, help='path for ssl key file.')
+    parser.add_argument('--db_name', dest='DB_NAME', type=str, default="gpu_states.db", help='database name.')
+    parser.add_argument('--db_dir', dest='DB_DIR', type=str, default="data", help='dir of database.')
+    parser.add_argument('--timestamp_format', dest='TIMESTAMP_FORMAT', type=str, default="MDY", choices=['YMD', 'MDY', 'DMY'],
+                        help='timestamp format. default is `MM/DD/YYYY`. choose from `YMD`, `MDY` or `DMY`.')
+
+    parser.add_argument('--page_per_host_num', dest='PAGE_PER_HOST_NUM', type=int, default=8,
+                        help='how many information to read in each page.\nit is controlling the view of html page.')
+    parser.add_argument('--main_page_title', dest='MAIN_PAGE_TITLE', type=str, default="GPU info", help='page title of main page.')
+    parser.add_argument('--main_page_description', dest='MAIN_PAGE_DESCRIPTION', type=str, default="", help='page description of main page.')
+    parser.add_argument('--table_page_title', dest='TABLE_PAGE_TITLE', type=str, default="GPU Table", help='page title of gpu table page.')
+    parser.add_argument('--table_page_description', dest='TABLE_PAGE_DESCRIPTION', type=str, default="", help='page description of gpu table page.')
 
     parser.add_argument('--term_width', dest='TERM_WIDTH', type=int, default=80, help='width of terminal printing.')
+    parser.add_argument('--sort_by', dest='SORT_BY', type=str, default="ip", choices=['ip', 'name'],
+                        help='sort type of machine arrangement. choice from `ip` or `name`.')
 
     # args for waching part
     parser.add_argument('--server_sleep_time', dest='SERVER_SLEEP_TIME', type=int, default=5, help='server sleeping time in sec.')
@@ -34,20 +47,6 @@
     parser.add_argument('--slack_bot_sleep_time', dest='SLACK_BOT_SLEEP_TIME', type=int, default=1, help="slack bot's waiting time (response time) in sec.")
     parser.add_argument('--save_interval', dest='SAVE_INTERVAL', type=int, default=60,
                         help='at least interval time for saving data in sec.\nthis is for controlling the data which is will save in database. if you want to save all data, set this to 0.')
-
-    parser.add_argument('--sort_by', dest='SORT_BY', type=str, default="ip", choices=['ip', 'name'],
-                        help='sort type of machine arrangement. choice from `ip` or `name`.')
-
-    parser.add_argument('--ip', dest='IP', type=str, default="127.0.0.1",
-                        help='this ip address is the server address for the client which send the gpu information.\nit is mainly for a machine which is sending the data to server.')
-    parser.add_argument('--port_num', dest='PORT_NUM', type=int, default=8080, help="server's open port.")
-    parser.add_argument('--token', dest='TOKEN', type=str, default="0000",
-                        help="url parameter token for posting data.\nwhatever you want, actually it's doing nothing now. it is only for preventing accidental posting.")
-
-    parser.add_argument('--page_per_host_num', dest='PAGE_PER_HOST_NUM', type=int, default=8,
-                        help='how many information to read in each page.\nit is controlling the view of html page.')
-    parser.add_argument('--page_title', dest='PAGE_TITLE', type=str, default="GPUs", help='page title of html page.')
-    parser.add_argument('--page_description', dest='PAGE_DESCRIPTION', type=str, default="", help='page description of html page.')
 
     parser.add_argument('--slack_webhook', dest='SLACK_WEBHOOK', type=str, default="",
                         help='for slack notification. set a webhook url.\nit will send a up/down notification to this webhook.')
@@ -62,7 +61,7 @@
     parser.add_argument('--shedule_function', dest='SCHEDULE_FUNCTION', type=str, nargs='*', default=[],
                         help="if you want send shceduled status report, use this function.\nyou can use python schedule module to schedule the announcement of something like `'schedule.every().day.at('00:00').do(self.send_hosts_statuses, 'SCHEDULED_STATUS_REPORT')'`. this will go through `exec()` be careful.")
 
-    # notification message ########################################
+    # notification message
     parser.add_argument('--register_uplink_msg', dest='REGISTER_UPLINK_MSG', type=str, default="⬆︎⬆︎⬆︎ `Uplink` Detected - New uplink from `{}`. Hello!",
                         help='notification message of new host came.\nif you use {} it will be filled with `host name`')
     parser.add_argument('--re_uplink_msg', dest='RE_UPLINK_MSG', type=str, default="⬆︎⬆︎⬆︎ `  Up  ` Detected - Uplink from `{}`. Welcome back!",
@@ -85,17 +84,18 @@
 
     parser.add_argument('-quiet', dest='QUIET', action="store_true", default=False, help='show only critical error message.')
 
-    settings = parser.parse_args()
+    # ssl settings certfile, keyfile=None, password
+    """
+    parser.add_argument('--ssl_cert', dest='SSL_CERT', type=str, default=None, help='path of ssl certificate file.')
+    parser.add_argument('--ssl_key', dest='SSL_KEY', type=str, default=None, help='path for ssl key file.')
+    """
 
-    ssl_context = None
-    if settings.SSL_CERT is not None:
-        ssl_context = ssl.SSLContext(ssl.PROTOCOL_TLSv1_2)
-        ssl_context.load_cert_chain(settings.SSL_CERT, settings.SSL_KEY)
+    settings = parser.parse_args()
 
     if settings.local_settings_yaml_path is not None:
         try:
             with open(settings.local_settings_yaml_path, "r") as yaml_file:
-                yaml_data = yaml.load(yaml_file)
+                yaml_data = yaml.load(yaml_file, yaml.FullLoader)
         except Exception as e:
             print(e)
             yaml_data = []
@@ -104,8 +104,5 @@
             if arg_key in settings:
                 setattr(settings, arg_key, yaml_data[arg_key])
 
-    print(settings.SCHEDULE_FUNCTION)
-
     server = HTTPServer(settings)
-    server.start(ssl_context=ssl_context)
-    server.watch_and_sleep()
+    server.start()
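
The settings override in `gpu_status_server.py` copies each YAML key onto the parsed arguments when the key matches an existing argument. A small self-contained sketch of that pattern follows; a plain dict stands in for the parsed YAML file, and the `apply_overrides` helper name is hypothetical:

```python
import argparse


def apply_overrides(settings: argparse.Namespace, overrides: dict) -> argparse.Namespace:
    """Copy values from `overrides` onto `settings` for keys the parser already defines."""
    for key, value in overrides.items():
        # argparse.Namespace supports `in`, so unknown keys are skipped silently.
        if key in settings:
            setattr(settings, key, value)
    return settings


parser = argparse.ArgumentParser()
parser.add_argument('--port_num', dest='PORT_NUM', type=int, default=8080)
parser.add_argument('--timestamp_format', dest='TIMESTAMP_FORMAT', type=str,
                    default="MDY", choices=['YMD', 'MDY', 'DMY'])
settings = parser.parse_args([])

# Stand-in for the dict loaded from the .yaml file.
yaml_data = {"TIMESTAMP_FORMAT": "DMY", "UNKNOWN_KEY": 1}
apply_overrides(settings, yaml_data)
```

Values applied this way skip argparse's `type` and `choices` checks, so a typo such as `TIMESTAMP_FORMAT: "DYM"` in the YAML file would not be rejected at startup in this sketch.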
diff --git a/imgs/bot_interact_sample_resized.png b/imgs/bot_interact_sample_resized.png
diff --git a/imgs/browser_sample_resized.png b/imgs/browser_sample_resized.png
diff --git a/imgs/noti_up_down_sample_resized.png b/imgs/noti_up_down_sample_resized.png
diff --git a/vesta/__version__.py b/vesta/__version__.py
@@ -1,4 +1,4 @@
 __title__ = 'vesta'
 __description__ = 'simple gpu monitoring script'
 __url__ = 'https://github.com/a-maumau/vesta'
-__version__ = '0.5.3'
+__version__ = '1.0.0'