Getting GPU resource information with pynvml

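When training a model, we usually want to pick a GPU whose memory usage is low. To do that, we first need to query and print the current resource status of every GPU, and then choose which GPU to train on.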

I. Querying GPU resource information with the pynvml package

1. Install the Python package (the pynvml module is distributed on PyPI as nvidia-ml-py)

pip install nvidia-ml-py -i https://pypi.douban.com/simple
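Because the PyPI package name (nvidia-ml-py) differs from the module name that is imported (pynvml), a quick check that the module is visible to the current interpreter can save some confusion. The helper name `pynvml_available` below is hypothetical and only serves as a minimal sketch:

```python
import importlib.util


def pynvml_available() -> bool:
    """Return True if the `pynvml` module can be found in this environment."""
    return importlib.util.find_spec("pynvml") is not None


print("pynvml importable:", pynvml_available())
```

If this prints False, the package was probably installed into a different Python environment than the one running the script.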

2. Usage

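Use the pynvml functions to query the GPU status (count, model, memory, and temperature) and log it, so that the result can be saved to a txt file.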

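import logging

from pynvml import (
    NVML_TEMPERATURE_GPU,
    nvmlDeviceGetCount,
    nvmlDeviceGetHandleByIndex,
    nvmlDeviceGetMemoryInfo,
    nvmlDeviceGetName,
    nvmlDeviceGetTemperature,
    nvmlInit,
    nvmlShutdown,
)

GIB = 1024 ** 3  # bytes per GiB


def get_gpu(simple=True):
    """Log a summary of all GPUs; with simple=False, also log one line per GPU."""
    # Initialize NVML
    nvmlInit()
    try:
        # Get the number of GPUs
        device_count = nvmlDeviceGetCount()
        total_memory = 0.0
        total_free = 0.0
        total_used = 0.0
        gpu_name = ""

        for i in range(device_count):
            handle = nvmlDeviceGetHandleByIndex(i)
            info = nvmlDeviceGetMemoryInfo(handle)
            gpu_name = nvmlDeviceGetName(handle)
            # Older pynvml releases return bytes here, newer ones return str
            if isinstance(gpu_name, bytes):
                gpu_name = gpu_name.decode("utf-8")
            # Per-GPU details: model, memory, temperature
            if not simple:
                logging.info(
                    "GPU{}:{}  total memory:{:.1f}G  free memory:{:.1f}G  used memory:{:.1f}G  Used Percentage:{:.1f}%  Temperature:{}'C".format(
                        i,
                        gpu_name,
                        info.total / GIB,
                        info.free / GIB,
                        info.used / GIB,
                        info.used / info.total * 100,
                        nvmlDeviceGetTemperature(handle, NVML_TEMPERATURE_GPU),
                    )
                )

            total_memory += info.total / GIB
            total_free += info.free / GIB
            total_used += info.used / GIB

        # Guard against division by zero when no GPU is reported
        used_percentage = total_used / total_memory * 100 if total_memory else 0.0
        # gpu_name holds the last GPU's model, which assumes all GPUs are identical
        logging.info(
            "GPU name:{}  number:{}  total memory:{:.1f}G  free memory:{:.1f}G  used memory:{:.1f}G  Used Percentage:{:.1f}%".format(
                gpu_name,
                device_count,
                total_memory,
                total_free,
                total_used,
                used_percentage,
            )
        )
    finally:
        # Always release NVML, even if one of the queries fails
        nvmlShutdown()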
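
The function above reports through logging.info, and the root logger discards INFO records by default, so nothing is shown or saved until logging is configured. One possible way to send the records to a txt file is sketched below; the helper name `configure_gpu_log` and the file name `gpu_info.txt` are assumptions made for illustration, not part of the original code.

```python
import logging


def configure_gpu_log(path="gpu_info.txt"):
    """Send INFO-level records from the root logger to a text file."""
    logging.basicConfig(
        filename=path,
        filemode="a",  # append, so repeated runs keep earlier records
        level=logging.INFO,
        format="%(asctime)s %(message)s",
        force=True,  # replace handlers configured earlier (Python 3.8+)
    )
```

Calling `configure_gpu_log()` once before `get_gpu(False)` should then append one line per GPU plus the summary line to the file.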

3. Check the output

GPU0:NVIDIA GeForce RTX 3090  total memory:24.0G  free memory:20.0G  used memory:4.0G  Used Percentage:16.7%  Temperature:32'C
GPU1:NVIDIA GeForce RTX 3090  total memory:24.0G  free memory:23.7G  used memory:0.3G  Used Percentage:1.3%  Temperature:32'C
GPU2:NVIDIA GeForce RTX 3090  total memory:24.0G  free memory:18.6G  used memory:5.4G  Used Percentage:22.3%  Temperature:47'C
GPU3:NVIDIA GeForce RTX 3090  total memory:24.0G  free memory:23.7G  used memory:0.3G  Used Percentage:1.3%  Temperature:33'C
GPU4:NVIDIA GeForce RTX 3090  total memory:24.0G  free memory:23.7G  used memory:0.3G  Used Percentage:1.3%  Temperature:31'C
GPU5:NVIDIA GeForce RTX 3090  total memory:24.0G  free memory:21.0G  used memory:3.0G  Used Percentage:12.3%  Temperature:31'C
GPU6:NVIDIA GeForce RTX 3090  total memory:24.0G  free memory:23.7G  used memory:0.3G  Used Percentage:1.3%  Temperature:32'C
GPU7:NVIDIA GeForce RTX 3090  total memory:24.0G  free memory:23.7G  used memory:0.3G  Used Percentage:1.3%  Temperature:31'C
GPU name:NVIDIA GeForce RTX 3090  number:8  total memory:192.0G  free memory:178.1G  used memory:13.9G  Used Percentage:7.2%
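
With the usage figures available, the GPU for training can be chosen automatically instead of by reading the log. The following is a minimal sketch and not the author's method: the function names `query_gpu_usage`, `pick_least_used_gpu`, and `select_gpu` are hypothetical, and restricting the process through the CUDA_VISIBLE_DEVICES environment variable is one common approach that assumes a CUDA-based training framework.

```python
import os


def query_gpu_usage():
    """Return a list of (index, used_bytes, total_bytes), one entry per GPU.

    pynvml is imported lazily so the selection logic below works without it.
    """
    import pynvml

    pynvml.nvmlInit()
    try:
        usage = []
        for i in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(i)
            info = pynvml.nvmlDeviceGetMemoryInfo(handle)
            usage.append((i, info.used, info.total))
        return usage
    finally:
        pynvml.nvmlShutdown()


def pick_least_used_gpu(usage):
    """Return the index of the GPU with the lowest used/total memory ratio.

    Ties go to the GPU that appears first in `usage`.
    """
    if not usage:
        raise ValueError("no GPU usage data")
    index, _, _ = min(usage, key=lambda item: item[1] / item[2])
    return index


def select_gpu(usage):
    """Expose only the least-used GPU to CUDA-based code in this process."""
    index = pick_least_used_gpu(usage)
    # Ask CUDA to number devices by PCI bus ID, the ordering that
    # nvidia-smi and NVML indices typically follow
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)
    return index
```

A call such as `select_gpu(query_gpu_usage())` would need to run before the training framework initializes CUDA, since the environment variables are only read at that point. With the figures shown in the output above, GPU1 would be picked: it is the first of the GPUs at 1.3% usage.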

Author: 李瑶 (Northwestern Polytechnical University)
