Getting GPU resource information with pynvml

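When training a model, we usually want to run on a GPU whose memory usage is low. To do that, we first query and print the current state of every GPU, then choose which one to train on.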

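I. Using the pynvml package to get GPU resource information

1. Install the Python package pynvml

The pynvml module is provided by the nvidia-ml-py package on PyPI: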

pip install nvidia-ml-py -i https://pypi.douban.com/simple

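2. Usage

Use pynvml functions to query the GPU state (count, model, memory, temperature). The function below reports the results through logging, so they are saved to a txt file only when logging is configured with a file as its destination.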

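import logging

from pynvml import (
    NVML_TEMPERATURE_GPU,
    nvmlDeviceGetCount,
    nvmlDeviceGetHandleByIndex,
    nvmlDeviceGetMemoryInfo,
    nvmlDeviceGetName,
    nvmlDeviceGetTemperature,
    nvmlInit,
    nvmlShutdown,
)

# Note: logging.info() prints nothing under the default WARNING level.
# Configure logging at INFO level before calling get_gpu().


def _gib(n_bytes):
    # Bytes -> whole MiB -> GiB, same arithmetic as the original code
    return (n_bytes // 1048576) / 1024


def get_gpu(simple=True):
    # Initialize NVML
    nvmlInit()
    try:
        # Number of GPUs visible to the driver
        device_count = nvmlDeviceGetCount()
        total_memory = 0
        total_free = 0
        total_used = 0
        gpu_name = ""

        for i in range(device_count):
            handle = nvmlDeviceGetHandleByIndex(i)
            info = nvmlDeviceGetMemoryInfo(handle)
            name = nvmlDeviceGetName(handle)
            # Older pynvml releases return bytes, newer ones return str
            gpu_name = name.decode("utf-8") if isinstance(name, bytes) else name
            # Per-GPU report: model, memory and temperature
            if not simple:
                logging.info(
                    "GPU{}:{}  total memory:{:.1f}G  free memory:{:.1f}G  used memory:{:.1f}G  Used Percentage:{:.1f}%  Temperature:{}'C".format(
                        i,
                        gpu_name,
                        _gib(info.total),
                        _gib(info.free),
                        _gib(info.used),
                        info.used / info.total * 100,
                        nvmlDeviceGetTemperature(handle, NVML_TEMPERATURE_GPU),
                    )
                )

            total_memory += _gib(info.total)
            total_free += _gib(info.free)
            total_used += _gib(info.used)

        # Summary over all GPUs (the name shown is that of the last GPU)
        used_pct = total_used / total_memory * 100 if total_memory else 0.0
        logging.info(
            "GPU name:{}  number:{}  total memory:{:.1f}G  free memory:{:.1f}G  used memory:{:.1f}G  Used Percentage:{:.1f}%".format(
                gpu_name,
                device_count,
                total_memory,
                total_free,
                total_used,
                used_pct,
            )
        )
    finally:
        # Release NVML even if a query above raised
        nvmlShutdown()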
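
Because the function above reports through logging.info, nothing reaches a txt file until logging is pointed at one. A minimal sketch of one way to do that follows. The helper name `log_to_txt` and the default file name `gpu_info.txt` are placeholders chosen for illustration and do not come from the original article.

```python
import logging


def log_to_txt(path="gpu_info.txt"):
    """Route root-logger messages at INFO level and above to a text file.

    `path` is a placeholder; pick any writable location.
    """
    logging.basicConfig(
        filename=path,          # append log lines to this file
        level=logging.INFO,     # the default WARNING level would drop info()
        format="%(message)s",   # write only the message, no level/name prefix
        force=True,             # replace earlier handlers (Python 3.8+)
    )
```

Calling `log_to_txt()` once, and then `get_gpu(False)`, should leave the per-GPU lines and the summary line in the file. Omitting `filename` sends the same lines to the console instead.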

3. Check the output

GPU0:NVIDIA GeForce RTX 3090  total memory:24.0G  free memory:20.0G  used memory:4.0G  Used Percentage:16.7%  Temperature:32'C
GPU1:NVIDIA GeForce RTX 3090  total memory:24.0G  free memory:23.7G  used memory:0.3G  Used Percentage:1.3%  Temperature:32'C
GPU2:NVIDIA GeForce RTX 3090  total memory:24.0G  free memory:18.6G  used memory:5.4G  Used Percentage:22.3%  Temperature:47'C
GPU3:NVIDIA GeForce RTX 3090  total memory:24.0G  free memory:23.7G  used memory:0.3G  Used Percentage:1.3%  Temperature:33'C
GPU4:NVIDIA GeForce RTX 3090  total memory:24.0G  free memory:23.7G  used memory:0.3G  Used Percentage:1.3%  Temperature:31'C
GPU5:NVIDIA GeForce RTX 3090  total memory:24.0G  free memory:21.0G  used memory:3.0G  Used Percentage:12.3%  Temperature:31'C
GPU6:NVIDIA GeForce RTX 3090  total memory:24.0G  free memory:23.7G  used memory:0.3G  Used Percentage:1.3%  Temperature:32'C
GPU7:NVIDIA GeForce RTX 3090  total memory:24.0G  free memory:23.7G  used memory:0.3G  Used Percentage:1.3%  Temperature:31'C
GPU name:NVIDIA GeForce RTX 3090  number:8  total memory:192.0G  free memory:178.1G  used memory:13.9G  Used Percentage:7.2%
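
With a report like the one above, the choice of training GPU can also be automated. The following is a rough sketch, not part of the original article: `query_free_memory`, `pick_gpu` and `use_least_loaded_gpu` are hypothetical helper names. It assumes that "least loaded" means "most free memory", and that the training framework respects the `CUDA_VISIBLE_DEVICES` environment variable.

```python
import os


def query_free_memory():
    """Return {gpu_index: free_bytes} from NVML (needs an NVIDIA driver)."""
    # Imported lazily so pick_gpu() stays usable on machines without a GPU.
    from pynvml import (
        nvmlDeviceGetCount,
        nvmlDeviceGetHandleByIndex,
        nvmlDeviceGetMemoryInfo,
        nvmlInit,
        nvmlShutdown,
    )

    nvmlInit()
    try:
        return {
            i: nvmlDeviceGetMemoryInfo(nvmlDeviceGetHandleByIndex(i)).free
            for i in range(nvmlDeviceGetCount())
        }
    finally:
        nvmlShutdown()


def pick_gpu(free_by_index):
    """Return the index with the most free memory (lowest index wins ties)."""
    if not free_by_index:
        raise ValueError("no GPUs reported")
    return max(free_by_index, key=free_by_index.get)


def use_least_loaded_gpu(free_by_index=None):
    """Expose only the chosen GPU to CUDA and return its index."""
    if free_by_index is None:
        free_by_index = query_free_memory()
    best = pick_gpu(free_by_index)
    # NVML enumerates GPUs in PCI bus order; make CUDA use the same order
    # so that the NVML index and the CUDA index refer to the same device.
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = str(best)
    return best
```

`use_least_loaded_gpu()` has to run before the training framework initializes CUDA, since the environment variables are read at that point. For the sample output above, GPUs 1, 3, 4, 6 and 7 tie on free memory, so this sketch would select GPU 1.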

Author: 李瑶, 西北工业大学
