Irq/201-nvidia when training AI model on X1 Extreme Gen 4

egalahad
Posts: 1
Joined: Fri Jan 20, 2023 5:35 am
Location: China, Zhejiang Ningbo

#1 Post by egalahad » Fri Jan 20, 2023 5:40 am

I'm training an AI model for one of my courses, but the process stops at a random epoch each time and the training just freezes. I can't even stop it with Ctrl-C. The rest of the system seems unaffected, because I'm using the Intel integrated GPU to render the GUI. When I run top, I find irq/201-nvidia running at 100% CPU.
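
A minimal sketch of how the interrupt activity behind a thread like irq/201-nvidia could be measured, assuming the usual /proc/interrupts layout (a header row of CPU names, then one row per IRQ with per-CPU counts followed by a description). The function names and the one-second interval are illustrative only and are not part of any NVIDIA tool.

```python
import time
from pathlib import Path


def nvidia_irq_counts(interrupts_text):
    """Return {irq_label: total_count} for rows whose description mentions 'nvidia'."""
    lines = interrupts_text.splitlines()
    if not lines:
        return {}
    # The header row lists one column per CPU (CPU0, CPU1, ...).
    n_cpus = len(lines[0].split())
    totals = {}
    for line in lines[1:]:
        fields = line.split()
        if not fields or not fields[0].endswith(":"):
            continue
        label = fields[0].rstrip(":")
        # Per-CPU counters follow the label; anything after them is the description.
        counts = [int(f) for f in fields[1:1 + n_cpus] if f.isdigit()]
        description = " ".join(fields[1 + n_cpus:])
        if "nvidia" in description.lower():
            totals[label] = sum(counts)
    return totals


def nvidia_irq_delta(path="/proc/interrupts", interval=1.0):
    """Sample the file twice and return how many interrupts arrived in between."""
    before = nvidia_irq_counts(Path(path).read_text())
    time.sleep(interval)
    after = nvidia_irq_counts(Path(path).read_text())
    return {irq: count - before.get(irq, 0) for irq, count in after.items()}
```

Calling nvidia_irq_delta() once during normal training and once during a freeze would show whether the GPU is actually raising interrupts at a high rate while the thread is busy, or whether the counter barely moves.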

System Specs:
OS: Arch Linux

Linux archlinux 5.15.85-1-lts #1 SMP Thu, 22 Dec 2022 06:22:00 +0000 x86_64 GNU/Linux

GPU: GeForce RTX 3080
VGA compatible controller: NVIDIA Corporation GA104M [GeForce RTX 3080 Mobile / Max-Q 8GB/16GB] (rev a1)
Subsystem: Lenovo Device 22e4
Kernel driver in use: nvidia
Kernel modules: nouveau, nvidia_drm, nvidia
The machine is a Lenovo ThinkPad X1 Extreme Gen 4.

CUDA version: 11.7.0
CUDNN version: 8.3.0.98
>> pacman -Qs nvidia
local/cuda 11.7.0-1
NVIDIA’s GPU programming toolkit
local/cudnn 8.3.0.98-1
NVIDIA CUDA Deep Neural Network library
local/egl-wayland 2:1.1.11-2
EGLStream-based Wayland external platform
local/libvdpau 1.5-1
Nvidia VDPAU library
local/nvidia-dkms 525.60.11-1
NVIDIA drivers - module sources
local/nvidia-utils 525.60.11-1
NVIDIA drivers utilities
local/opencl-nvidia 525.60.11-1
P.S. When editing videos in Windows I usually get blue screens. The error names nvlddmkm.sys, or something containing TDR (in case that info helps).

I have been experiencing this for the last three months, wrestling with different versions of CUDA, cuDNN and the NVIDIA drivers, with no result.

I'm aware that NVIDIA GPUs may be tuned for laptop use, but I think that should only cause performance issues. Has anyone encountered this, or does anyone know how to deal with it?

This is my first time asking for support on a forum. If there is any more information I should provide, please let me know.

Thanks in advance!!!

w0qj
ThinkPadder
Posts: 1187
Joined: Fri Jun 11, 2004 9:53 pm
Location: Hong Kong

Re: Irq/201-nvidia when training AI model on X1 Extreme Gen 4

#2 Post by w0qj » Thu Feb 09, 2023 8:02 am

Err... you might get better replies if you re-post your original message in the Linux forum instead:

viewforum.php?f=9

Good luck!
Daily Driver: (X1E3) X1 Extreme 3rd Gen | mobile broadband (WWAN)
Current Thinkpads: X1E3 | X1E1 | X1C10 | X1C9 | X1C4 | X1C3 | X230
Retired Thinkpads: X250 | T410 | T42 | 560 (circa 1996)
