Help needed with an NVIDIA driver problem

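I installed the nvidia-driver-580-server driver on Ubuntu 22.04. After installation, nvidia-smi showed the GPU correctly, but after rebooting, right after I select an entry in GRUB, it reports OOM (out of memory). What is going on? Is the kernel incompatible?
I switched to a different kernel, 6.12, and installed the same driver version. It also reports OOM, and the nvidia-smi command does not work at all. Part of the error output: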
Apr 22 08:24:19 houmao kernel: [ 568.170199] watchdog: BUG: soft lockup - CPU#13 stuck for 134s! [modprobe:25854]
Apr 22 08:24:19 houmao kernel: [ 568.170202] Modules linked in: nvidia(POE+) ccm snd_sof_pci_intel_tgl snd_sof_pci_intel_cnl snd_sof_intel_hda_generic soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda_common snd_soc_hdac_hda snd_sof_intel_hda_mlink snd_sof_intel_hda snd_hda_ext_core snd_sof_pci snd_sof_xtensa_dsp binfmt_misc snd_sof intel_uncore_frequency intel_uncore_frequency_common snd_sof_utils intel_tcc_cooling snd_soc_acpi_intel_match iwlmvm snd_soc_acpi soundwire_bus snd_hda_codec_realtek x86_pkg_temp_thermal snd_soc_core snd_hda_codec_generic intel_powerclamp coretemp nls_iso8859_1 snd_hda_scodec_component snd_compress ac97_bus mac80211 snd_pcm_dmaengine exfat snd_hda_codec_hdmi kvm_intel snd_hda_intel libarc4 uvcvideo snd_intel_dspcfg snd_intel_sdw_acpi kvm btusb videobuf2_vmalloc snd_hda_codec btrtl uvc btintel mei_hdcp videobuf2_memops iwlwifi intel_rapl_msr snd_hda_core btbcm videobuf2_v4l2 btmtk videobuf2_common processor_thermal_device_pci rapl snd_hwdep intel_cstate videodev snd_pcm
Apr 22 08:24:19 houmao kernel: [ 568.170228] processor_thermal_device input_leds joydev bluetooth mei_me processor_thermal_wt_hint cfg80211 snd_timer mc ecdh_generic processor_thermal_rfim mei hid_multitouch serio_raw ecc processor_thermal_rapl snd intel_rapl_common soundcore processor_thermal_wt_req processor_thermal_power_floor igen6_edac processor_thermal_mbox int3403_thermal int340x_thermal_zone intel_hid int3400_thermal acpi_thermal_rel sparse_keymap mac_hid acpi_pad dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel msr efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 linear r8153_ecm cdc_ether usbnet usbhid i915 drm_buddy i2c_algo_bit ttm drm_display_helper hid_generic r8152 mii cec i2c_hid_acpi rc_core crct10dif_pclmul i2c_hid uas crc32_pclmul usb_storage mxm_wmi r8169 nvme ghash_clmulni_intel drm_kms_helper sha512_ssse3 intel_lpss_pci i2c_i801 hid ahci sha256_ssse3 i2c_mux intel_lpss sha1_ssse3 psmouse realtek
Apr 22 08:24:19 houmao kernel: [ 568.170261] libahci i2c_smbus nvme_core idma64 drm video wmi pinctrl_tigerlake aesni_intel crypto_simd cryptd

Is this a problem with the new NVIDIA driver, or with my system?

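Where does it report OOM? The log you pasted reports a soft lockup. There should be a call trace printed along with it; look at that and you will see where the problem is.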

This is what it reports right after I pick an entry in GRUB.

Additional error output from the 6.12 kernel. I compiled this kernel myself, with the cjktty patch applied.

Oh, that's where it is. Your kernel version is really old...

That is way too old. Why are you using this version? (

I'll upgrade to 24 and see.

Is 6.12.58 old too? And what is the actual problem?

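6.12.58 is not old. The log in the screenshot is stuck somewhere I don't recognize. Look for other call traces that differ from this one. Try to find the earliest ones: those are more likely to be the culprit, and many of the later ones are probably just collateral damage.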
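
If scrolling through the whole log by hand is painful, a small script can pull out the earliest reports. This is only a rough sketch: the function name `first_reports` and the marker list are my own choices, not anything standard, and it assumes the kernel log has already been saved as plain text (for example from `journalctl -k -b -1` if the journal is persistent, or from a kern.log file).

```python
import re

# Strings that commonly open a kernel problem report. This list is an
# assumption; extend it if your log uses other wording.
REPORT_START = re.compile(
    r"(BUG:|Oops|Call Trace:|Out of memory|invoked oom-killer)"
)


def first_reports(lines, limit=3, context=30):
    """Return up to `limit` (line_number, excerpt) pairs, earliest first.

    `lines` is the log as a list of strings. Each excerpt holds the
    matching line plus the following `context - 1` lines, and those
    lines are skipped so one report is not counted several times.
    """
    hits = []
    skip_until = -1
    for i, line in enumerate(lines):
        if i <= skip_until:
            continue
        if REPORT_START.search(line):
            hits.append((i + 1, lines[i:i + context]))
            skip_until = i + context - 1
            if len(hits) == limit:
                break
    return hits
```

To use it, read the saved log with `open(path, errors="replace").read().splitlines()`, pass the result to `first_reports`, and print each excerpt. The first hit is the one to read most carefully.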

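I fought with the NVIDIA driver for a while myself. In the end I went on Xianyu and swapped the card with someone for an AMD card of equal value :rofl: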