
dtek

Posts posted by dtek

  1. 55 minutes ago, fabiosun said:

    it is difficult to say...

    if you have an OSX installed (one you use with proxmox should be fine) you could use it with my EFI

    you have to check MMIO and other stuff and maybe 3 GPU inserted is a bit complicated..unsure of this

     

    [image]

     

    I only have the 6800xt installed. It was in slot 3; I moved it to slot 1 and that didn't work either. Launching the installation file manually gave me these errors. Does SIP need to be disabled?

  2. 5 hours ago, fabiosun said:

    [image: Screenshot 2021-06-03 at 5.59.04 PM]

     

    Hi @fabiosun,

     

    Thanks to your guide, I've been running Proxmox without any issues for a while. I started a bare-metal installation using your EFI but ran into a problem: I get to the Big Sur installation page, but after I select the drive and click Continue, it goes back to the recovery menu. Can you point me in the right direction to resolve this?

  3. 19 hours ago, meina222 said:

    My 3090 ended up on backorder btw (so I withdraw my TigerDirect recommendation, it doesn't seem honest), but I got a non-XT 6800 from Newegg that was charged and will ship tomorrow. So I may try to dual GPU VMs (6800 for Win 10 as BigSur doesn't seem to have the driver enabled yet and the 5700XT for MacOS).

    I did a passthrough with dual GPU VMs (5700xt for Big Sur and 6800xt for Windows 10), and they are both stable if only one VM is running at a time. As soon as both VMs are started, the machine freezes and reboots on its own. I'm thinking the 5700xt reset bug might be causing the reboot, so I got an Nvidia 3060 Ti for further testing. I haven't had a chance to install the 3060 Ti yet; I'll keep you posted. Please update me on your progress with the dual GPU VMs.

  4. On 11/30/2020 at 12:49 AM, meina222 said:

    I re-enabled the re-binding of the framebuffer and vtconsole - the highlighted commands were previously commented out - this is needed since otherwise you can't unbind in the pre-start phase when you restart.

     

    When you shutdown your VM, you'd need an external device to restart it as main display won't come back to host (still not sure how to achieve that) but you can re-start VM from web console on a different device and display should come back. Similarly, rebooting a VM should just work and GPU should reset properly (for me DisplayPort sound stops working but this is WIP by developer).

     

    #!/bin/bash

    vmid="$1"
    phase="$2"

    if [[ "$phase" == "pre-start" ]]; then
        clear
        echo "Starting VM $vmid - please wait..."
        IFILE=/var/lib/vz/snippets/interfaces.$vmid
        if [[ -f "$IFILE" ]]; then
            cp /var/lib/vz/snippets/interfaces.$vmid /etc/network/interfaces
            systemctl restart networking
        fi          
        echo 0 > /sys/class/vtconsole/vtcon0/bind
        echo 0 > /sys/class/vtconsole/vtcon1/bind
        echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/unbind
    elif [[ "$phase" == "post-start" ]]; then
        main_pid="$(< /run/qemu-server/$vmid.pid)"

        cpuset="0-63"
        #cpuset="$(< /etc/pve/qemu-server/$vmid.cpuset)"

        taskset --cpu-list  --all-tasks --pid "$cpuset" "$main_pid"
    elif [[ "$phase" == "post-stop" ]]; then
        echo "Post-stop VM $vmid"
        #reboot
        #shutdown -h now

        sleep 1 
        echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/bind
        echo 1 > /sys/class/vtconsole/vtcon0/bind
        echo 1 > /sys/class/vtconsole/vtcon1/bind

    fi
    vmhook.sh (END)

     

     

     

    Alternatively, you can get a cheap GPU for the host and not worry about the framebuffer unbind/rebind in the hook and manipulate the display output selector.

    @meina222

    Should I use this script for 2 separate VMs (Big Sur and Win10) with 2 GPUs (6800xt and 3060ti) passed through? Are any modifications needed?

  5. 47 minutes ago, meina222 said:

    Where did you get that 6800XT? I wish I can find one without being gouged.

     

    I just landed an Asus 3090 Strix OC from a retailer at $1899 (only 100 above MSRP). I got very lucky as this is not Newegg or Best Buy and I don't think it's targeted by bots and I snagged the only one in stock. I really wanted a 6900XT but I doubt I can get it given how bad the 6800XT launch is and that only 4% of all 6000 cores are for 6900XT bins. Will get the Asus in a few days. Don't know what to do with it, I think I will start ML practice.

    The multi VM issue could be some problem with AMD PCIE arbitration. I vaguely remember that Linus Sebastian had an issue in one of his "1 PC, 2 gaming VMs" YouTube videos where he tried simultaneous 1v1 w 2 cards and AMD gave him issues and he switched to Intel. I only use 1 VM at a time right now but with the 3090 that can change. We'll see.

    Wow, that's super lucky. I'd just sell the 3090 and use the profit to cover the 6900 markup. I got mine on eBay for $300 above MSRP =(

  6. 1 hour ago, meina222 said:

    Yeah, it seems just an info and not an error. Not sure what it means but I wouldn't sweat it.

     

    I don't think you need to remove it but try "dkms remove -m vendor-reset -v 0.0.18 --all"

     

    Then try installing again and update-initramfs -u

     

    reboot

    Reinstalled and rebooted. I forgot to tell you that I bought a 6800XT and it's currently installed in my machine. The reset bug seems to be gone and the passthrough was a success. There's no support for this GPU yet, so there's no video acceleration. The only issue now is that it can't run both Windows 10 and Big Sur simultaneously. I'm only able to test with one GPU at the moment because my water cooling setup only leaves space for one GPU (6800xt). I'm hoping that once both GPUs are successfully passed through, both VMs can run without crashing.

  7. 7 minutes ago, meina222 said:

    Try "dkms status"

     

    dkms uninstall vendor-reset/0.0.18

    root@dtk:~/vendor-reset# update-initramfs -u
    update-initramfs: Generating /boot/initrd.img-5.4.78-1-pve
    Running hook script 'zz-pve-efiboot'..
    Re-executing '/etc/kernel/postinst.d/zz-pve-efiboot' in new private mount namespace..
    No /etc/kernel/pve-efiboot-uuids found, skipping ESP sync.
     

    Is this normal?

  8. 4 minutes ago, meina222 said:

    Try "dkms status"

     

    dkms uninstall vendor-reset/0.0.18

    root@dtk:~# dkms status
    vendor-reset, 0.0.18: added


    root@dtk:~# dkms uninstall vendor-reset/0.0.18
    Error! The module vendor-reset 0.0.18 is not currently installed.
    This module is not currently ACTIVE for kernel 5.4.78-1-pve (x86_64).

     

     

    This command worked

     dkms remove vendor-reset/0.0.18 --all
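
The output above suggests why `uninstall` failed: the module was only in the "added" state, so there was nothing installed to uninstall, while `remove` clears it from the DKMS tree. As a rough sketch (not part of dkms itself), the state can be pulled out of `dkms status` before deciding which command to run. The helper name `dkms_state` is made up for illustration, and the parsing assumes the state follows the last ": " on the line, as in the output shown here.

```shell
#!/bin/bash
# Print the DKMS state ("added", "built", "installed", ...) of one module,
# reading `dkms status` output on stdin.
# dkms_state is a hypothetical helper name, not a dkms subcommand.
dkms_state() {
    local module="$1"
    awk -F': ' -v m="$module" '
        # Match "module, version..." or "module/version..." at the start of the line
        index($1, m ",") == 1 || index($1, m "/") == 1 { print $NF }
    '
}

# Usage on a real host (requires dkms):
#   dkms status | dkms_state vendor-reset
```

If this prints `added`, the module has been registered but not built or installed for the running kernel.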

  9. 3 minutes ago, meina222 said:

    It's already installed. You can try  "dkms uninstall vendor-reset-0.0.18"

     

    and reinstall it to be sure.

    root@dtk:~# dkms uninstall vendor-reset-0.0.18
    Error! Invalid number of arguments passed.
    Usage: uninstall <module>/<module-version> or
           uninstall -m <module>/<module-version> or
           uninstall -m <module> -v <module-version>


    I tried dkms -m uninstall vendor-reset-0.0.18 and dkms -m uninstall vendor-reset -v 0.0.18; both give the same error.
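
The usage text above asks for `<module>/<module-version>`, with a slash, whereas the DKMS tree name in the earlier error uses a hyphen (`vendor-reset-0.0.18`). A minimal sketch of converting one form into the other follows. The helper name `to_dkms_spec` is hypothetical, and it assumes the version itself contains no hyphen.

```shell
#!/bin/bash
# Convert a DKMS tree name like "vendor-reset-0.0.18" into the
# "<module>/<module-version>" form shown in the usage text.
# to_dkms_spec is a hypothetical helper name.
to_dkms_spec() {
    local name="$1"
    # Everything before the last "-" is the module, the rest is the version.
    printf '%s/%s\n' "${name%-*}" "${name##*-}"
}

to_dkms_spec vendor-reset-0.0.18   # prints: vendor-reset/0.0.18
```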

  10. 17 minutes ago, meina222 said:

    hi @23d1 - can you "update-initramfs -u" and reboot and try? This I think is a Proxmox script error message that indicates that your VFIO kernel module is not loaded properly.

     

    @dtek you need to install the kernel headers first. I believe I already had them as I've done kernel builds on the host.

     

    apt install pve-headers

    Now I got this error after installing pve-headers

     

    root@dtk:~/vendor-reset# dkms install .
    Error! DKMS tree already contains: vendor-reset-0.0.18
    You cannot add the same module/version combo more than once.
     

  11. On 11/27/2020 at 4:13 PM, meina222 said:

    On your host (Proxmox) try:

     

    git clone https://github.com/gnif/vendor-reset.git
    cd vendor-reset
    dkms install .
    echo "vendor-reset" >> /etc/modules
    update-initramfs -u

    reboot

     

    If all works well, your GPU should start resetting.

     

    You may need to install the dkms package if not there by default.

     

     

    root@dtk:~/vendor-reset# dkms install .

    Creating symlink /var/lib/dkms/vendor-reset/0.0.18/source ->
                     /usr/src/vendor-reset-0.0.18

    DKMS: add completed.
    Error! Your kernel headers for kernel 5.4.34-1-pve cannot be found.
    Please install the linux-headers-5.4.34-1-pve package,
    or use the --kernelsourcedir option to tell DKMS where it's located

     

    I got this error when running dkms install.
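
The error names the running kernel (5.4.34-1-pve), so the headers have to match that exact release. A small sketch of building the package name is below. It assumes Proxmox VE 6.x packaging, where headers ship as `pve-headers-<release>` rather than the `linux-headers-...` name in the generic DKMS message; `headers_pkg` is a made-up helper name.

```shell
#!/bin/bash
# Build the name of the Proxmox header package for a given kernel release.
# headers_pkg is a hypothetical helper; the "pve-headers-<release>" naming
# is an assumption based on Proxmox VE 6.x packaging.
headers_pkg() {
    printf 'pve-headers-%s\n' "$1"
}

# On the host, the running kernel release comes from `uname -r`:
#   apt install "$(headers_pkg "$(uname -r)")"
headers_pkg 5.4.34-1-pve   # prints: pve-headers-5.4.34-1-pve
```

If the plain `pve-headers` metapackage pulls in headers for a newer kernel than the one currently running, rebooting into that newer kernel should also make the two match.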
     

  12. On 11/30/2020 at 12:49 AM, meina222 said:

    I re-enabled the re-binding of the framebuffer and vtconsole - the highlighted commands were previously commented out - this is needed since otherwise you can't unbind in the pre-start phase when you restart.

     

    When you shutdown your VM, you'd need an external device to restart it as main display won't come back to host (still not sure how to achieve that) but you can re-start VM from web console on a different device and display should come back. Similarly, rebooting a VM should just work and GPU should reset properly (for me DisplayPort sound stops working but this is WIP by developer).

     

    #!/bin/bash

    vmid="$1"
    phase="$2"

    if [[ "$phase" == "pre-start" ]]; then
        clear
        echo "Starting VM $vmid - please wait..."
        IFILE=/var/lib/vz/snippets/interfaces.$vmid
        if [[ -f "$IFILE" ]]; then
            cp /var/lib/vz/snippets/interfaces.$vmid /etc/network/interfaces
            systemctl restart networking
        fi          
        echo 0 > /sys/class/vtconsole/vtcon0/bind
        echo 0 > /sys/class/vtconsole/vtcon1/bind
        echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/unbind
    elif [[ "$phase" == "post-start" ]]; then
        main_pid="$(< /run/qemu-server/$vmid.pid)"

        cpuset="0-63"
        #cpuset="$(< /etc/pve/qemu-server/$vmid.cpuset)"

        taskset --cpu-list  --all-tasks --pid "$cpuset" "$main_pid"
    elif [[ "$phase" == "post-stop" ]]; then
        echo "Post-stop VM $vmid"
        #reboot
        #shutdown -h now

        sleep 1 
        echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/bind
        echo 1 > /sys/class/vtconsole/vtcon0/bind
        echo 1 > /sys/class/vtconsole/vtcon1/bind

    fi
    vmhook.sh (END)

     

     

     

    Alternatively, you can get a cheap GPU for the host and not worry about the framebuffer unbind/rebind in the hook and manipulate the display output selector.

    After a successful GPU passthrough attempt, my machine is acting weird. It won't run both VMs simultaneously: when I start one VM, the other crashes. Have you experienced this before?

  13. On 11/27/2020 at 4:13 PM, meina222 said:

    On your host (Proxmox) try:

     

    git clone https://github.com/gnif/vendor-reset.git
    cd vendor-reset
    dkms install .
    echo "vendor-reset" >> /etc/modules
    update-initramfs -u

    reboot

     

    If all works well, your GPU should start resetting.

     

    You may need to install the dkms package if not there by default.

    Is there an updated hookscript or should I remove it completely?

  14. On 11/24/2020 at 11:11 PM, meina222 said:

    GPU reset now works on 5700XT (tested with mine) and possibly Vega 64/56 (unable to test) by using a non-kernel module install generously offered by https://github.com/gnif/vendor-reset/

     

    @dtek,  and others using Proxmox - worth trying! I can now reboot my single GPU VM in MacOS to upgrade it and regain it back via pass-thru. It works pretty reliably in Windows 10 VMs too. You can checkout the repo and install the module to try it.

    OMG I can't wait to try this. GPU passthrough has been removed from my VM ever since I had to reinstall everything from scratch. It refused to boot after multiple forced resets 😣

  15. 26 minutes ago, meina222 said:

    Could you post the outputs of the following in Proxmox

     

    1)   ls -l /sys/class/net

    2)   less /etc/network/interfaces

     

    root@dtk:~#  ls -l /sys/class/net
    total 0
    lrwxrwxrwx 1 root root 0 Nov  6 17:17 eno1 -> ../../devices/pci0000:40/0000:40:01.1/0000:41:00.0/0000:42:04.0/0000:44:00.0/net/eno1
    lrwxrwxrwx 1 root root 0 Nov  6 17:17 enp69s0 -> ../../devices/pci0000:40/0000:40:01.1/0000:41:00.0/0000:42:05.0/0000:45:00.0/net/enp69s0
    lrwxrwxrwx 1 root root 0 Nov  6 17:17 fwbr100i0 -> ../../devices/virtual/net/fwbr100i0
    lrwxrwxrwx 1 root root 0 Nov  6 17:17 fwbr101i0 -> ../../devices/virtual/net/fwbr101i0
    lrwxrwxrwx 1 root root 0 Nov  6 17:17 fwln100i0 -> ../../devices/virtual/net/fwln100i0
    lrwxrwxrwx 1 root root 0 Nov  6 17:17 fwln101i0 -> ../../devices/virtual/net/fwln101i0
    lrwxrwxrwx 1 root root 0 Nov  6 17:17 fwpr100p0 -> ../../devices/virtual/net/fwpr100p0
    lrwxrwxrwx 1 root root 0 Nov  6 17:17 fwpr101p0 -> ../../devices/virtual/net/fwpr101p0
    lrwxrwxrwx 1 root root 0 Nov  6 17:17 lo -> ../../devices/virtual/net/lo
    lrwxrwxrwx 1 root root 0 Nov  6 17:17 tap100i0 -> ../../devices/virtual/net/tap100i0
    lrwxrwxrwx 1 root root 0 Nov  6 17:17 tap101i0 -> ../../devices/virtual/net/tap101i0
    lrwxrwxrwx 1 root root 0 Nov  6 17:17 vmbr0 -> ../../devices/virtual/net/vmbr0

     

     

    auto lo
    iface lo inet loopback

    iface enp69s0 inet manual

    auto vmbr0
    iface vmbr0 inet static
            address 192.168.1.14
            netmask 255.255.255.0
            gateway 192.168.1.1
            bridge_ports enp69s0
            bridge_stp off
            bridge_fd 0

    iface eno1 inet manual

     

     

  16. 12 minutes ago, meina222 said:

    @dtek

     

    I used the Dortania guide.

     

    https://dortania.github.io/OpenCore-Post-Install/universal/iservices.html

     

    Basically you need to do this:

     

    1. Find out the PCI id of your ethernet card in Proxmox and pass it through to the VM. Make sure your Proxmox host can get its connectivity from another place (2nd ethernet, wireless)

     

    As an example, the highlighted id below is one of my two ethernets (the 2nd I leave to Proxmox for host connectivity).

     

    args: -device isa-applesmc,osk="ourhardworkbythesewordsguardedpleasedontsteal(c)AppleComputerInc" -smbios type=2 -device usb-kbd,bus=ehci.0,port=2 -cpu host,+invtsc,vendor=GenuineIntel
    balloon: 0
    bios: ovmf
    boot: cdn
    bootdisk: virtio0
    cores: 64
    cpu: Penryn
    efidisk0: aorus:vm-101-disk-1,size=1M
    hookscript: local:snippets/vmhook.sh
    hostpci0: 43:00,pcie=1,x-vga=1,romfile=vbios.bin

    hostpci1: 86:00,pcie=1
    hostpci2: 85:00,pcie=1
    hostpci3: 88:00,pcie=1
    hostpci4: 02:00,pcie=1
    hugepages: 1024
    ide2: local:iso/OpenCoreBeta.iso,size=150M
    machine: q35
    memory: 196608
    name: bigsur
    numa: 1
    ostype: other
    scsihw: virtio-scsi-pci
    smbios1: uuid=4b5493a6-6a73-48b7-8ce5-2be70a66a383
    sockets: 1
    vga: none
    virtio0: aorus:vm-101-disk-0,cache=unsafe,discard=on,size=250G
    vmgenid: 18d68c27-3a62-4059-9280-7f86a572af59
    vmgenid: 0c7cc702-74ba-4d8e-ba4b-d52d5fe53847

     

     

     

    2. Get the ROM/MAC address of the ethernet and specify it in your OpenCore config.plist as described in the Dortania guide

    3. Make sure your enX (e.g. en0) device matching the card is "primary" in MacOS as described in the guide

    4. Make sure NVRAM works in MacOS as described in the guide

     

    When done, you can activate iMessage. Worked for me.

     

    I have 2 ethernet ports. How do I find out which one is being used by the Proxmox host?
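
One way to answer this, sketched under the assumption that the host uses a standard ifupdown-style /etc/network/interfaces with a Linux bridge: the port listed under `bridge_ports` of the bridge that carries the host address is the one Proxmox is using. The helper name `bridge_ports_of` is made up for illustration.

```shell
#!/bin/bash
# List the physical ports attached to a bridge in an ifupdown-style
# /etc/network/interfaces file read from stdin.
# bridge_ports_of is a hypothetical helper name.
bridge_ports_of() {
    local bridge="$1"
    awk -v br="$bridge" '
        $1 == "iface" { current = $2 }              # remember which stanza we are in
        current == br && $1 ~ /^bridge[-_]ports$/ {
            for (i = 2; i <= NF; i++) print $i      # one port per line
        }
    '
}

# Usage on the host:
#   bridge_ports_of vmbr0 < /etc/network/interfaces
```

With the interfaces file posted elsewhere in this thread, where vmbr0 holds 192.168.1.14 and lists `bridge_ports enp69s0`, this would point to enp69s0 as the host port, which would leave eno1 as the candidate for passthrough.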

  17. 6 minutes ago, dtek said:

    Ok I can wait another month.  I'm having some issues after the passthru.   Every time I start both Windows10 and Catalina, one or both will stop.  Any idea what the problem might be?

    I figured out what the problem was. I had followed the other GPU passthrough guide and set a crontab entry to run on every reboot. That somehow conflicted with both machines; I removed it and all is good now. Thanks again.

  18. 33 minutes ago, meina222 said:

    Your best bet is to wait for the new Radeon 6000 and hope they fixed it there. There are some older AMD GPUs that reset but nothing newer than a few years at this point (except maybe a few RX 580 Sapphire Pulse models, but have only read anecdotes of that, not verified it and would not trust to buy based on that alone). New radeon 6000 will be out by end of Nov and we'll know very soon if it resets.

    Ok I can wait another month.  I'm having some issues after the passthru.   Every time I start both Windows10 and Catalina, one or both will stop.  Any idea what the problem might be?

  19. 2 hours ago, meina222 said:

    For now you can remove everything from the hook script except these (even though vmid is not used it may come handy later):

     

    The critical one is the frame buffer unbinding and then reboot is optional - you normally don't want to reboot but I do it as the GPU can't reset.

     

    #!/bin/bash
    vmid="$1"
    phase="$2"

    if [[ "$phase" == "pre-start" ]]; then
        echo 0 > /sys/class/vtconsole/vtcon0/bind
        echo 0 > /sys/class/vtconsole/vtcon1/bind
        echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/unbind
    elif [[ "$phase" == "post-stop" ]]; then
        echo "Post-stop VM $vmid"    

        reboot
    fi

    I'm so happy everything works ☺️. Thanks for that. I'm experiencing the reset bug now and it's quite annoying lol. Since Black Friday is near, I'll be on the hunt for a new GPU. Is there any decent GPU that you can recommend without the reset bug?

  20. 1 hour ago, meina222 said:

    Yes, you need the true rom file. I would not move the GPU yet. In fact the GPU being in slot 2 may be an advantage - try even w/out the rom file as I think shadowing the rom may only be required in slot 1.

     

    Also if the VM fails to start - check the host log Syslog as I described above. You can  do it from the web GUI in realtime.

    I finally got it working. I was sharing a monitor with an all-in-one PC with HDMI input and output, and that caused a lot of issues. I switched to an old monitor and almost everything works except for the hookscript. My host screen and console are blank right now; I can only connect with TeamViewer. I made some changes to the hookscript. Can you check to see if this will work?

     

    #!/bin/bash
    vmid="$1"
    phase="$2"

    if [[ "$phase" == "pre-start" ]]; then
        clear
        echo "Starting VM $vmid - please wait..."
        IFILE=/var/lib/vz/snippets/interfaces.$vmid
        if [[ -f "$IFILE" ]]; then
            cp /var/lib/vz/snippets/interfaces.$vmid /etc/network/interfaces
            systemctl restart networking
        fi
        echo 0 > /sys/class/vtconsole/vtcon0/bind
        echo 0 > /sys/class/vtconsole/vtcon1/bind
        echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/unbind
    elif [[ "$phase" == "post-start" ]]; then
        main_pid="$(< /run/qemu-server/$vmid.pid)"

       # cpuset="0-63"
        #cpuset="$(< /etc/pve/qemu-server/$vmid.cpuset)"

        #taskset --cpu-list  --all-tasks --pid "$cpuset" "$main_pid"
    #elif [[ "$phase" == "post-stop" ]]; then
     #   echo "Post-stop VM $vmid"
        reboot
        #shutdown -h now

        #sleep 5
        # Attempt rebind to EFI-Framebuffer
        #echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/bind
        # Attempt rebind to virtual consoles
        #echo 1 > /sys/class/vtconsole/vtcon0/bind
        #echo 1 > /sys/class/vtconsole/vtcon1/bind
        #sleep 5
        #shutdown -h now
    fi
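
One thing to watch in the script above: with the `elif ... post-stop` line commented out, the `reboot` sits inside the `post-start` branch, so the host would reboot right after the VM starts rather than after it stops. Below is a minimal sketch of the same phases using `case`, which makes it harder to end up in the wrong branch. It is not the original author's script: `DRY_RUN`, `run`, `write_sysfs` and `hook` are hypothetical names, and `DRY_RUN=1` only prints what would be done so the logic can be checked without touching the host.

```shell
#!/bin/bash
# Sketch of a phase dispatcher for a Proxmox hookscript.
# DRY_RUN is a hypothetical variable: when set to 1, actions are printed, not executed.

run() {                      # run COMMAND [ARGS...]
    if [[ "${DRY_RUN:-0}" == "1" ]]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

write_sysfs() {              # write_sysfs VALUE PATH
    if [[ "${DRY_RUN:-0}" == "1" ]]; then
        printf 'echo %s > %s\n' "$1" "$2"
    else
        echo "$1" > "$2"
    fi
}

hook() {
    local vmid="$1" phase="$2"
    case "$phase" in
        pre-start)
            # Release the host console and EFI framebuffer before the VM takes the GPU
            write_sysfs 0 /sys/class/vtconsole/vtcon0/bind
            write_sysfs 0 /sys/class/vtconsole/vtcon1/bind
            write_sysfs efi-framebuffer.0 /sys/bus/platform/drivers/efi-framebuffer/unbind
            ;;
        post-stop)
            # Optional: reboot the host only if the GPU cannot reset on its own
            echo "Post-stop VM $vmid"
            run reboot
            ;;
    esac
}

# Proxmox calls the hookscript with <vmid> <phase>; do nothing without both.
if [[ $# -ge 2 ]]; then
    hook "$1" "$2"
fi
```

Running it by hand as `DRY_RUN=1 ./vmhook.sh 101 post-start` should print nothing, which confirms that no action is tied to that phase.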
     

  21. 2 minutes ago, meina222 said:

    Basically when you're all done and reboot you need to be able to succeed from your primary display to do this

     

    echo 0 > /sys/class/vtconsole/vtcon0/bind && echo 0 > /sys/class/vtconsole/vtcon1/bind && echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/unbind

     

    This will block your main display but then you can use the web console to start the VM and the card should switch over to the VM bios after 20-30 seconds and you should see the VM bringing your display back alive.

    I updated GRUB, blacklisted amdgpu, and ran pve-efiboot-tool refresh.

    I think the only things left to try are swapping out the ROM file and moving the GPU to slot 1.

  22. 2 minutes ago, meina222 said:

    Here's my /etc/modprobe.d content

     

    -rw-r--r--  1 root root  205 Jul 18 17:54 blacklist.conf
    -rw-r--r--  1 root root   26 Jun 21 22:54 kvm.conf
    -rw-r--r--  1 root root  171 May 10 15:06 pve-blacklist.conf
    -rw-r--r--  1 root root  148 Jul  5 01:15 vfio.conf

     

    less blacklist.conf

     

    blacklist radeon
    #blacklist nouveau
    #blacklist nvidia
    blacklist amdgpu
    blacklist snd_hda_codec_hdmi
    blacklist snd_hda_codec
    blacklist snd_hda_core
    blacklist snd_hda_intel
    blacklist iwlwifi
    blacklist btusb

     

    less vfio.conf

     

    options vfio-pci ids=1002:731f,1002:ab38 disable_vga=1
    options vfio-pci ids=1022:148c
    options vfio-pci ids=1022:149c
    options vfio-pci ids=8086:1533

     

    For vfio.conf, follow this guide as your id's will be different:

     

    https://pve.proxmox.com/wiki/PCI(e)_Passthrough

    My theory is that you don't disable the host taking over your GPU and cannot unbind the efi framebuffer as a result.

    root@dtk:/etc/modprobe.d# lspci | grep VGA
    05:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 10 (rev c1)
     

    root@dtk:/etc/modprobe.d# cat blacklist.conf
    blacklist radeon
    blacklist nouveau
    blacklist nvidia
     

    root@dtk:/etc/modprobe.d# cat kvm.conf
    options kvm ignore_msrs=1
     

    root@dtk:/etc/modprobe.d# cat vfio.conf
    options vfio-pci ids=1002:731f,1002:ab38 disable_vga=1

     

    root@dtk:/etc/modprobe.d# lspci -n -s 05:00
    05:00.0 0300: 1002:731f (rev c1)
    05:00.1 0403: 1002:ab38
     

     

    Am I missing anything?
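
Comparing the two listings, the blacklist.conf posted here has no `blacklist amdgpu` line, while the quoted one does. A Navi 10 card is normally claimed by the amdgpu driver, so that looks like the most likely gap. The vfio.conf ids do match the `lspci -n` output. Both checks can be sketched as below. The helper names are hypothetical, and the blacklist check is a plain text match that does not account for modprobe treating `-` and `_` as equivalent.

```shell
#!/bin/bash
# Two small checks for the files shown above. Both helper names are hypothetical.

# Succeeds if MODULE has an active (uncommented) blacklist line on stdin.
is_blacklisted() {
    grep -Eq "^[[:space:]]*blacklist[[:space:]]+$1[[:space:]]*$"
}

# Turn `lspci -n -s <slot>` output into a comma-separated vendor:device list
# suitable for the "ids=" option of vfio-pci.
vfio_ids() {
    awk '{ print $3 }' | paste -sd, -
}

# Usage on the host:
#   is_blacklisted amdgpu < /etc/modprobe.d/blacklist.conf || echo "amdgpu is not blacklisted"
#   lspci -n -s 05:00 | vfio_ids
```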

     
