
fabiosun


For everyone else recently posting - bear in mind that the bare metal is still not 100% stable. My installation fell into an inexplicable state where the initial EFI (which I used as a rescue reference) and the on-disk EFI became severely unstable - random reboots or even failure to load. Something in the OS main disk state changed. Haven't had time to debug - I think I will do a clean reinstall, but I would not trust the bare metal for a production environment (even though so far I am the only one with this issue). On the other hand Proxmox is 100% solid.

 

Also, I am yet to see any performance benefit other than a 3% edge in Geekbench multicore, but 5% lower single core for that matter (I don't trust Geekbench as the best app to measure with). I just ran Cinebench on my Proxmox - it matched bare metal with a score of 21468. Anecdotally, Proxmox is also more stable with apps. If you do a TRX40 hack, I would definitely keep Proxmox as a production base and bare metal as experimental until the community (and @iGPU as an active member) figure out how to make it a bit less cumbersome and maybe stable.


My build is an ASRock TRX40 Creator with an MSI Vega 64. I have had no problems with random reboots. I tried installing a program called Camo Studio that would allow me to use my iPhone camera on my Hack. It did not work. In fact it would not even launch. I have not had any issues with other programs. iMessage works for me. It did not work with Proxmox.

I don't use any graphics programs, so I can't say if they work. All in all, I am pretty happy with bare metal. I will probably try Big Sur this weekend.


5 hours ago, Ploddles said:

@Jaidy I've just checked the USB on my key ring and I have a copy of my EFI on that - I think it is the latest but if not I believe it does work on our MBs.

 

Let me know if you try it and it doesn't but here it is anyway. As I said, it still needs optimising and some kexts etc removing.

 

 

EFI.zip 6.93 MB · 4 downloads

Is audio (the built-in one) working for you?


@iGPU - I continue to have reboot issues even from the 'safe' EFI I used initially. I am completely baffled. My only explanation is that the board NVRAM is corrupt. Does the clear CMOS button clear NVRAM too or do I need to short the pins on the board?

 

"Safe EFI" - reaches the point where OC hands off to macOS. I get a black screen, but instead of the login screen or Apple logo I just get a reboot at this point.

"Last EFI" - I can log in to macOS or start Recovery, but within 2 minutes I get a reboot.

 

Next I will try your GitHub EFI.

Edited by meina222

  • Supervisor

There is no safe EFI, because this is unexplored territory:

patches, boot loader and so on.

We have to find the right combination of them to get, maybe, a stable system.

If you analyze your boot log, I think you will now find errors in it such as an unsynchronized CPU TSC or similar, so maybe additional kexts are needed.

In my case, running a CPU benchmark with CB 15 or 20 can cause an instant system reset.

Some apps, such as Adobe or DaVinci, need Perl command patches instead, but that is a feature of AMD rigs for now.
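
For anyone who wants to check a boot log for the TSC errors mentioned above, here is a minimal sketch. It assumes the log has already been saved to a plain text file. The function name `find_tsc_errors` is made up for illustration, and the exact wording of the kernel message is an assumption, so the search is deliberately loose (any line containing the word "tsc", in any case).

```shell
#!/bin/sh
# Hypothetical helper: print the lines of a saved boot log that mention the TSC.
# The exact kernel wording is an assumption, so match the whole word "tsc"
# case-insensitively and show line numbers.
find_tsc_errors() {
    log_file="$1"
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 2
    fi
    # -i: ignore case, -n: prefix line numbers, -w: whole-word match
    grep -i -n -w 'tsc' "$log_file"
}
```

Usage would be something like `find_tsc_errors ~/Desktop/boot.log` (the path is only an example). The function returns a non-zero status when nothing matches, so it can be used in an `if`.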


By "safe" I mean the one I installed initially and ran without reboots for a good part of the week. I put it in quotation marks as I know it's not really safe. Will try iGPU's now. If that doesn't work I am at a loss on how to debug, as this seems to be past the OC handoff.

Edited by meina222

@iGPU - tried the EFI you just checked in GitHub. Stuck on [ PCI configuration begin ].

 

 

IMG_5789.jpg

Edit: Managed to boot after tweaking the boot args (no npci needed for me for some reason) and stripping various kexts

Edited by meina222

I used to have a working EFI and after a few days started experiencing instability where I could not even boot from it anymore. No idea what changed. Has to be BIOS related.

 

So I booted and got into the setup screen, but now I have no mouse or keyboard. Let me see what's going on.


@fabiosun - I spent a good majority of yesterday investigating software compatibility, including compiling several libraries such as Tensorflow (if you want to really benchmark your CPU - try compiling Tensorflow. It'll run your CPU at 100% for a good 20-30 mins. Peaked at around 178F for me.) 

 

I think I may have come close to having core AMD compatibility with macOS for applications.

 

Most macOS applications (from what I researched) use Intel's MKL library for CPU acceleration for graphics. This includes Adobe suite software. Previously, MKL was an Intel-only feature, but because of some trouble/pressure, they've since open-sourced it. Now it's called oneAPI. It can be compiled for the amd64 CPU architecture, which I was able to do, and I installed it successfully on my bare metal Catalina. (Compiled folder attached via link - it's over 50 MB.)

 

Some more information related to MKL and AMD/Intel CPU architectures by Puget Systems - MKL was basically a feature that intentionally crippled AMD CPUs that most developers just ended up using.

 

It's likely safe to run

 

Big Sur/Catalina

echo 'export MKL_DEBUG_CPU_TYPE=5' >> ~/.zshrc

Mojave/Previous

echo 'export MKL_DEBUG_CPU_TYPE=5' >> ~/.profile
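
Running the append command more than once leaves duplicate lines in the startup file, and appending to a file that does not end in a newline glues the export onto the previous line. A minimal sketch of an append that avoids both follows. The helper name `append_once` is made up for illustration and is not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: append a line to a shell startup file only if that
# exact line is not already present.
append_once() {
    line="$1"
    rc_file="$2"
    touch "$rc_file"
    # -F: fixed string, -x: match the whole line, -q: quiet
    if ! grep -Fxq -- "$line" "$rc_file"; then
        # If the file is non-empty and does not end in a newline, add one first
        # so the new line is not joined to the previous one.
        if [ -s "$rc_file" ] && [ -n "$(tail -c 1 "$rc_file")" ]; then
            printf '\n' >> "$rc_file"
        fi
        printf '%s\n' "$line" >> "$rc_file"
    fi
}
```

For example: `append_once 'export MKL_DEBUG_CPU_TYPE=5' ~/.zshrc`, then open a new terminal and run `echo $MKL_DEBUG_CPU_TYPE` to confirm the variable is set.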

 

Now, what I'm currently unsure of, as I haven't yet tested it is:

a) whether or not this new MKL library actually replaces the existing MKL

b) whether the MKL libraries are universally linked to the host or have existing MKL libraries within their own app

c) whether this new MKL is compatible with macOS apps

d) do we need the other MKL libraries as well?
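
Question (b) can be partly checked by looking inside an app bundle for its own copy of the library. This is only a rough sketch: the helper name `find_bundled_mkl` is made up, and it assumes a bundled MKL keeps "mkl" somewhere in its file name, which may not hold for every app.

```shell
#!/bin/sh
# Hypothetical helper: list files inside an app bundle whose names contain "mkl".
# Assumption: a bundled copy of MKL keeps "mkl" in its file name.
find_bundled_mkl() {
    app_dir="$1"
    # -type f: regular files only, -iname: case-insensitive name match
    find "$app_dir" -type f -iname '*mkl*' 2>/dev/null | sort
}
```

For example: `find_bundled_mkl "/Applications/Some App.app"` (the path is a placeholder). An empty result would suggest the app does not ship its own copy under that name. On macOS, `otool -L` on the app's binary lists the libraries it links against, which is another way to look.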

 

Installation Logs




Install the project...
-- Install configuration: "Release"
-- Installing: /usr/local/lib/libdnnl.1.6.dylib
-- Installing: /usr/local/lib/libdnnl.1.dylib
-- Installing: /usr/local/lib/libdnnl.dylib
-- Installing: /usr/local/include/dnnl_config.h
-- Installing: /usr/local/include/dnnl_version.h
-- Installing: /usr/local/include/dnnl.h
-- Installing: /usr/local/include/dnnl.hpp
-- Installing: /usr/local/include/dnnl_debug.h
-- Installing: /usr/local/include/dnnl_threadpool_iface.hpp
-- Installing: /usr/local/include/dnnl_types.h
-- Installing: /usr/local/include/mkldnn.h
-- Installing: /usr/local/include/mkldnn.hpp
-- Installing: /usr/local/include/mkldnn_config.h
-- Installing: /usr/local/include/mkldnn_debug.h
-- Installing: /usr/local/include/mkldnn_dnnl_mangling.h
-- Installing: /usr/local/include/mkldnn_types.h
-- Installing: /usr/local/include/mkldnn_version.h
-- Installing: /usr/local/lib/cmake/dnnl/dnnl-config.cmake
-- Installing: /usr/local/lib/cmake/dnnl/dnnl-config-version.cmake
-- Installing: /usr/local/lib/cmake/dnnl/dnnl-targets.cmake
-- Installing: /usr/local/lib/cmake/dnnl/dnnl-targets-release.cmake
-- Installing: /usr/local/lib/libmkldnn.dylib
-- Installing: /usr/local/lib/libmkldnn.1.dylib
-- Installing: /usr/local/lib/libmkldnn.1.6.dylib
-- Installing: /usr/local/share/doc/dnnl/LICENSE
-- Installing: /usr/local/share/doc/dnnl/THIRD-PARTY-PROGRAMS
-- Installing: /usr/local/share/doc/dnnl/README


 

 

Download Link to precompiled MKL-Dnn 

 


  • Supervisor
22 minutes ago, meina222 said:

I used to have a working EFI and after a few days started experiencing instability where I could not even boot from it anymore. No idea what changed. Has to be BIOS related.

 

So I booted and got into the setup screen, but now I have no mouse or keyboard. Let me see what's going on.

This is not good, IMHO.

Maybe it is related to the shutdown problem we have; on chipsets older than ours it could reset the BIOS and produce weird corruption problems.

In my case, I noticed my reboots happen only when I use Cinebench.

 

I think bare metal is worth trying, but for now it is not advisable for a production environment.

Proxmox is safer,

but on bare metal we have every chance to use any device we have without bridge problems.


  • Supervisor

@tsongz I will try as you said.

Thank you.

For now the problem is stability. The benchmark results are certainly pretty good in the many tests I did.

 

I am now trying other kernel patches and a TSC sync kext together with dummy quirks in OC.

Maybe I will also bring back the NullCPUPowerManagement kext.


Yeah, I am worried about BIOS corruption. So I started from scratch. Managed to get into the installer. My mouse and keyboard froze a few times and I had to hard reboot, but it's finally going. Ended up going all the way back to the barebones EFI of iGPU from page 2 or 3 of this thread.

 

Edit: the installer froze. I guess my only option is to clear CMOS with the jumper pins and retry.

Edited by meina222

@meina222 your motherboard should have a Clear CMOS button on the back. It's basically my go-to whenever I have to do a hard reboot or anything goes wrong with NVRAM.

 

image.png.3ddf3c00cfd82c03442851e28f594f24.png

 

Don't know if you do this already, but check the POST monitor [the little 2-digit display] on your motherboard when you boot up. More often than not, if macOS caused some issue, POST will either fail to show at all at boot, or sit at 99 (at least for me). Clearing CMOS generally has done the trick.

 

@fabiosun - I'm in agreement with you. My benchmark scores are amazing, even beating out the more modern Pro Vegas with my Vega Frontier - but the compatibility and the 'it just works' nature of Apple products is anything but. AMD =/= Intel CPUs is definitely a big hurdle in software development, since all macOS libraries/kernels are compiled only for Intel. On top of that, ROCm, AMD's CUDA equivalent, has no intention of porting compatibility to Unix/macOS, which makes the Vega FE almost useless for Machine Learning in macOS. (There's a drop in the bucket of support for the Metal API in ML - none of which is usable in my current workflow.)

 

And the deciding factor for me against bare metal going forward: even if 80% of things work well, the macOS hypervisor API lacks modern virtualization support (due to the lack of Intel's VT-d), so I can't take advantage of the hardware (the GPU specifically) outside of macOS, as there's not really any PCIe passthrough support.

 

What I am planning on doing now for production is going back to Proxmox, and splitting CPU/RAM allocation between an Ubuntu VM and a macOS VM (Intel Penryn CPU), and adding a second GPU so I can use both concurrently. It sort of sucks to not have full hardware access all in one OS, but that's the risk we take. I am still planning on figuring out these nuances for bare metal as time permits, but we're trying to break a concrete wall with a spoon, as we have little documentation and few historical references.


I have no idea what happened to my hardware and this setup, but I can't install this. I can get into the installer with a very barebones config.plist, and a minute into it everything freezes. I wonder if my video card is acting up. I am out for now. No idea how to debug this. I tried several EFIs. The only thing I haven't tried is a BIOS reflash. I either get a reboot from OC or I get in but then get a freeze. This is very bizarre as I had no problems whatsoever last week.


  • Supervisor

BIOS corruption could be your problem, @meina222.

Try to reflash your BIOS.

I think it could solve your current freeze.

The main task should be to find exactly what prevents us from shutting down,

or, in my case, to understand why my rig now has GPU acceleration with the same EFI as before.

NVRAM could be related to these quirks,

but for now we have to study this with the classic process of trial and error.


  • Moderators
2 hours ago, meina222 said:

@iGPU - tried the EFI you just checked in GitHub. Stuck on [ PCI configuration begin ].

 

 

IMG_5789.jpg

Edit: Managed to boot after tweaking the boot args (no npci needed for me for some reason) and stripping various kexts

 

 

This weekend, when working up a config.plist file to boot into Big Sur, I would see this if WEG was active. It is okay to leave WEG active, but add -wegbeta to the boot args.

 

I've stressed adding npci=0x2000 several times on this thread, whether or not "Above 4G decoding" is enabled.
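
As a small illustration of adding `npci=0x2000` or `-wegbeta` to an existing boot-args string without duplicating an entry, here is a minimal sketch. The helper name `add_boot_arg` is made up, and it only builds the string; it does not write anything to NVRAM or to config.plist.

```shell
#!/bin/sh
# Hypothetical helper: print a boot-args string with one extra argument added,
# unless that argument is already present as a separate word.
add_boot_arg() {
    current="$1"
    new_arg="$2"
    case " $current " in
        *" $new_arg "*)
            # Already present: leave the string unchanged.
            printf '%s\n' "$current"
            ;;
        *)
            if [ -z "$current" ]; then
                printf '%s\n' "$new_arg"
            else
                printf '%s\n' "$current $new_arg"
            fi
            ;;
    esac
}
```

For example, `add_boot_arg '-v keepsyms=1' 'npci=0x2000'` prints `-v keepsyms=1 npci=0x2000`. With OpenCore, the resulting string would normally go in the `boot-args` entry under `NVRAM > Add > 7C436110-AB2A-4BBB-A880-FE41995C9F82` in config.plist.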

 

The kexts I've loaded with the EFI on GitHub should cause no problems.

 

Overall instability and random reboots will occur if Energy Saver is not disabled. I posted this earlier in this thread and on the GitHub site, Section 7.


  • Moderators

fabiosun,

 

On bare metal, the CPU scores are the same as under Proxmox with Geekbench or Cinebench 20.

 

With Luxmark, the scores for Radeon VII are the same under bare metal and Proxmox (and only Luxmark uses both GPUs).

 

Only with Cinebench 15 do I see a decrease of 50%. So maybe a problem with Cinebench 15?

 

1807005489_ScreenShot2020-08-20at12_15_32AM.png.77c40d3083c4281350e07986f5bb2505.png


  • Supervisor

iGPU, that is the same conclusion I reached before.

Some OpenGL (old stuff) works worse.

 

About stability, I have to proceed in a more systematic way.

A CPU benchmark with Cinebench has a high probability of rebooting my system for now.

 

This morning the Aquantia Ethernet also does not connect well, but yesterday I did many tests, so I can't be more specific.

It also seems I can't load a simple SSDT on my rig.

 

I am locked to High Sierra (my main interest is in it), so I may have additional problems.

I tested the NullCPUPowerManagement kext, the dummy quirk and different sets of kernel patches, but this instability is not solved.

 

As a side note, with AMD Power Gadget I see a power draw of about 80 W at idle and about 370 W under full Cinebench 20 stress.

Idle seems a bit high.

 

I also tested in different PBO states.

The result is the same.

 


  • Moderators
1 hour ago, fabiosun said:

OK, I figured out my problem.

Same EFI, same BIOS settings.

Catalina is working perfectly, also with my patched Nvidia web driver.

 

So I have to try a different way (maybe) in High Sierra

 

 

 

Schermata 2020-08-20 alle 14.02.28.png

Schermata 2020-08-20 alle 13.57.50.png

Schermata 2020-08-20 alle 13.58.47.png

480413476_Schermata2020-08-20alle14_03_26.png.95a2cf0edb4020b31e358469b8aab4e4.png

 

I'm happy to hear that you solved your problem. What exactly did you do to fix it?


  • Moderators
2 minutes ago, fabiosun said:

installed Catalina

but it is not what I need

so reverting to a clean copy of HS

 

as side note

I can't do a direct install of a new OS in bare metal

I have to use Proxmox

have you tried?

 

 

No, I've not tried. I have performed an update for Catalina and that went well. I'll attempt an install later on a spare drive.


  • fabiosun changed the title to [Discussion] - TRX40 Bare Metal - Vanilla Patches
