sntx@lemm.ee to Selfhosted@lemmy.world • Guide to Self Hosting LLMs Faster/Better than Ollama
Is there an inherent benefit to using NVLink? Should I specifically try out Aphrodite over the other recommendations, given that I have 2x 3090s with NVLink available?
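(For anyone checking their own setup: a minimal sketch of how I verified the two cards see each other for peer-to-peer access, assuming PyTorch is installed. Peer access alone doesn't prove NVLink is in use; `nvidia-smi nvlink --status` is the definitive check.)

```python
import torch

# Peer access lets one GPU read/write the other's memory directly,
# which is what NVLink accelerates for multi-GPU inference.
if torch.cuda.device_count() >= 2:
    p2p = torch.cuda.can_device_access_peer(0, 1)
    print(f"GPU0 <-> GPU1 peer access: {p2p}")
    # Caveat: peer access can also be reported over plain PCIe P2P,
    # so True here doesn't by itself confirm the NVLink bridge.
else:
    print("Fewer than two CUDA devices visible.")
```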
Thanks for the writeup! So far I've been using Ollama, but I'm always open to trying out alternatives. To be honest, I was oblivious to the existence of alternatives until now.
Your post suggests that the same models with the same parameters generate different results when run on different backends?
I can see how the backend would influence the handling of concurrent API calls, RAM/VRAM efficiency, supported hardware/drivers, and general speed.
But going as far as different context windows and quality-degradation issues is news to me.
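In case it helps anyone comparing backends, here's a rough sketch of how I'd probe concurrent-request handling. It assumes an OpenAI-compatible /v1/completions endpoint (which Ollama, vLLM, and Aphrodite can all expose); the URL and model name are placeholders for your own setup.

```python
import time
from concurrent.futures import ThreadPoolExecutor

import requests

BASE_URL = "http://localhost:8000/v1/completions"  # placeholder endpoint
MODEL = "my-model"  # placeholder model name

def one_request(i: int) -> float:
    """Send one short completion request and return its latency in seconds."""
    start = time.perf_counter()
    resp = requests.post(
        BASE_URL,
        json={"model": MODEL, "prompt": f"Count to ten. ({i})", "max_tokens": 32},
        timeout=120,
    )
    resp.raise_for_status()
    return time.perf_counter() - start

if __name__ == "__main__":
    # Fire 16 requests at once: a backend with continuous batching should
    # keep per-request latency roughly flat, a serial one will queue them.
    with ThreadPoolExecutor(max_workers=16) as pool:
        latencies = list(pool.map(one_request, range(16)))
    print(f"min {min(latencies):.2f}s  max {max(latencies):.2f}s  "
          f"mean {sum(latencies) / len(latencies):.2f}s")
```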