@theunknownmuncher

theunknownmuncher@lemmy.world · edit-2 14 hours ago

Ironic, the person condemning America automatically assumes every article is about America… how American of you.

Edit: they’ve edited their comment to swap out “America” for “1st world”, “leading mordern nations”, and “the whole modern world” now. Lol I feel like that equivalency is even more ironic

theunknownmuncher@lemmy.world · 10 days ago

Staging a false flag attack on our own troops in order to manufacture consent for a war with the country holding the world’s largest oil reserves?

theunknownmuncher@lemmy.world · 13 days ago

Yep, this is exactly what I do.

theunknownmuncher@lemmy.world · 1 month ago

People with this big brain take always crack me up… they think they’re shitting on her when they’re actually just admitting how successful she is, because her objective is literally to get publicity and attention for her cause.

theunknownmuncher@lemmy.world · 3 months ago

Can confirm

theunknownmuncher@lemmy.world · 4 months ago

Regardless of who would do better, the entirety of SpaceX and Starlink was paid for using American taxpayers’ money and government contracts. We already paid for it, we should own it.

theunknownmuncher@lemmy.world · edit-2 4 months ago

The funny part is the leak could only have come from inside his administration and party because he withheld briefings from his political opponents

theunknownmuncher@lemmy.world · 5 months ago

If it can power up and decrypt the docker volumes on its own without prompting you for a password in your basement, it will also power up and decrypt the docker volumes on its own without prompting the robbers for a password in their basement

theunknownmuncher@lemmy.world · edit-2 5 months ago

~~Pakistan has made guarantees that they will retaliate with their nuclear weapons if a nuclear strike occurs on Iran.~~

EDIT: I went to find a source and Pakistan has walked this back and is denying that they ever said it

theunknownmuncher@lemmy.world · 5 months ago

This behavior is associated with autism

theunknownmuncher@lemmy.world · edit-2 6 months ago

In most somewhat-but-not-completely sane places, the part of piracy that is illegal/criminal is distributing the copyrighted material, so downloading it to feed an LLM would already not really be an issue. The “problem” here is that most of these LLMs are used for commercial purposes, which is not allowed, and some claim LLMs are a form of distributing the copyrighted material.

If your LLM is only for personal use, then yeah that’s probably totally fine already, unless you live in one of the completely not sane places where both receiving and distributing copyrighted material are illegal/criminal.

theunknownmuncher@lemmy.world · 6 months ago

Returning the shopping cart

theunknownmuncher@lemmy.world · 7 months ago

Dunno why this is downvoted because RAG is the correct answer. Fine tuning/training is not the tool for this job. RAG is.

theunknownmuncher@lemmy.world · 9 months ago

If it bothers you, you can overwrite the model by reusing the same name instead of creating one with a new name. Either way, there is no duplication of the LLM model file.

theunknownmuncher@lemmy.world · 9 months ago

What I am talking about is when layers are split across GPUs. I guess this is loading the full model into each GPU to parallelize layers and do batching

theunknownmuncher@lemmy.world · 9 months ago

Agreed.

theunknownmuncher@lemmy.world · 9 months ago

> Because they were banned.

I didn’t make this claim, no

theunknownmuncher@lemmy.world · 9 months ago

Because they would just be inactive rather than banned? You’re the one claiming that bans have occurred, which would be a major censorship issue for lemmy…

theunknownmuncher@lemmy.world · edit-2 9 months ago

Can you try setting the num_ctx and num_predict using a Modelfile with ollama? https://github.com/ollama/ollama/blob/main/docs/modelfile.md#parameter
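
For reference, a minimal sketch of what such a Modelfile could look like, generated from Python. The base model tag `llama3`, the name `my-model`, and the values 8192 and 512 are placeholders I picked for illustration, not anything from this thread; `render_modelfile` is just a hypothetical helper.

```python
def render_modelfile(base_model: str, params: dict) -> str:
    """Build Modelfile text: one FROM line, then one PARAMETER line per setting."""
    lines = [f"FROM {base_model}"]
    for name, value in params.items():
        lines.append(f"PARAMETER {name} {value}")
    return "\n".join(lines) + "\n"


# Placeholder values (assumptions): pick ones that suit your model and hardware.
modelfile_text = render_modelfile(
    "llama3",
    {"num_ctx": 8192, "num_predict": 512},
)

print(modelfile_text)
```

Save the printed text as `Modelfile`, then something like `ollama create my-model -f Modelfile` should build a model with those parameters baked in.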

theunknownmuncher@lemmy.world · edit-2 9 months ago

Are you using a tiny model (1.5B-7B parameters)? ollama pulls a 4-bit quant by default. It looks like vllm does not use quantized models by default, so this is likely the difference. Tiny models are impacted more by quantization

I have no problems with changing num_ctx or num_predict