

Complex social hierarchy is a super important aspect to account for too. In the proprietary software realm, you infer confidence from the accumulated-wealth hierarchy. In FOSS the hierarchy is not wealth but reputation, like in academia or the film industry. If some company in Oman makes a really great proprietary app, are you going to build your European startup on top of it? Likewise, if in FOSS someone with no reputation makes some killer app, the first question to ask is whether it is going to anchor or support a stellar reputation. Maybe they are just showing off skills to land a job. If so, they are just like startups that only want to get bought up quickly by some bigger fish. We are all conditioned to think of hoarded wealth as the only form of hierarchy, but that is primitive. If all the wealth were gone, humans would still be fundamentally complex social animals and would always establish a complex hierarchy. This is one of the spaces where it is different.
llama.cpp is at the core of almost all offline, open-weights model tooling. The server it creates is OpenAI API compatible. Oobabooga Textgen WebUI is more GUI oriented but is built on llama.cpp. Oobabooga has the setup for loading models with a split workload between the CPU and GPU, which makes larger GGUF-quantized models possible to run; the feature itself comes from llama.cpp, Oobabooga just exposes it. The model loading settings and softmax sampling settings take some trial and error to dial in well. It helps to have a way of monitoring GPU memory usage in real time; I use a script that appends GPU memory usage to my terminal window title bar until inference time.
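Since the llama.cpp server speaks the OpenAI chat-completions format, any generic HTTP client works against it. Here is a minimal sketch in Python; the port (8080 is llama-server's default) and the sampling values are assumptions you would tune, exactly like dialing in the softmax settings mentioned above.

```python
import json
import urllib.request

def build_chat_payload(messages, temperature=0.8, top_p=0.95, max_tokens=256):
    """Build an OpenAI-style chat-completions request body.

    temperature and top_p are the same softmax sampling knobs you
    otherwise dial in through a GUI like Oobabooga.
    """
    return {
        "messages": messages,
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }

def post_chat(payload, url="http://127.0.0.1:8080/v1/chat/completions"):
    """POST the payload to a locally running llama-server instance."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Build (but do not send) a request; sending requires a running server.
payload = build_chat_payload([{"role": "user", "content": "Hello!"}])
```

Point `post_chat` at whatever host and port you launched llama-server on; the payload shape stays the same.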
Ollama is another common project people use for offline open-weights models, and it also runs on top of llama.cpp. It is a lot easier to get started with in some instances, and several projects use Ollama as a baseline for “Hello World!” type stuff. It has pretty good model loading and softmax defaults without any fuss, but at the expense of only running on GPU or CPU, never both in a split workload. That may seem fine at first, but if you never experience running much larger quantized models in the 30B–140B range, you are unlikely to have success or a positive experience overall. The much smaller models in the 4B–14B range are all that are likely to run fast enough on your hardware AND completely load into your GPU memory if you only have 8GB–24GB. Most of the newer models are actually Mixture of Experts architectures. That means it is like loading ~7 models initially, but then only inferencing two of them at any one time. All you need is the system memory, or the DeepSpeed package (which uses the disk drive for the excess space required), to load these larger models. Larger quantized models are much, much smarter and more capable. You also need llama.cpp if you want to use function calling for agentic behaviors. Look into the agentic API and pull history in this area of llama.cpp before selecting which models to test in depth.
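The Mixture of Experts point above can be illustrated with a toy router: every expert is resident in memory, but only the top-k (commonly two) actually run for a given token. The expert count and gate scores here are made-up numbers purely for illustration.

```python
# Toy illustration of MoE routing: all experts must be loaded,
# but only the top-k highest-gated ones do work per token.

def route_top_k(gate_scores, k=2):
    """Return indices of the k experts with the highest gate score."""
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i],
                    reverse=True)
    return ranked[:k]

# Hypothetical gate output for one token over 7 experts.
scores = [0.05, 0.30, 0.02, 0.25, 0.10, 0.20, 0.08]
active = route_top_k(scores)  # only these two experts run this token
print(active)  # → [1, 3]
```

This is why system RAM (or disk offload) for the full model plus fast compute for just the active experts is a workable combination for these architectures.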
Huggingface is the go-to website for sharing and sourcing models. It is heavily integrated with GitHub, so it is probably just as toxic long term, but I do not know of a real FOSS alternative for that one. Hosting models is massive I/O for a server.


K&R?
Graduated pacman emerges… and we all know emerge is Gentoo. This one doesn’t compile.


PipePipe is better than NewPipe. I use F-Droid’s VLC front end for local music because the built-in Android back end is VLC. For everything else, I use the browser.

what HP printers really do




Need a put-option stock market tutorial, please.




It can be a useful tool, especially for someone who experiences involuntary social isolation (like me).
You would have to be a pretty dumb person for this to totally replace human relations in terms of fundamental interactive social needs with other humans. It can be a healthy way to fill a gap.
First, the context length is very limited, so you can’t have a very long and interactive conversation; the scope of model attention is rather short even with a very-long-context model. Second, the first few tokens of any interaction are extremely influential on how the model will respond, regardless of everything else that happens in the conversation. So cold conversations (forced by the short context) will be inconsistent.
Unless a person is an extremely intuitive, high-Machiavellian thinker with good perceptive-thinking skills, the user is going to be very frustrated with models at times, and the model may be directly harmful to the person in some situations. There are aspects of alignment that could be harmful under certain circumstances.
There will likely be a time in the near future when a real AI partner is more feasible, but it will not be some base model, a fine tune, or some magical system prompt that enables this application.
To create a real partner-like experience, one will need an agentic framework combined with retrieval-augmented database lookups. That would make it possible for a model to have persistence: it can ask how your day went while knowing your profile, relationship, preferences, and what you already told it about how your day should have gone. You need a model that can classify information, then save, modify, and retrieve that information when it is needed. I’ve played around with this in Emacs, Org mode, and gptel connected to local models with llama.cpp. I’m actually modifying my hardware to handle the loads better for this application right now.
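The classify-save-retrieve loop described above can be sketched very simply. This is a toy persistence layer, not the actual Emacs/gptel setup; the class, field names, and file path are all hypothetical, and a real system would sit behind an agentic framework talking to a local model.

```python
import json
import tempfile
from pathlib import Path

class MemoryStore:
    """Toy persistent memory: classified facts saved to a JSON file."""

    def __init__(self, path):
        self.path = Path(path)
        self.facts = (json.loads(self.path.read_text())
                      if self.path.exists() else [])

    def save(self, category, text):
        """Store one classified fact, e.g. ('preferences', 'likes Org mode')."""
        self.facts.append({"category": category, "text": text})
        self.path.write_text(json.dumps(self.facts))

    def retrieve(self, category):
        """Pull back everything filed under a category for the next prompt."""
        return [f["text"] for f in self.facts if f["category"] == category]

# Demo with a throwaway temp file so reruns start clean.
store = MemoryStore(Path(tempfile.mkdtemp()) / "partner_memory.json")
store.save("profile", "works on hardware mods")
store.save("day", "planned to test GPU memory monitoring")
context = store.retrieve("day")
```

The retrieved facts would then be injected into the prompt so the model can ask informed follow-up questions instead of starting cold every conversation.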
Still, I think such a system is a stopgap for people like myself, the elderly, and other edge cases where external human contact is limited. For me, my alternative is here, and while some people on Lemmy know me and are nice, many are stupid kids who exhibit toxic, negative behaviors that are far more harmful than anything I have seen out of any AI model. I often engage here on Lemmy first, then chat with an AI if I need to talk, vent, or work through something.
Am I visiting (going to) Yosemite, or driving around it?
Did you visit the Grand Canyon if you did not go to the bottom but only stood on the rim?