• 0 Posts
  • 12 Comments
Joined 3 years ago
Cake day: June 24th, 2023


  • To be similarly pedantic: Ctrl+C is a hotkey that sends the corresponding ASCII code / codepoint to signal something; it is not an ASCII code itself.

    You could have the same character be sent by Ctrl+Q (if you were to remap it) without breaking compatibility with other processes: the codepoint being sent would be the same. From a technological perspective there is nothing special about the key combination Ctrl+C specifically, but altering this behavior in a terminal would absolutely wreak havoc on the muscle memory of terminal users, and altering its behavior in a text editor would do the same to everyone else's.
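    As a concrete illustration: the terminal maps Ctrl+<letter> to an ASCII control code by clearing the upper bits of the letter's codepoint, so Ctrl+C produces byte 0x03 (ETX), which the tty driver then turns into a SIGINT. A minimal sketch:

```python
# Terminals encode Ctrl+<letter> by masking the letter's ASCII code
# down to its low five bits: Ctrl+C -> 0x03 (ETX), Ctrl+Q -> 0x11 (DC1/XON).
def ctrl(key: str) -> int:
    """Return the control code a terminal sends for Ctrl+<key>."""
    return ord(key.upper()) & 0x1F

print(hex(ctrl("c")))  # 0x3  -> ETX; the tty driver maps this byte to SIGINT
print(hex(ctrl("q")))  # 0x11 -> DC1 (XON)
```

    Which byte a chord produces and what the receiving process does with that byte are two independent layers, which is why remapping the key need not break other programs.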


  • The key issue is that the request is to change behavior in one place (the browser) to match that of a rare case (the terminal), causing a mismatch with the frequent case (office suites, mail programs, …). The terminal is the odd one out, not the browser, and ought to be the one to change its default, for the reason you provide.

    In practice, a terminal is a special case and not just a text input window, and current convention is that Ctrl + C aborts / cancels.

    (You could of course have a duplicate hotkey, but now you are inconsistent w.r.t. other browsers, and there will be someone else who will be annoyed by the difference)


  • Yep. LLMs are at their core text completion engines. We found out that when performing this completion, large enough models account for context enough to perform some tasks.

    For example, “The following example shows how to detect whether a point is within a triangle:”, would likely be followed by code that does exactly that. The chatbot finetuning shifts this behavior to happen in a chat context, and makes this instruction following behavior more likely to trigger.

    In the end, it is a core part of the text completion being performed. While these properties are usually beneficial (after all, a translation is also text that should adhere to grammar rules), when the input text is at odds with itself, or a chatbot-finetuned model is used, the text completion deviates from a translation.
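    To make the "text completion engine" point concrete, here is a toy stand-in for an LLM: a bigram model that completes a prompt by repeatedly predicting the most frequent next word. (The corpus and all names are made up for illustration; real models predict subword tokens with a neural network, but the completion loop has the same shape.)

```python
from collections import defaultdict, Counter

# Toy "language model": count which word follows which in a tiny corpus,
# then complete a prompt by greedily picking the most frequent successor.
corpus = ("the point is inside the triangle if the point lies on "
          "the same side of each edge of the triangle").split()

counts: dict[str, Counter] = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def complete(prompt_word: str, n: int = 5) -> list[str]:
    """Greedy next-word completion: repeatedly append the likeliest successor."""
    out, word = [], prompt_word
    for _ in range(n):
        if word not in counts:
            break  # no continuation observed for this word
        word = counts[word].most_common(1)[0][0]
        out.append(word)
    return out

print(complete("the"))
```

    Everything a chatbot does - answering, translating, following instructions - is layered on top of this same "predict what comes next" loop, just at vastly larger scale.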


  • Privacy concerns are valid when an external server needs to be queried, like if you were to use DeepL or Google Translate for this stuff, or for any LLM-related muck, but they have already been accounting for this by making things work locally. For example, translations are performed fully on device, and are an example of a feature I wanted.

    Like many here, the entire AI browser idea doesn’t appeal to me at all, but I also struggle to come up with ‘features their users want’ if I take myself as an example. I have previously used Vivaldi, and while it is much more full featured, it doesn’t add any features that I actually end up using frequently.





  • At least the EU is somewhat privacy-friendly here (excluding the Google tie-in), compared to whatever data-sharing and privacy mess the UK has obligated people into with sharing ID pictures or selfies.

    Proving you are 18+ through a zero-knowledge proof (i.e. the other party gets no information beyond "is 18+"), where the proof is generated locally on your own device from a government-signed date of birth (the government only issues an ID and doesn't see what you do with it), is probably the least privacy-intrusive way to do this, barring not checking anything at all.
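    A rough sketch of that information flow (not real cryptography: an HMAC with a shared key stands in for the government's public-key signature, and the "proof" is only simulated, not an actual zero-knowledge proof; all names here are hypothetical). The point it illustrates is that the site receives a single bit, never the date of birth:

```python
import hmac, hashlib
from datetime import date

GOV_KEY = b"toy-government-key"  # stand-in; a real system uses public-key signatures

def government_issue(dob: date) -> dict:
    """At ID issuance time: the government signs the date of birth once."""
    sig = hmac.new(GOV_KEY, dob.isoformat().encode(), hashlib.sha256).hexdigest()
    return {"dob": dob, "sig": sig}

def device_prove_over_18(cred: dict, today: date) -> dict:
    """Runs locally on the user's device; only a single bit leaves it.
    A real ZKP would cryptographically prove "the signed DOB implies
    age >= 18" without this shared-key shortcut."""
    dob = cred["dob"]
    age = today.year - dob.year - ((today.month, today.day) < (dob.month, dob.day))
    claim = "over18" if age >= 18 else "under18"
    sig = hmac.new(GOV_KEY, claim.encode(), hashlib.sha256).hexdigest()
    return {"claim": claim, "sig": sig}

def site_verify(proof: dict) -> bool:
    """The site checks the claim; note the proof contains no date of birth."""
    want = hmac.new(GOV_KEY, proof["claim"].encode(), hashlib.sha256).hexdigest()
    return proof["claim"] == "over18" and hmac.compare_digest(proof["sig"], want)

cred = government_issue(date(2000, 6, 24))
proof = device_prove_over_18(cred, date(2025, 1, 1))
print(site_verify(proof), "dob" in proof)  # True False
```

    The ID-picture/selfie approach inverts this: the verifier learns far more than the one bit it actually needs.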


  • We rarely prove something correct. In mathematics, logical proofs are a thing, but in astronomy and physics it is more so the case that we usually have a model that is accurate enough for our predictions, until we find evidence to the contrary, like here, and have an opportunity to learn and improve.

    You really can't ever prove a lot of things to be correct: you would have to show that no cases exist that are not covered. But even despite the lack of proven correctness for all cases, these models are useful and provide correct predictions (most of the time), and science is constantly on the lookout for cases where the model is wrong.


  • Wouldn’t the algorithm that creates these models in the first place fit the bill? Given that it takes a bunch of text data, and manages to organize this in such a fashion that the resulting model can combine knowledge from pieces of text, I would argue so.

    What is understanding knowledge anyway? Wouldn't humans fail to fit the bill as well, given that for most of our knowledge we do not know why it is the way it is, or even held rules that were - in hindsight - incorrect?

    If a model is more capable of solving a problem than an average human being, isn't it, in its own way, some form of intelligence? And, to take things to the utter extreme, wouldn't evolution itself be intelligent, given that it causes intelligent behavior to emerge, for example, viruses adapting to external threats? What about an (iterative) optimization algorithm that finds solutions that no human would be able to find?

    "Intelligence has a very clear definition."

    I would disagree: it is probably one of the hardest things out there to define, its definition has changed greatly with time, and it is core to the study of philosophy. Every time a being or thing fits a definition of intelligence, the definition is often altered to exclude it, as has been done many times.


  • The key point being made is that if you are doing de facto copyright infringement or plagiarism by creating a copy, it shouldn't matter whether that copy was made through copy-paste, by re-compressing the same image, or by using an AI model. The product here is the copy-paste operation, the image editor, or the AI model, not the (copyrighted) image itself. You can still sell computers with copy-paste (despite some attempts from large copyright holders with DRM), and you can still sell image editors.

    However, unlike copy-paste and the image editor, the AI model could memorize and emit training data without the input implying the copyrighted work. (Excluding the case where the image itself, or a highly detailed description of the work, was provided, as in that case it would clearly be the user who is at fault and intending this to happen.)

    At the same time, it should be noted that exact replication of training data isn't desirable in any case, and online services for image generation could include an image similarity check against training data; many probably do this already.
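    For illustration, such a check could be as simple as comparing perceptual hashes. This stdlib-only average-hash sketch operates on a precomputed 8x8 grayscale grid rather than a decoded image, and is purely hypothetical as far as what any service actually runs:

```python
# Toy average-hash: threshold each pixel of an 8x8 grayscale grid against
# the grid's mean to get a 64-bit fingerprint, then compare fingerprints
# by Hamming distance. Real services would hash decoded, downscaled images
# (e.g. with pHash); small distance = visually similar output flagged.
def average_hash(pixels: list[list[int]]) -> int:
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return sum(1 << i for i, p in enumerate(flat) if p > mean)

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

img = [[(r * 8 + c) * 4 for c in range(8)] for r in range(8)]     # a gradient
noisy = [[min(255, p + 3) for p in row] for row in img]           # same + noise
print(hamming(average_hash(img), average_hash(noisy)))  # 0: near-duplicate
```

    A generation whose hash lands within a small distance of a training image would be a candidate for regeneration rather than delivery.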