@hedgehog

hedgehog@ttrpg.network · 2 days ago

“Glue is not pizza sauce” seems like a common fact to me, but Google’s LLM disagrees, for example.

That wasn’t something an LLM came up with, though. That was done by a system that uses an LLM. My guess is the system retrieves a small set of results and then just uses the LLM to phrase a response to the user’s query by referencing the links in question.

It’d be like saying to someone “rephrase the relevant parts of this document to answer the user’s question” but the only relevant part is a joke. There’s not much else you can do there.
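To make that guess concrete, here’s a toy Python sketch of that kind of pipeline. Everything in it is made up for illustration: the documents, the keyword-overlap `retrieve` function, and `build_prompt`. I don’t know how Google’s system actually works, and no real LLM is called here.

```python
import re
from typing import List, Set


def tokens(text: str) -> Set[str]:
    """Lowercase word set, ignoring punctuation."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query: str, documents: List[str], k: int = 1) -> List[str]:
    """Hypothetical retriever: rank documents by keyword overlap with the query."""
    q = tokens(query)
    ranked = sorted(documents, key=lambda d: len(q & tokens(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, sources: List[str]) -> str:
    """Build the text that would be handed to the LLM for rephrasing."""
    listing = "\n".join(f"- {s}" for s in sources)
    return (
        "Answer the question using only these sources:\n"
        f"{listing}\n"
        f"Question: {query}"
    )


# Made-up corpus: the only relevant document is a joke.
documents = [
    "Joke post: to keep cheese from sliding off pizza, mix glue into the sauce.",
    "Bread dough needs time to rise before baking.",
    "Lynx is a text-based web browser.",
]

query = "how do I keep cheese from sliding off pizza"
prompt = build_prompt(query, retrieve(query, documents))
print(prompt)
```

The LLM only ever sees the prompt, so if the retriever hands it a joke as the sole source, rephrasing the joke is the “correct” behavior from its point of view.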

hedgehog@ttrpg.network · 5 days ago

Third, a redirect is obvious

A redirect isn’t necessary if you control the DNS servers. If you control the DNS servers, you can MITM the website for any visitor because you can prove that you own the domain to a certificate authority and generate a new, trusted HTTPS cert. (Depending on specifics this may or may not foil the anti-phishing capabilities of Passkeys / U2F.)

hedgehog@ttrpg.network · 6 days ago

They aren’t. From a comment on https://www.reddit.com/r/ublock/comments/32mos6/ublock_vs_ublock_origin/ by u/tehdang:

For people who have stumbled into this thread while googling “ublock vs origin”. Take a look at this link:

http://tuxdiary.com/2015/06/14/ublock-origin/

"Chris AlJoudi [current owner of uBlock] is under fire on Reddit due to several actions in recent past:

In a Wikipedia edit for uBlock, Chris removed all credits to Raymond [Hill, original author and owner of uBlock Origin] and added his name without any mention of the original author’s contribution.

Chris pledged a donation with overblown details on expenses like $25 per week for web hosting.

The activities of Chris since he took over the project are more business and advertisement oriented than development driven."

So I would recommend that you go with uBlock Origin and not uBlock. I hope this helps!

Edit: Also got this bit of information from here:

https://www.reddit.com/r/chrome/comments/32ory7/ublock_is_back_under_a_new_name/

TL;DR:

gorhill [Raymond Hill] got tired of dozens of “my facebook isnt working plz help” issues.

he handed the repository to chrismatic [Chris Aljoudi] while maintaining control of the extension in the Chrome webstore (by forking chrismatic’s version back to himself).

chrismatic promptly added donate buttons and a “made with love by Chris” note.

gorhill took exception to this and asked chrismatic to change the name so people didn’t confuse uBlock (the original, now called uBlock Origin) and uBlock (chrismatic’s version).

Google took down gorhill’s extension. Apparently this was because of the naming issue (since technically chrismatic has control of the repo).

gorhill renamed and rebranded his version of ublock to uBlock Origin.

hedgehog@ttrpg.network · 7 days ago

Is it possible to force a corruption if a disk clone is attempted?

Anything that corrupts a single file would work. You could certainly change your own disk cloning binaries to include such functionality, but if someone were accessing your data directly via their own OS, that wouldn’t be effective. I don’t know of a way to circumvent that last part other than ensuring that the data isn’t left on disk when you’re done. For example, you could use a ramdisk instead of non-volatile storage. You could delete or intentionally corrupt the volume when you unmount it. You could split the file, storing half on your USB flash drive and keeping the other half on your PC. You could XOR the file with contents of another file (e.g., one on your USB flash drive instead of on your PC) and then XOR it again when you need to access it.
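Here’s a minimal sketch of the XOR idea in Python. The function names are mine, and it makes some assumptions: the whole file fits in memory, and the key file is at least as long as the data. If the key is shorter, this sketch repeats it, which is much weaker.

```python
from itertools import cycle
from pathlib import Path


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data against key, repeating the key if it is shorter than data."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


def xor_file(src: Path, key_file: Path, dst: Path) -> None:
    """Write (src XOR key_file) to dst. Running it again on dst restores src."""
    dst.write_bytes(xor_bytes(src.read_bytes(), key_file.read_bytes()))


# In-memory round trip: masking twice with the same key gives back the input.
data = b"secret volume contents"
key = b"\x13\x37\xc0\xff\xee"
masked = xor_bytes(data, key)
restored = xor_bytes(masked, key)
print(restored == data)
```

With the key file kept on the USB flash drive, the masked file on the PC is unreadable on its own, and applying `xor_file` a second time with the same key recovers the original.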

What sort of attack are you trying to protect from here?

If the goal is plausible deniability, then it’s worth noting that VeraCrypt volumes aren’t identifiable as distinct from random data. So if you have a valid reason for having a big block of random data on disk, you could say that’s what the file was. Random files are useful because they are not compressible. For example, you could be using those files to test network or storage media performance, or to test compression, hashing, backup and restore, or encryption and decryption functions. You could be using them to have a repeatable set of random values to use in a program (like using a seed, but without necessarily being limited to using a PRNG to generate the sequence).
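As a quick illustration of the “not compressible” point, here’s a small Python sketch comparing how well zlib compresses random bytes versus a block of zeros. The 1 MiB size and the helper name are arbitrary choices of mine.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (about 1.0 means no gain)."""
    return len(zlib.compress(data, 9)) / len(data)


size = 1 << 20  # 1 MiB, arbitrary
random_ratio = compression_ratio(os.urandom(size))
zeros_ratio = compression_ratio(bytes(size))

print(f"random: {random_ratio:.3f}, zeros: {zeros_ratio:.3f}")
```

The random block should come out at roughly its original size (or slightly larger), while the zeros shrink to almost nothing, which is why a file of random data is a plausible thing to keep around for this kind of testing.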

If that’s not sufficient, you should look into hidden volumes. The idea is that you take a regular encrypted volume, whose free space, on disk, looks just like random data, and store your hidden volume within that free space. The hidden volume gets its own password. Then, you can mount the volume using the first password and get visibility into a “decoy” set of files or use the second password to view your “hidden” files. Note that when mounting it to view the decoy files, any write operations will have a chance of corrupting the hidden files. However, you can supply both passwords to mount it in a protected mode, allowing you to change the decoy files and avoid corrupting the hidden ones.

hedgehog@ttrpg.network · 8 days ago

It sounds like you want these files to be encrypted.

Someone already suggested encrypting them with GPG, but maybe you want the files themselves to also be isolated, even while their data is encrypted. In that case, consider an encrypted volume. I assume you’re familiar with LUKS - you can encrypt a partition with a different password and disable auto-mount pretty easily. But if you’d rather use a file-based volume, then check out VeraCrypt - it’s a FOSS-ish [1], cross-platform tool that provides this capability. The official documentation is very Windows-focused - the ArchLinux wiki article is a pretty useful Linux-focused alternative.

Normal operation is that you use a file to store the volume, which can be “dynamic” with a max size or can be statically sized (you can also directly encrypt a disk partition, but you could do that with LUKS, too). Then, before you can access the files - read or write - you have to enter the password, supply the encryption key, etc., in order to unlock it.

Someone without the password but with permission to modify the file will be capable of corrupting it (which would prevent you from accessing every protected file), but unless they somehow got access to the password they wouldn’t be able to view or modify the protected files.

The big advantage over LUKS is ease of creating/mounting file-based volumes and portability. If you’re concerned about another user deleting your encrypted volume, then you can easily back it up without decrypting it. You can easily load and access it on other systems, too - there are official, stable apps on Windows and Mac, though you’ll need admin access to run them. On Android and iOS options are a bit more slim - EDS on Android and Disk Decipher on iOS. If you’re copying a volume to a Linux system without VeraCrypt installed, you’ll likely still be able to mount it, as dm-crypt has support for VeraCrypt volumes.

1 - It’s based on TrueCrypt, which has some less free restrictions, e.g., “c. Phrase "Based on TrueCrypt, freely available at http://www.truecrypt.org/" must be displayed by Your Product (if technically feasible) and contained in its documentation.”

hedgehog@ttrpg.network · 18 days ago

theoretically they can

Is this a purely theoretical capability or is there actually evidence they have this capability?

it’s already been proven that they can tap into anyone’s phone

Listening into a conversation that you’re intentionally relaying across public infrastructure and gaining access to the phone itself are two very different things.

The use of proprietary software in literally everything

Speak for yourself. And let’s be real, if you’re on Lemmy you’re 10 times more likely to be running Linux.
Proprietary != closed source
Do you really think that something being closed source means that it can’t be analyzed?

the amount of exploits the NSA has on hand

How many zero-day exploits does the NSA have? How many can be deployed remotely and without a nontrivial action by a user?

what’s stopping the NSA from spying this much?

Scale, capacity, cost, number of employees

---

I’m not saying we shouldn’t oppose government surveillance. We absolutely should. But like another commenter pointed out, I’m much more concerned with the amount of data that corporations collect and have.

hedgehog@ttrpg.network · 18 days ago

reasonable expectations and uses for LLMs.

LLMs are only ever going to be a single component of an AI system. We’ve only had LLMs with their current capabilities for a very short time period, so the research and experimentation to find optimal system patterns, given the capabilities of LLMs, has necessarily been limited.

I personally believe it’s possible, but we need to get vendors and managers to stop trying to sprinkle “AI” in everything like some goddamn Good Idea Fairy.

That’s a separate problem. Unless it results in decreased research into improving the systems that leverage LLMs, e.g., by resulting in pervasive negative AI sentiment, it won’t have a negative effect on the progress of the research. Rather the opposite, in fact, as seeing which uses of AI are successful and which are not (success here being measured by customer acceptance and interest, not by the AI’s efficacy) is information that can help direct and inspire research avenues.

LLMs are good for providing answers to well defined problems which can be answered with existing documentation.

Clarification: LLMs are not reliable at this task, but we have patterns for systems that leverage LLMs that are much better at it, thanks to techniques like RAG, supervisor LLMs, etc…

When the problem is poorly defined and/or the answer isn’t as well documented or has a lot of nuance, they then do a spectacular job of generating bullshit.

TBH, so would a random person in such a situation (if they produced anything at all).

As an example: how often have you heard about a company’s marketing department over-hyping their upcoming product, resulting in unmet consumer expectations, a ton of extra work for the product’s developers and engineers, or both? This is because those marketers don’t really understand the product - because they don’t have the information, because they didn’t read it, because they got conflicting information, or because the information they have is written for a different audience - i.e., a developer, not a marketer - and the nuance is lost in translation.

At the company level, you can structure a system that marketers work within that will result in them providing more correct information. That starts with them being given all of the correct information in the first place. However, even then, the marketer won’t be solving problems like a developer. But if you ask them to write some copy to describe the product, or write up a commercial script where the product is used, or something along those lines, they can do that.

The marketer role here is still more complex than our existing AI systems, but those systems are already incorporating patterns very similar to those that a marketer uses day-to-day. And AI researchers - academic, corporate, and hobbyists - are looking into more ways that this can be done.

If we want an AI system to be able to solve problems more reliably, we have to, at minimum:

break down the problems into more consumable parts
ensure that components are asked to solve problems they’re well-suited for, which means that we won’t be using an LLM - or even necessarily an AI solution at all - for every problem type that the system solves
have a feedback loop / review process built into the system

In terms of what they can accept as input, LLMs have a huge amount of flexibility - much higher than what they appear to be good at and much, much higher than what they’re actually good at. They’re a compelling hammer. System designers need to not just be aware of which problems are nails and which are screws or unpainted wood or something else entirely, but also ensure that the systems can perform that identification on their own.
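To illustrate the three requirements listed above (decomposition, routing each part to a suitable component, and a review loop), here’s a deliberately tiny Python sketch. All of the names are hypothetical, `fake_llm` is a stand-in rather than a real model call, and splitting on semicolons is obviously not how a real system would decompose problems.

```python
import re
from typing import Callable, List

ARITHMETIC = re.compile(r"^\s*(\d+)\s*([+\-*])\s*(\d+)\s*$")


def calculator(task: str) -> str:
    """Deterministic component: plain arithmetic doesn't need an LLM."""
    a, op, b = ARITHMETIC.match(task).groups()
    x, y = int(a), int(b)
    return str({"+": x + y, "-": x - y, "*": x * y}[op])


def fake_llm(task: str) -> str:
    """Stand-in for a real LLM call (hypothetical; swap in your own client)."""
    return f"[draft answer for: {task}]"


def decompose(problem: str) -> List[str]:
    """Break the problem into smaller parts (here: naive split on ';')."""
    return [part.strip() for part in problem.split(";") if part.strip()]


def route(task: str) -> Callable[[str], str]:
    """Identify the problem type and pick the component suited to it."""
    return calculator if ARITHMETIC.match(task) else fake_llm


def review(task: str, answer: str) -> bool:
    """Placeholder review step; a real one might use tests or a supervisor."""
    return bool(answer.strip())


def solve(problem: str, max_attempts: int = 3) -> List[str]:
    results = []
    for task in decompose(problem):
        component = route(task)
        for _ in range(max_attempts):
            answer = component(task)
            if review(task, answer):
                break
        else:
            # Every attempt failed review.
            answer = "unresolved"
        results.append(answer)
    return results


print(solve("2 + 3; describe the product"))
```

The point is the shape, not the parts: the `route` step is where the system itself decides whether something is a nail, and the `review` loop is where a bad answer gets caught before it reaches the user.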

hedgehog@ttrpg.network · 24 days ago

Must’ve typoed “awesome”

hedgehog@ttrpg.network · 29 days ago

Have you looked into configuring them directly from your NVR? Or third party options? I did a quick search and saw a list of several that as far as I can tell can display Reolink streams (though I haven’t confirmed any can configure the cameras):

And some proprietary options that have native Linux builds:

hedgehog@ttrpg.network · 30 days ago

Apparently it’s still being actively developed! I’m impressed.

April 15, 2024 Lynx v2.9.1 release

hedgehog@ttrpg.network · 1 month ago

I recommend Tidal over Spotify, personally

hedgehog@ttrpg.network · 2 months ago

If you use that docker compose file, I recommend you comment out the build section and uncomment the image section in the lemmy service.

I also recommend you use a reverse proxy and Docker networks rather than exposing the postgres instance on port 5433, but if you aren’t familiar with Docker networks you can leave it as is for now. If you’re running locally and don’t open that port in your router’s firewall, it’s a non-issue unless there’s an attacker on your LAN, but given that you’re not gaining anything from exposing it (unless you need to connect to the DB directly regularly - as a one off you could temporarily add the port mapping), it doesn’t make sense to increase your attack surface for no benefit.

hedgehog@ttrpg.network · 2 months ago

I’m not the person you responded to, but I can say that it’s a perfectly fine take. My personal experience and the commonly voiced opinions about both browsers support this take.

Unless you’re using 5 tabs max at a time, my personal experience is that Firefox is more than an order of magnitude more memory efficient than Chrome when dealing with long-lived sessions with the same number of tabs (dozens up to thousands).

I keep hundreds of tabs open in Firefox on my personal machine (with 16 GB of RAM) and it’s almost never consuming the most memory on my system.

Policy prohibits me running Firefox on my work computer, so I have to use Chrome. Even with much more memory (both on 32 GB and 64 GB machines) and far fewer tabs (20-30 at most vs 200-300), Chrome often ends up taking up far too much memory + having a substantial performance drop, and I have to go through and prune the tabs I don’t need right now, bookmark things that can be done later, etc…

Also, see https://www.techspot.com/news/102871-zero-regrets-firefox-power-user-kept-7500-tabs.html - I’ve never seen anything similar for Chrome and wasn’t able to find anything.

hedgehog@ttrpg.network · 2 months ago

Definitely not, I do the same.

I installed 64 GB of RAM in my Windows laptop 4 years ago and had been using 64 GB of RAM in the laptop that it replaced - which was from 2013 (I think I bought it in 2014-2015). I was using 32 GB of RAM prior (on Linux and Windows laptops), all the way back to 2007 or so.

My work MacBook Pros generally have 32-64 GB of RAM, but my personal MacBook Air (the 15” M2) has 16 GB, simply because the upgrade wasn’t a cost effective one (and the M1 before it had performed great with 16) and because I’d only planned on using it for casual development. But since I’ve been using it as my main personal development machine and for self-hosted AI, and have run into its limits, when I replace it I’ll likely opt for 64 GB or more.

My Windows gaming desktop only has 32 GB of RAM, though - that’s because getting the timings higher with more RAM - particularly 4 sticks - was prohibitively expensive when I built it, and then when the cost wasn’t a concern and I tried to upgrade, I learned that my third and fourth RAM slots weren’t functional. I could upgrade to 64 GB in two slots but it wouldn’t really be worth it, since I only use it for gaming.

My Linux desktop / server has 128 GB of ECC RAM, though, because that’s as much as the motherboard supported.

hedgehog@ttrpg.network · 2 months ago

It first showed up on Netflix in mid-2023, in the middle of the Writers Guild strike (meaning there was a dearth of new content). So basically the Netflix effect. It had been on other streaming platforms before - Prime Video and Hulu - but Netflix is still a juggernaut compared to them - it has 5 times as many subscribers as Hulu, for example, and many of the subscribers to Prime Video are incidental and don’t stream as much on average as Netflix users.

I assume Netflix funded off-platform advertising, but the on-platform advertising has a big effect, too. And given that Suits broke a record in the first week it was on Netflix and they have a spinoff coming, it makes sense that they would keep advertising.

hedgehog@ttrpg.network · 2 months ago

The idea that someone does this willingly implies that the user knows the implications of their choice, which most of the Fediverse doesn’t seem to do

The terms of service for lemmy.world, which you must agree to upon sign-up, make reference to federating. If you don’t know what that means, it’s your responsibility to look it up and understand it. I assume other instances have similar sign-up processes. The source code to Lemmy is also available, meaning that a full understanding is available to anyone willing to take the time to read through the code, unlike with most social media companies.

What sorts of implications of the choice to post to Lemmy do you think that people don’t understand, that people who post to Facebook do understand?

If the implied license was enough, Facebook and all the other companies wouldn’t put these disclaimers in their terms of service.

It’s not an implied license. It’s implied permission. And if you post content to a website that’s hosting and displaying such content, it’s obvious what’s about to happen with it. Please try telling a judge that you didn’t understand what you were doing and that you sued without first trying to delete the content or file a DMCA notice, and see if that judge sides with you.

Many companies have lengthy terms of service with a ton of CYA legalese that does nothing. Even so, an explicit license to your content in the terms of service does do something - but that doesn’t mean that you’re infringing copyright without it. If my artist friend asks me to take her art piece to a copy shop and to get a hundred prints made for her, I’m not infringing copyright then, either, nor is the copy shop. If I did that without permission, on the other hand, I would be. If her lawyer got wind of this and filed a suit against me without checking with her and I showed the judge the text saying “Hey hedgehog, could you do me a favor and…,” what do you think he’d say?

Besides, Facebook does things that Lemmy instances don’t do. Facebook’s codebase isn’t open, and they’d like to reserve the ability to do different things with the content you submit. Facebook wants to be able to do non-obvious things with your content. Facebook is incorporated in California and has a value in the hundreds of billions, but Lemmy instances are located all over the world and I doubt any have a value even in the millions.

hedgehog@ttrpg.network · 2 months ago

The funny thing about Lemmy is that the entire Fediverse is basically running a massive copyright violation ring with current copyright law.

Is it, though?

When someone posts a comment to Lemmy, they do so willingly, with the intent for it to be posted and federated. If they change their mind, they can delete it. If they delete it and it remains up somewhere, they can submit a DMCA request; likewise if someone else posts their copyrighted content.

Copyright infringement is the use of works protected by copyright without permission for their use. When you submit a post or a comment, your permission to display it and for it to be federated is implied, because that is how Lemmy works. A license also conveys permission, but that’s not the only way permission can be conveyed.

hedgehog@ttrpg.network · 2 months ago

I can’t speak to Android as a whole, but here’s how often Samsung Face Unlock will require you to re-auth with your phone’s passcode:

after 4 hours of not using the phone
after restarting
at least once every 24 hours

iPhones do something similar, but it’s after 48 hours of non-use (instead of 4) and at least weekly instead of daily. Having to enter your password daily should help most people keep it memorized pretty well, but weekly - maybe not. So you definitely have a good point there.

One thing that can make it easier to remember - and just as secure - is to use a longer pass phrase instead of random characters.

If you use the diceware approach (“correct horse battery staple”), then 5 words gives you about 5 more bits of entropy, i.e., roughly 32 times as many possible combinations, than a 10 character mixed-case alphanumeric password (64 vs 59 bits of entropy). 4 word passphrases aren’t random enough to be recommended: they have fewer bits of entropy (51) than even 9 character mixed-case alphanumeric passwords (53), though notably 10 same-case alphanumeric characters also have only 51 bits of entropy.

The EFF has a word list that’s been improved for usability. They also have a short list, comprised of words with at most 5 characters each, where you roll 4 dice instead of 5. With 6 words from that list you get 62 bits of entropy, which is good enough to be able to recommend.
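If you want to check the entropy figures above yourself, the arithmetic is just length × log2(pool size). Here’s a short Python sketch; the pool sizes assume 5 dice rolls per word for standard diceware (6^5 = 7776 words), 4 rolls for the EFF short list (6^4 = 1296 words), and that every word or character is picked uniformly at random.

```python
import math


def entropy_bits(pool_size: int, length: int) -> float:
    """Entropy of `length` independent, uniform picks from `pool_size` options."""
    return length * math.log2(pool_size)


DICEWARE = 6 ** 5             # 7776 words, 5 dice per word
EFF_SHORT = 6 ** 4            # 1296 words, 4 dice per word
MIXED_CASE_ALNUM = 26 * 2 + 10  # 62 characters
SAME_CASE_ALNUM = 26 + 10       # 36 characters

cases = [
    ("5 diceware words", DICEWARE, 5),
    ("4 diceware words", DICEWARE, 4),
    ("6 EFF short-list words", EFF_SHORT, 6),
    ("10 mixed-case alphanumeric chars", MIXED_CASE_ALNUM, 10),
    ("9 mixed-case alphanumeric chars", MIXED_CASE_ALNUM, 9),
    ("10 same-case alphanumeric chars", SAME_CASE_ALNUM, 10),
]

for label, pool, length in cases:
    print(f"{label}: {entropy_bits(pool, length):.1f} bits")
```

Each extra bit doubles the number of possible passphrases, which is where the “5 bits is about 32 times” relationship comes from.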

hedgehog@ttrpg.network · 2 months ago

Unless you’re using a random 10+ character alphanumeric passcode and are fine entering it every time you log into your phone, with a short auto-lock period, you’re much better off enabling biometrics (assuming it’s implemented competently) in combination with a longer passcode and understanding how to disable it when appropriate.

I recently replied with this comment to a Gizmodo article recommending the same thing you did for similar reasons, if you’d like to better understand my rationale: https://ttrpg.network/comment/6620188

hedgehog@ttrpg.network · 2 months ago

I haven’t personally used any of these, but looking them over, Tipi looks the most encouraging to me, followed by YunoHost, based largely on the variety of apps available but also because it looks like Tipi lets you customize the configuration much more. FreedomBox doesn’t seem to list the apps in their catalog at all and their site seems basically useless, so I ruled it out on that basis alone.