

oh, this one’s pretty easy, actually
a normal AI tells you it’s safe to eat one rock per day
an AI agent waits for you to open your mouth, and then throws a rock at your face. but it’s smart enough to only do that once a day.
Casey Newton reviewed OpenAI’s “agent” back in January
he called it “promising but frustrating”…but this is the type of shit he considers “promising”:
My most frustrating experience with Operator was my first one: trying to order groceries. “Help me buy groceries on Instacart,” I said, expecting it to ask me some basic questions. Where do I live? What store do I usually buy groceries from? What kinds of groceries do I want?
It didn’t ask me any of that. Instead, Operator opened Instacart in the browser tab and began searching for milk in grocery stores located in Des Moines, Iowa.
At that point, I told Operator to buy groceries from my local grocery store in San Francisco. Operator then tried to enter my local grocery store’s address as my delivery address.
After a surreal exchange in which I tried to explain how to use a computer to a computer, Operator asked for help. “It seems the location is still set to Des Moines, and I wasn’t able to access the store,” it told me. “Do you have any specific suggestions or preferences for setting the location to San Francisco to find the store?”
they’re gonna revolutionize the world, it’s gonna evolve into AGI Real Soon Now…but also if you live in San Francisco and tell it to buy you groceries it’ll order them from Iowa.
short answer: no, not really
long answer, here’s an analogy that might help:
you go to
https://yourbank.com/
and log in with your username and password. you click the button to go to Online Bill Pay, and tell it to send ACME Plumbing $150 because they just fixed a leak under your sink.
when you press “Send”, your browser does something like send a POST request to
https://yourbank.com/send-bill-payment
with a JSON blob like
{"account_id": 1234567890, "recipient": "ACME Plumbing", "amount": 150.0}
(this is heavily oversimplified, no actual online bank would work like this, but it’s close enough for the analogy)
and all that happens over TLS. which means it’s “secure”. but security is not an absolute, things can only be secure with a particular threat model in mind. in the case of TLS, it means that if you were doing this at a coffeeshop with an open wifi connection, no one else on the coffeeshop’s wifi would be able to eavesdrop and learn your password.
(if your threat model is instead “someone at the coffeeshop looking over your shoulder while you type in your password”, no amount of TLS will save you from that)
but with the type of vulnerability Jellyfin has, someone else can simply send their own POST request to
https://yourbank.com/send-bill-payment
with
{"account_id": 1234567890, "recipient": "Bob's Shady Plumbing", "amount": 10000.0}
and your bank will process that as you sending $10k to Bob’s Shady Plumbing.
that request is also over TLS, but that doesn’t matter, because that’s security for a different level of the stack. the vulnerability is a missing check: you are logged in as account 1234567890, so you should be allowed to send those bill payment requests. random people who aren’t logged in as you should not be able to send bill payments on behalf of account 1234567890.
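if it helps to see it as code, here's a rough python sketch of the difference between the broken endpoint and a fixed one. everything in it is made up for the analogy (the SESSIONS dict, the token strings, both function names). it is not how Jellyfin or any real bank is written, it only shows the shape of the missing check.

```python
# hypothetical session store: login token -> the account that token belongs to
SESSIONS = {
    "token-for-you": 1234567890,
    "token-for-bob": 5555555555,
}

def send_bill_payment_vulnerable(session_token, body):
    # the bug: trusts whatever account_id the request claims,
    # and never looks at who is actually sending it
    return {"status": 200, "paid_from": body["account_id"], "amount": body["amount"]}

def send_bill_payment_checked(session_token, body):
    # figure out who is logged in (None if the token is missing or unknown)
    logged_in_as = SESSIONS.get(session_token)
    if logged_in_as is None:
        return {"status": 401, "error": "not logged in"}
    # only the owner of the account may spend from it
    if logged_in_as != body["account_id"]:
        return {"status": 403, "error": "not your account"}
    return {"status": 200, "paid_from": body["account_id"], "amount": body["amount"]}

# the request from the analogy, sent by someone who is not you
attack = {"account_id": 1234567890, "recipient": "Bob's Shady Plumbing", "amount": 10000.0}

print(send_bill_payment_vulnerable(None, attack)["status"])
print(send_bill_payment_checked(None, attack)["status"])
print(send_bill_payment_checked("token-for-bob", attack)["status"])
```

note that TLS doesn't appear anywhere in there. both versions could sit behind the exact same encrypted connection, and only one of them asks who is sending the request.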