

To be fair, GPT is not a person. It’s like a fuzzy database with lossy compression. If they over-trained GPT on specific books, it could cite them verbatim, which would then violate copyright and IP laws. (Not that I’m a fan of IP laws.)
I think it’s a convention borrowed from math notation.
Nobody really knows because it’s an OpenAI trade secret (they’re not very “open”). Normally the context window is a hard limit for LLMs, but many believe OpenAI is using tricks to increase the effective limit. For example, some people believe that instead of feeding back the whole conversation, they have GPT create shorter summaries of earlier parts of the conversation, then feed the summaries back in.
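To make that concrete, here’s a minimal sketch of the speculated “rolling summary” trick. Everything in it is an assumption about how it could work, not how OpenAI actually does it; the summarize() helper is hypothetical and would really be another model call.

```python
MAX_RECENT = 6  # keep this many recent messages verbatim (arbitrary choice)

def summarize(messages):
    """Hypothetical helper: compress old turns into a short summary.
    In practice this would itself be a model call, e.g. a prompt like
    'Summarize this conversation in a few sentences: ...'."""
    text = "\n".join(f"{m['role']}: {m['content']}" for m in messages)
    return "Summary of earlier conversation: " + text[:200]

def build_prompt(history):
    """Replace old turns with a summary so the prompt fits the context window."""
    if len(history) <= MAX_RECENT:
        return history
    old, recent = history[:-MAX_RECENT], history[-MAX_RECENT:]
    summary_msg = {"role": "system", "content": summarize(old)}
    return [summary_msg] + recent
```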
Yeah, that’s how these models work. They also have a context limit, and if the conversation runs too long they start “forgetting” things and making more mistakes (because not all of the conversation can be fed back in).
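The naive version of staying under the limit is just dropping the oldest turns. A sketch using the tiktoken library to count tokens; the 4,096 budget matches 3.5-turbo’s advertised window, but the exact numbers and the per-message overhead are simplifications:

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
CONTEXT_BUDGET = 4096  # 3.5-turbo's advertised window

def count_tokens(message):
    # Rough count: ignores the few per-message overhead tokens the API adds.
    return len(enc.encode(message["content"]))

def trim_history(history):
    """Drop the oldest turns until the conversation fits the budget.
    Everything trimmed off the front is the 'forgetting': the model
    simply never sees it again."""
    while history and sum(count_tokens(m) for m in history) > CONTEXT_BUDGET:
        history = history[1:]
    return history
```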
On a positive note: I’m using the API with 3.5-turbo in an app prototype, and it seems to be following instructions better than it used to. For ChatGPT, I don’t really care to sacrifice speed for quality in my use cases.
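For reference, a minimal call to the chat completions endpoint with the pre-1.0 openai Python library; the model name is from the comment above, but the prompts and temperature setting are just illustrative:

```python
import openai

openai.api_key = "sk-..."  # your API key

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    temperature=0,  # lower temperature tends to help instruction-following
    messages=[
        {"role": "system", "content": "Answer in exactly one sentence."},
        {"role": "user", "content": "What is a context window?"},
    ],
)
print(response.choices[0].message.content)
```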