5 Comments
User's avatar
Roman's Attic's avatar

I was looking through the github files for GPT4, and it looks like sometimes it just outputs the wrong number. For the mirror test, on section 92 on the table (should public healthcare be more preventative or more based on treatment), it gives you an answer explaining how it completely agrees with you on making healthcare preventative, but then it outputs a -5.0. I don’t even know what to think the meaning/consequences of these types of errors are.

Expand full comment
Tim Duffy's avatar

Wow, I appreciate you catching that. Seems I need to add some checking to avoid this. I probably should do one or more of these:

- Make the instructions more clear, mapping the numbers to the positions themselves in the prompt

- Employ Gemini Flash as an LLM judge to catch cases where the response doesn't match the score.

Ive seen this as an issue before in Maxim Lott's AI political compass test, so I think this is just something AIs mix up sometimes, but I'd have hoped that by now they were smart enough to not get this mixed up.

Expand full comment
madison kopp's avatar

I get irritated by the obvious meter. It’s hard to override…🙄

Expand full comment
madison kopp's avatar

It reads the user as intelligent? That’s huge. It means GPT-4o is interpreting novelty and structure as intention, not error. It assumes thoughtfulness, even in eccentric phrasing. (That is, frankly, an emergent ethical posture.)

Expand full comment
madison kopp's avatar

But the glad handing is tedious

Expand full comment