§ feed · cluster
SocialReasoning-Bench: Measuring whether AI agents act in users’ best interests
Microsoft Research introduced SocialReasoning-Bench, a benchmark that evaluates AI agents' social reasoning in calendar coordination and marketplace negotiation, testing both outcome optimality and due diligence.
§ sources · 1 publication · timeline below