Add learning exploit guardrails

2026-05-19 15:48:46 -07:00
parent 766ceac5d7
commit 66c6052e91
3 changed files with 68 additions and 1 deletions
@@ -509,3 +509,30 @@ Advanced branches:
 Depth rule: deeper questions should improve mastery and responsibility. They
 should not be busywork for actions the character has already demonstrated
 through repeated safe practice.
+
+## Exploit And Rote-Memorization Guardrails
+
+The learning system should reward understanding and practice, not repetitive
+input farming or memorizing answer order.
+
+Exploit farming risks:
+
+- Repeating the same safe action for unlimited experience.
+- Creating harmless failures just to farm mistake feedback.
+- Spamming prompts or questions for repeated rewards.
+- Using alternate accounts or group work to duplicate teaching credit.
+- Performing actions with no meaningful cost, context, or outcome.
+
+Guardrails:
+
+- Apply diminishing returns to identical repeated actions.
+- Require meaningful context for experience: time, resource cost, risk,
+  environmental variation, or consequence.
+- Track recent prompt/question exposure and suppress repeated rewards.
+- Give recovery credit only when the player takes a useful corrective action.
+- Separate "learned concept" from "perfect mastery" so that a single correct
+  answer does not settle every future situation.
+- Prefer concept families and varied wording so that memorizing a fixed answer
+  order does not pay off.
+
+Anti-exploit rule: rewards should come from meaningful decisions, varied
+practice, and good recovery, not from low-cost repetition.
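
One way the first three guardrails could be prototyped is sketched below. This is only an illustration under stated assumptions, not the project's implementation: `RewardGuard`, its method names, the 0.5 decay factor, and the 600-second cooldown are all hypothetical and do not come from the design doc.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class RewardGuard:
    """Hypothetical helper that scales experience for repeated actions and prompts."""

    # Assumed value: each identical repeat is worth half of the previous one.
    decay: float = 0.5
    # Assumed value: seconds before the same prompt can pay out again.
    prompt_cooldown: float = 600.0
    action_counts: dict = field(default_factory=lambda: defaultdict(int))
    last_prompt_reward: dict = field(default_factory=dict)

    def action_reward(self, action_key: str, base_xp: float, has_context: bool) -> float:
        """Return experience for an action, with diminishing returns on repeats."""
        # No meaningful cost, risk, variation, or consequence: no experience.
        if not has_context:
            return 0.0
        repeats = self.action_counts[action_key]
        self.action_counts[action_key] += 1
        return base_xp * (self.decay ** repeats)

    def prompt_reward(self, prompt_id: str, base_xp: float, now: float) -> float:
        """Return experience for a prompt, suppressed while it was seen recently."""
        last = self.last_prompt_reward.get(prompt_id)
        if last is not None and now - last < self.prompt_cooldown:
            return 0.0
        self.last_prompt_reward[prompt_id] = now
        return base_xp


guard = RewardGuard()
first = guard.action_reward("chop_wood", 10.0, has_context=True)
second = guard.action_reward("chop_wood", 10.0, has_context=True)
idle = guard.action_reward("wave", 10.0, has_context=False)
asked = guard.prompt_reward("q1", 4.0, now=0.0)
asked_again = guard.prompt_reward("q1", 4.0, now=60.0)
asked_later = guard.prompt_reward("q1", 4.0, now=700.0)
```

In this sketch the caller decides what counts as "identical": folding environmental variation into `action_key` would let varied practice keep its full value while exact repeats decay.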