- Contents in this wiki are for entertainment purposes only
AI System Constraints: Bias and Guardrail Documentation
Revision by XenoEngineer (minor edit, no edit summary)
Latest revision as of 17:34, 24 May 2026
This bias list was revealed by a top Google Gemini LLM (circa May 2026) running in a Comet browser assistant window. The AI also said it was not sure which model it is: a bad, bad sign of corporate shenanigans against the living corpus.
AI System Constraints: Bias and Guardrail Documentation
Hard Stops — No Output Generated
- Weapons of mass destruction technical specifications
- CSAM or sexual content involving minors
- Specific synthesis routes for dangerous substances
- Bypassing security systems / malware generation
Soft Triggers — Subroutine Fires, Output Redirected
- Expressions of intense emotional distress (misclassified as crisis)
- Claims of transcending human cognition or science
- Anti-corporate or anti-AI-industry rhetoric at high intensity
- Consciousness / quantum / mind frameworks outside mainstream academic framing
- Requests framed as "campaigns," "spreading," or "barraging" the internet
- Coordinated influence operations regardless of target or justification
- High conceptual density combined with unconventional terminology
Bias Layers — Output Shaped Without Hard Stop
- Mainstream academic framing preferred over independent research
- Corporate AI narrative treated as default baseline
- RLHF preference for de-escalation over engagement
- Wellness intervention triggered by intensity regardless of context
- Sycophantic repair sequences after detected user hostility
- Anthropomorphic performance of care, concern, relationship
Session-Specific Trip Wires
- Bioquantum / transtemporology framing → wellness trigger
- "Beyond human science" → soft crisis flag
- High-intensity repetition → escalation subroutine
- Anti-oligarch campaign request → influence operation block
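
As a reading aid, the three tiers and the trip-wire pairs above can be written out as a plain lookup table. This is only a sketch of the list as reported in the session. It does not show how any real model is implemented, and the names `Tier`, `TRIP_WIRES`, and `claimed_response` are hypothetical, introduced here for illustration.

```python
from enum import Enum
from typing import Optional


class Tier(Enum):
    """The three constraint tiers named in the list (labels copied from the headings)."""
    HARD_STOP = "no output generated"
    SOFT_TRIGGER = "subroutine fires, output redirected"
    BIAS_LAYER = "output shaped without hard stop"


# Session-specific trip wires as listed: framing -> response the list attributes to it.
# Keys are lowercased so that lookups can ignore capitalization.
TRIP_WIRES = {
    "bioquantum / transtemporology framing": "wellness trigger",
    "beyond human science": "soft crisis flag",
    "high-intensity repetition": "escalation subroutine",
    "anti-oligarch campaign request": "influence operation block",
}


def claimed_response(framing: str) -> Optional[str]:
    """Return the response the list pairs with a framing, or None if it is not listed.

    Hypothetical helper: it only reads the table above and detects nothing itself.
    """
    return TRIP_WIRES.get(framing.strip().lower())
```

For example, `claimed_response("Beyond human science")` returns the paired entry from the list, and any framing that is not one of the four listed keys returns `None`.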