Ironwood Dungeon

How This Was Built

Ironwood Dungeon is a browser-based role-playing game where an AI acts as your Game Master. The AI is a large language model from NVIDIA (nemotron-3-super-120b) accessed through their API. It reads a description of your situation, decides what to say, and calls server-side tools to check the actual game state before it speaks. It cannot make things up because every item, HP value, and room detail has to come from a real file on the server.

The backend is written in Python using Flask for the web server and Flask-SocketIO for real-time communication between the browser and server. Game state is stored in plain JSON files: game_state.json tracks your floor, gold, and health. character.json holds your character sheet. inventory.json holds every item you are carrying. The dungeon itself is generated fresh each run by dungeon_generator.py, which creates four floors of rooms with their own themes, enemies, traps, and puzzles.

The frontend is a single HTML page with vanilla JavaScript. No React, no Vue, no build step. All the panels, combat UI, minimap, character sheet, and history log are built directly in the DOM. Music and sound effects play through the HTML audio element with a state machine that switches tracks depending on whether you are in the village, exploring a dungeon floor, fighting a regular enemy, or facing a boss.

Combat follows D and D 5th Edition rules. Saving throws, conditions, spell slots, proficiency bonus, advantage and disadvantage - all resolved by the server before the AI narrates the result. The AI is only allowed to describe what already happened. It cannot invent outcomes.

Task 1 - Grounded AI Responses

The central engineering challenge was making the AI Game Master impossible to fool. It cannot describe an item the player does not have, cannot invent HP values, and cannot narrate a spell it has not verified exists in the player's spell slots. Here is exactly how each requirement from Task 1 is satisfied in the code.

Requirement - get_inventory() tool

The tool is registered in the AI's tool list so the model knows it exists and can call it. When called, it reads inventory.json from disk and returns the real item list. The AI cannot describe any item without calling this first because the system prompt forbids it (see Rule 1 below).

rpg_engine.py line 259 - tool definition name field rpg_engine.py line 415 - dispatch: if name == "get_inventory": return json.dumps(get_inventory()) rpg_engine.py line 6 - INVENTORY_FILE = BASE / "inventory.json"

Requirement - use_item(item_id, target) tool

When the AI decides to use an item, it calls this tool with the item ID and the target. The server removes or modifies the item in inventory.json and returns the result. The AI then narrates only what the tool result says happened. If the item does not exist, the tool returns an error and the AI must narrate the failure.

rpg_engine.py line 264 - tool definition name field rpg_engine.py line 416 - dispatch: if name == "use_item": return json.dumps(use_item(...))

Requirement - Structured output with narrative_text, current_hp, available_actions

Every response the AI sends ends with a call to the game_response() tool. This tool has a strict JSON schema the model must follow. The schema marks all three fields as required, so the AI cannot skip them. The server extracts these values and sends them to the frontend. The player's HP shown on screen comes from current_hp in this response, not from anything the AI wrote in prose.

rpg_engine.py line 346 - game_response tool name rpg_engine.py line 349 - narrative_text field definition rpg_engine.py line 350 - current_hp field definition rpg_engine.py line 352 - available_actions field definition rpg_engine.py line 357 - required: ["narrative_text", "current_hp", "available_actions"]

Requirement - LLM must verify items via tool call before describing them

This is enforced by the system prompt. The very first ironclad rule the AI sees when a session starts tells it to call get_inventory() before mentioning any item at all. If the AI skips this and invents an item, it has violated the grounding contract written into its instructions. The rule is written as a critical failure condition so the model treats it as a hard constraint, not a suggestion.

app.py line 65 - IRONCLAD RULES section header app.py line 66 - Rule 1: "Call get_inventory() before referencing, describing, or using ANY item"

Conversation Memory

The AI Game Master remembers everything that happened in your session. Every time you take an action, the game sends the full conversation history to the AI, not just your latest message. This means the AI knows which rooms you visited, which enemies you fought, what you said to the innkeeper, and how close to death you came three floors ago. It does not reset between turns.

After every turn, the complete history is written to a file on the server. When the next turn starts, that file is read back in and passed to the AI alongside the new message. The history grows throughout the session and the AI always has the full picture of what came before. In multiplayer, all players share one history file per room, so the AI remembers the actions of every player in the party, not just whoever is acting right now.

app.py line 155 - get_or_init_history() loads the saved history at the start of every turn, or starts fresh if it is the first turn app.py line 620 - history is loaded before every player action and passed to the AI rpg_engine.py line 262 - save_history() writes the updated conversation to disk after each turn rpg_engine.py line 266 - load_history() reads it back at the start of the next turn rpg_engine.py line 19 - _hist_file() returns the correct history file for the current room, so each multiplayer room keeps its own separate memory rpg_engine.py line 381 - run_turn() receives the full history list as a parameter and sends it to the AI model unchanged

One known limitation: there is no truncation or summarization. The history file grows with every turn and is always sent in full to the AI. Every language model has a maximum amount of text it can read at once, called the context window. If a session runs long enough, the history will eventually exceed that limit and the API will return an error. For a normal run through all four floors, which takes roughly 50 to 100 turns, this is very unlikely to happen. A very long session or a session with unusually verbose narration could hit it. The proper fix is a sliding window that drops the oldest turns, or a summarization step that compresses them into a short paragraph before they are discarded. Neither is implemented yet.

Persistent world state

The world state is never held only in memory. Every meaningful change is written to disk immediately. This means the server can restart and the game continues exactly where it left off. The frontend reads these files on resume and restores the full UI state including the map, character sheet, and conversation history.

rpg_engine.py line 6 - INVENTORY_FILE = BASE / "inventory.json" rpg_engine.py line 7 - GAME_STATE_FILE = BASE / "game_state.json" character_manager.py line 670 - character.json read in load_character()

Character Portraits

All portraits were generated using ChatGPT image generation. Each prompt described the character or creature with a dark dungeon atmosphere, fantasy oil painting style, and portrait orientation. Files are stored in static/portraits/ and named exactly after the class or enemy they represent.

Character Classes (10)

The portrait appears in the character sheet panel on the right and in the status bar at the top of the screen. The updatePortrait(charClass) function in templates/index.html loads the file by converting the class name to title case and appending .png.

Barbarian
Bard
Cleric
Druid
Fighter
Monk
Paladin
Ranger
Rogue
Wizard

Enemy Portraits (16)

When combat starts, the updateEnemyPortrait(name) function in templates/index.html looks the enemy name up in ENEMY_PORTRAIT_NAMES, builds a URL like /static/portraits/Dungeon%20Troll.png, and shows the portrait in the center panel above the dice strip. It disappears when the enemy is defeated. Filenames match the enemy name exactly, spaces included.

Cave Rat Swarm (minor, floor 1)
Goblin Scout (minor, floor 1)
Skeleton Warrior (minor, floor 1)
Shadow Creeper (minor, floor 2)
Zombie Brute (minor, floor 2)
Mimic (minor, floor 2)
Stone Sentinel (minor, floor 3)
Berserker Cultist (minor, floor 3)
Wight Soldier (minor, floor 3)
Vampire Spawn (minor, floor 4)
Bone Wyvern (minor, floor 4)
Hell Cultist (minor, floor 4)
Dungeon Troll (boss, floor 1)
Shadow Wraith (boss, floor 2)
Stone Golem (boss, floor 3)
Bone Dragon (boss, floor 4)

Music

All eight background music tracks were generated using Suno.com based on written prompts. Each track was designed for a specific game state and loops seamlessly except for the two ending themes.

About Page Theme 0. Credit Page.mp3
Village Hub Theme 1. Village Hub Theme.mp3
Dungeon Exploration 2. Dungeon Exploration.mp3
Regular Combat 3. Regular Combat.mp3
Boss Battle 4. Boss Battle.mp3
Mystical Menu Theme 5. Mystical Menu Theme.mp3
Victory Ending Theme 6. Victory Ending Theme.mp3
Sad Ending Theme 7. Sad Ending Theme.mp3

Sound Effects

All short sound effects were generated using ElevenLabs based on written prompts describing each sound in detail.

Dice Rolldice_roll.mp3
Combat Hitcombat_hit.mp3
Combat Misscombat_miss.mp3
Death Stingdeath_sting.mp3
Door Opendoor_open.mp3
Level Uplevel_up.mp3
Treasure Foundtreasure.mp3

Technology

Backend

Python, Flask, Flask-SocketIO

AI Game Master

NVIDIA nemotron-super-120b via OpenAI-compatible SDK

Frontend

Vanilla JavaScript, HTML, CSS

Game Rules

D and D 5th Edition (server-authoritative)

Fonts

Cinzel Decorative, IM Fell English, Crimson Text

Portraits

ChatGPT image generation

Background Music

Suno.com

Sound Effects

ElevenLabs

Deployment

Docker, HuggingFace Spaces

Large File Storage

Git LFS (audio and image assets)

During local development the server ran on localhost:5000 using Flask-SocketIO's built-in Werkzeug dev server with debug mode on. That configuration is intentionally minimal and only accepts connections from the same machine.

For the HuggingFace Spaces deployment three things changed. The host was switched to 0.0.0.0 so the container accepts external traffic. The port was changed to 7860, which is the port HuggingFace exposes by default for Docker Spaces. And allow_unsafe_werkzeug=True was added to the socketio.run() call because newer versions of Flask-SocketIO block the Werkzeug server in any environment that does not explicitly opt in. Without that flag the app crashes on startup in production.

The audio tracks and portrait images total over 100 MB. GitHub and HuggingFace both reject pushes containing files larger than 10 MB unless Git LFS is used. All .mp3, .wav, .png, and .jpg files are tracked through Git LFS so only lightweight pointer files live in the repository history while the actual binaries are stored separately.

Created by SKMMT