mario

Key facts

Subjects

5 — sub-01 ✅ · sub-02 ✅ · sub-03 ✅ · sub-04 ❌ · sub-05 ✅ · sub-06 ✅

Tasks

🎮 Super Mario Bros — in-scanner gameplay — 22 levels (excluding water and boss levels); discovery phase (all levels in order, unlimited attempts) then practice phase (random level per run); 3374 total level attempts across 5 participants; ~16.8 h fMRI per participant

Data

🧠 fMRI — 18.4 h/subject

🎬 Naturalistic video — 18.4 h/subject

🔊 Audio — 18.4 h/subject

🎮 Gameplay — 18.4 h/subject

🫀 ECG — 18.4 h/subject

🫁 Respiration — 18.4 h/subject

🫀 Pulse — 18.4 h/subject

😓 Skin conductance (EDA) — 18.4 h/subject

🕹️ Game logs (annotated events (.tsv) ✅, game replay (.bk2) ✅, video replay (.mp4) ✅, replay summary (.json) ✅, mapped RAM variables (.json) ✅, low-level features (.npy) ✅, scene annotations ✅)

👁️ Eyetracking (gaze 🔒, pupillometry 🔒)

Assets

📁 BIDS · 🧠 fMRIPrep · 🫀 PhysPrep · 🗺️ Mario scenes

How to cite

Paugam, François, Pinsard, Basile, St-Laurent, Marie, Lajoie, Guillaume, Bellec, Lune (2025). Training neural networks from scratch in a videogame leads to brittle brain encoding. doi: 10.1101/2025.11.28.691119

Contributors

François Paugam 🔣📓 · RainyFields 🔣 · Basile 🎨📆🔣 · emilie-dessureault 🔣📆 · Yann Harel 🎨🔣📓 · Julie A. Boyle 🎨📆 · maelleF 🔣 · Marie-Eve Picard 🔣 · Motahareh 🔣 · Lune Bellec 🎨


🔣 data · 🎨 design · 📆 project management · 📓 user testing

Overview

The mario dataset contains in-scanner gameplay of Super Mario Bros. (SMB; Nintendo, 1985) for five CNeuroMod participants (sub-01, sub-02, sub-03, sub-05, sub-06). Participants played 22 of the original game's 32 levels across two phases: a structured discovery phase, followed by a longer practice phase of randomly selected levels.

Prior gameplay experience varied across participants: sub-01 and sub-06 had previously played SMB; sub-01 and sub-02 were regular videogame players; sub-03 reported no prior videogame experience.

Game environment

Participants used the CNeuroMod fiber-optic MRI controller described in Harel et al. (2023). The game ran on a console emulator via OpenAI’s gym-retro, recorded at 60 Hz. Because the game is fully deterministic, only player inputs were stored; the .bk2 replay files allow exact reconstruction of every play.
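As an illustration of how a replay can be reconstructed from stored inputs, here is a minimal sketch that follows the movie-playback pattern from the gym-retro documentation. It assumes gym-retro is installed and that the game integration referenced by the .bk2 file is importable on your machine; the function names `replay_bk2` and `frame_to_seconds` are illustrative and not part of the dataset tooling.

```python
FRAME_RATE_HZ = 60  # recording rate stated in the text above


def frame_to_seconds(frame_index):
    """Convert an emulator frame index into seconds since replay start."""
    return frame_index / FRAME_RATE_HZ


def replay_bk2(bk2_path):
    """Re-run one repetition from its .bk2 file and return the frame count.

    Sketch only: requires gym-retro plus a matching game integration/ROM.
    """
    import retro  # imported lazily so the helper above works without it

    movie = retro.Movie(bk2_path)
    movie.step()  # load the first frame so the movie metadata is available

    env = retro.make(
        game=movie.get_game(),
        state=None,
        # .bk2 files can contain any button combination, so allow them all
        use_restricted_actions=retro.Actions.ALL,
        players=movie.players,
    )
    env.initial_state = movie.get_state()
    env.reset()

    n_frames = 0
    while movie.step():
        # Rebuild the button vector recorded for this frame
        keys = [
            movie.get_key(button, player)
            for player in range(movie.players)
            for button in range(env.num_buttons)
        ]
        env.step(keys)
        n_frames += 1

    env.close()
    return n_frames
```

Calling `frame_to_seconds(replay_bk2(path))` would then give the approximate length of a repetition in seconds, under the 60 Hz assumption above.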

Run design

We use "run" for a single fMRI acquisition and "repetition" for a single play of a level, from its start to either completion or the loss of all three lives. Each repetition corresponds to exactly one .bk2 replay file. Each repetition began with no power-up and three lives; after a death, the player resumed from the level start or from a checkpoint when one was available in the original level design.

The experiment was structured in two phases:

  • Discovery: levels were played in order. Each level could be attempted an unlimited number of times and had to be completed successfully at least once before the participant moved on to the next.

  • Practice: in the remaining sessions, the level for each repetition was selected at random.
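The two-phase logic can be sketched as a small simulation. This is not the experiment code: `play` is a hypothetical callback that returns True when a repetition ends in level completion, the level names are placeholders, and uniform sampling with replacement in the practice phase is an assumption (the text only says levels were randomly selected).

```python
import random


def discovery_phase(levels, play):
    """Play levels in order, repeating each one until it is cleared once."""
    log = []
    for level in levels:
        cleared = False
        while not cleared:  # unlimited attempts per level
            cleared = play(level)
            log.append((level, cleared))
    return log


def practice_phase(levels, n_repetitions, rng):
    """Draw one level per repetition (assumed uniform, with replacement)."""
    return [rng.choice(levels) for _ in range(n_repetitions)]


# Toy player that fails the first attempt at every level, then succeeds.
attempts = {}


def toy_play(level):
    attempts[level] = attempts.get(level, 0) + 1
    return attempts[level] >= 2


levels = ["level-A", "level-B", "level-C"]  # placeholder names
discovery_log = discovery_phase(levels, toy_play)
practice_levels = practice_phase(levels, 5, random.Random(0))

print(discovery_log)
print(practice_levels)
```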

Levels

22 of the 32 original SMB levels were used. Water levels and boss levels were excluded because their mechanics differ substantially from the rest of the game.
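The count of 22 can be checked with a short enumeration. The sketch below relies on general knowledge of the original game rather than on the dataset itself: it assumes 8 worlds of 4 stages, that the boss (castle) level is stage 4 of each world, and that the water levels are 2-2 and 7-2. The `w{world}l{stage}` label format is hypothetical; check the dataset files for the naming actually used.

```python
WORLDS = range(1, 9)   # worlds 1 to 8
STAGES = range(1, 5)   # stages 1 to 4 in each world
BOSS_STAGE = 4         # assumption: the castle level closes each world
WATER_LEVELS = {(2, 2), (7, 2)}  # assumption: fully underwater levels

all_levels = [(world, stage) for world in WORLDS for stage in STAGES]

kept_levels = [
    (world, stage)
    for world, stage in all_levels
    if stage != BOSS_STAGE and (world, stage) not in WATER_LEVELS
]

# Hypothetical label format, for illustration only
labels = [f"w{world}l{stage}" for world, stage in kept_levels]

print(len(all_levels), len(kept_levels))  # 32 22
```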

Post-run questionnaire

At the end of each run, participants completed a short questionnaire including the items of the Flow Short Scale 2 (FSS-2), plus two additional items aimed at evaluating player fatigue and frustration. These two extra items were introduced after data collection had begun and are therefore absent from the earliest runs.

Per-subject summary

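Durations are given as hh:mm:ss; success rates are proportions of repetitions that ended in level completion.

| Subject | Repetitions (Discovery) | Repetitions (Practice) | Duration (Discovery) | Duration (Practice) | Success rate (Discovery) | Success rate (Practice) | Repetitions (Total) | Success rate (Total) | Duration (Total) |
|---|---|---|---|---|---|---|---|---|---|
| sub-01 | 230 | 567 | 03:54:27 | 09:47:11 | 0.578 | 0.781 | 797 | 0.723 | 13:41:38 |
| sub-02 | 227 | 487 | 04:57:35 | 12:30:24 | 0.401 | 0.671 | 714 | 0.585 | 17:27:59 |
| sub-03 | 176 | 451 | 04:49:38 | 11:57:19 | 0.432 | 0.698 | 627 | 0.624 | 16:46:57 |
| sub-05 | 177 | 457 | 05:30:04 | 12:27:44 | 0.367 | 0.582 | 634 | 0.522 | 17:57:48 |
| sub-06 | 134 | 468 | 04:25:41 | 13:37:40 | 0.627 | 0.857 | 602 | 0.806 | 18:03:22 |
| Total | 944 | 2430 | 23:37:27 | 60:20:19 | 0.481 | 0.718 | 3374 | 0.652 | 83:57:47 |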
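The summary figures can be re-derived from the per-subject values with a few lines of arithmetic. The sketch below copies the numbers from the table above. It suggests that the success rates in the Total row match the unweighted mean of the per-subject rates (to three decimals) rather than a rate pooled over all repetitions; this is an observation from the numbers, not a statement from the dataset authors. Summed durations can differ from the Total row by a few seconds, presumably because of rounding.

```python
def to_seconds(hms):
    """Convert an 'hh:mm:ss' string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# subject: (reps discovery, reps practice, total duration,
#           success rate discovery, practice, total)
TABLE = {
    "sub-01": (230, 567, "13:41:38", 0.578, 0.781, 0.723),
    "sub-02": (227, 487, "17:27:59", 0.401, 0.671, 0.585),
    "sub-03": (176, 451, "16:46:57", 0.432, 0.698, 0.624),
    "sub-05": (177, 457, "17:57:48", 0.367, 0.582, 0.522),
    "sub-06": (134, 468, "18:03:22", 0.627, 0.857, 0.806),
}

n_subjects = len(TABLE)

# Total number of repetitions across phases and subjects
total_reps = sum(row[0] + row[1] for row in TABLE.values())

# Mean gameplay time per participant, in hours
mean_hours = sum(to_seconds(row[2]) for row in TABLE.values()) / n_subjects / 3600

# Unweighted mean of the per-subject success rates (discovery, practice, total)
mean_rates = [
    round(sum(row[col] for row in TABLE.values()) / n_subjects, 3)
    for col in (3, 4, 5)
]

print(total_reps)            # 3374
print(round(mean_hours, 1))  # 16.8
print(mean_rates)            # [0.481, 0.718, 0.652]
```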

Event files and annotations

For each run, a _events.tsv file lists the timing of each repetition. A richer _desc-annotated_events.tsv file provides three categories of events:

  • button presses — every controller input;

  • in-game events — game-state annotations derived from RAM (kills, deaths, power-ups, etc.);

  • replay events — one entry per repetition with trial_type gym-retro_game, indicating which .bk2 replay was played at which onset.

Companion .bk2 replays, .mp4 videos, .json summaries, mapped RAM variables, and low-level visual features are provided alongside the events files.
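As a minimal sketch of how the replay events might be extracted, the example below filters an annotated events table on `trial_type`. Only the `gym-retro_game` value comes from the description above; the sample rows, the other `trial_type` labels, the file names, and the `stim_file` column are assumptions made for illustration (`onset`, `duration`, and `trial_type` are standard BIDS event columns). The sample is read from memory so that the snippet runs without the dataset.

```python
import csv
import io

# Hypothetical excerpt of a *_desc-annotated_events.tsv file.
# Column `stim_file` and the non-replay trial_type labels are assumptions.
SAMPLE_TSV = (
    "onset\tduration\ttrial_type\tstim_file\n"
    "12.00\t95.40\tgym-retro_game\tlevel-w1l1_rep-000.bk2\n"
    "12.35\t0.20\tRIGHT\tn/a\n"
    "14.10\t0.00\tKill/stomp\tn/a\n"
    "110.20\t61.75\tgym-retro_game\tlevel-w1l1_rep-001.bk2\n"
)


def replay_rows(tsv_file):
    """Return the rows that mark the start of a repetition (one per .bk2)."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return [row for row in reader if row["trial_type"] == "gym-retro_game"]


replays = replay_rows(io.StringIO(SAMPLE_TSV))
for row in replays:
    # Onset and duration are stored as text in the TSV, so convert them
    print(row["stim_file"], float(row["onset"]), float(row["duration"]))
```

With a real file, `open(path, newline="")` can replace the `io.StringIO` object; the column holding the replay path should be checked against the actual header.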

In addition, the 22 levels are split into 313 short scenes annotated with 29 design patterns (23 from Dahlskog & Togelius, 2012, plus 6 contextual ones). See SCENES.md and the mario.scenes submodule for details and tooling to generate clip-level metadata, video, and memory dumps for each scene attempt.

Tutorials

The mario.tutorials repository provides a set of Colab-ready Jupyter notebooks that illustrate end-to-end use of the dataset on a single participant / single session, suitable for running on a laptop:

  1. Dataset overview — exploration of the BIDS layout and behavioral annotations.

  2. Event-based analysis — session-level GLM with hand-crafted action and game-event regressors and interpretable contrasts.

  3. Reinforcement learning — training a CNN-based RL agent on the same gameplay and extracting layer activations.

  4. Brain encoding — ridge-regression encoding models that map RL-agent activations onto BOLD signals, comparing layers.

The notebooks adapt and combine methodology from the shinobi_fmri and mario_generalization repositories.

Reference

A detailed description of the dataset and an associated modelling study are available in Paugam et al., bioRxiv 2025.