How I Built SANDBOX with Claude Code

A field guide for shipping a real Expo app in a series of focused conversations. Reusable for any project — copy the structure, replace the specifics.

The premise: most "build an app with AI" guides break down because the agent and the human don't share the same map. The fix is to invest 30 minutes up front making that map (a few markdown files), then turn every conversation into a vertical slice against it. What follows is the exact pattern I used, the prompts that worked, and the gotchas I hit.

0 · The Mindset

Three rules that everything else flows from:

The brief is the project, not the chat. If a fact lives only in a conversation, it's lost. Write it to CLAUDE.md / DESIGN.md / PRD.md / todo.md first.
Build in vertical slices. Each slice should produce something you can open and tap. No 5-day refactors that don't render a pixel.
The agent is a coordinator, not an oracle. It's fast at translating intent into code, slow at making decisions you should be making.

If you only remember one thing: the docs are the prompt. Every chat reads them.

1 · Scaffolding (Expo)

The exact CLI sequence to get to a running blank Android-target Expo app:

# Create the project (a bare --template flag opens a picker; choose the TypeScript tabs template)
npx create-expo-app@latest sandbox --template

cd sandbox

# Start the dev server
npm start
# Press 'a' to launch on an Android emulator/device

Expo's default template already gives you:

expo-router (file-based routing — folders in app/ become routes)
A tabs layout (app/(tabs)/) and a modal (app/modal.tsx)
Reanimated, gesture-handler, safe-area-context, vector-icons pre-installed
TypeScript + ESLint configured

Use npx expo install <pkg> instead of npm install when adding libraries — it picks the version that's compatible with your SDK. Mismatched native libs are the #1 source of "metro started but it crashes on launch" bugs.

For SANDBOX I added these one at a time, only when the slice needed them:

Slice	Packages
Runtime spike	`react-native-webview`
Library + storage	`expo-sqlite`, `expo-file-system`, `expo-document-picker`, `expo-image-picker`, `expo-sharing`, `expo-crypto`
Polish (later)	`nativewind`, `tailwindcss`, `@expo-google-fonts/inter`

2 · The Doc Layer (Most of the Magic)

Before writing any feature code, create these five files: four at the project root and the PRD under plan/. They take ~30 minutes total. They're what makes every subsequent Claude conversation cheap.

2.1 `CLAUDE.md` — the standing brief

What it contains:

One-paragraph product summary
Platform + scope (e.g. "Android only. iOS share extension out of scope.")
Tech stack (installed vs. to-add)
Repo layout (where each kind of file lives)
Workflow rules (lint commands, "ask before X")
Build order — the slice plan
Out-of-scope list

This file is loaded into every Claude conversation. Treat it as the constitution.

2.2 `DESIGN.md` — the visual contract

For SANDBOX I locked: tokens (colors, type sizes, radii), a named variant per screen (LibraryC, SettingsA, DetailsA...), and a "do not change variants without asking" rule. This stops the agent from picking a different layout halfway through.

If your design lives in Figma, link the frame URLs here and paste the key numbers (font sizes, paddings) inline — Claude can't open Figma.

2.3 `AGENT.md` — workflow rules

Short. Defines how the agent should behave:

Keep responses concise
Use sub-agents for research before proposing
Don't rm -rf without asking
Run lint + type-check after feature work

2.4 `plan/PRD.md` — the product spec

User flows, data model, storage layout, success criteria. Even a half-baked PRD beats no PRD because it surfaces decisions you forgot you needed to make.

2.5 `todo.md` — the slice checklist

Two sections: Done (with dates) and Todo (to final product). Group todos by slice. This file is how you say "do the next task" without having to remember what's next.

3 · Slice the Work, Then Slice It Again

My build order for SANDBOX (a chromeless artifact-runner app):

1. Runtime spike   → prove "hello world artifact renders in a WebView"
2. Library + storage → SQLite + file system + grid view
3. Share-in        → Android intent filters → bottom sheet
4. Screens         → Empty / Details / Settings / Dependencies
5. Runtime hardening → bundle libs locally, dependency cache
6. Polish          → dark mode, fonts, motion, haptics

Each slice obeys two rules:

It produces something demoable. Even if it's ugly. Tap-able beats correct-on-paper.
It writes its own gotchas back into the docs. A bug you hit twice is a brief update you forgot to make.

4 · Prompt Patterns That Worked

The actual prompts I used for SANDBOX were short. The docs do the heavy lifting.

4.1 The "next slice" prompt

check todo and do the next task Library

That's it. Because todo.md is loaded and structured by section, the agent knows what "Library" means.

4.2 The "bug report" prompt

check the test jsx in test/workout-plan when i open it with sandbox app
from expo it gives me error: SyntaxError: Cannot use import statement
outside a module
    at new Function (<anonymous>)
    at mountJSX (https://localhost/:57:21)
    ...

Paste the stack trace verbatim. Don't try to diagnose. The line numbers + file references let the agent jump straight to the cause.

4.3 The "scope check" prompt

finish artifact action

Or:

lets do next task - Screens not yet built

When a section has multiple items, the agent should pause and ask whether you want them all in one batch or split. If it doesn't ask, you should ask: "any decisions to make before you start?"

4.4 The "explain + do" prompt

finish Runtime hardening and one more thing why jsx loading is slow now?

Two requests, one prompt. The agent gets to:

Explain the cause (which makes the fix obvious)
Apply the fix as part of the slice it was already doing

4.5 Anti-patterns

❌ "Build the entire app." — produces an unreviewable mess
❌ "What do you think of my code?" — agent will hedge; ask specific questions
❌ Pasting a 200-line file without saying what to do with it — agent will guess; tell it the goal

5 · The Working Loop

For each slice I followed the same micro-loop:

1. Open conversation, paste the slice name
2. Agent reads todo.md + DESIGN.md, proposes a plan
3. (Optional) Agent asks one clarifying question if a path isn't covered
4. Agent writes code, marks tasks done as it goes
5. Agent runs `npx tsc --noEmit` and `npx expo lint` — both must pass
6. Agent updates todo.md to move items to Done with a date
7. Agent summarizes what landed and what's next

Step 5 is non-negotiable. Don't ship a slice that doesn't type-check and lint clean. It's the cheapest possible signal that you haven't broken something orthogonal.

6 · Reusable Patterns I Discovered

These came up in multiple slices. Worth pre-deciding so the agent doesn't reinvent each one.

6.1 The shared-constants file

When two screens reused the same emoji palette, I extracted it to lib/icons.ts:

// lib/icons.ts
export const EMOJIS = ['✶', '✓', '◔', '♞', '$', '⚡', '◉', /* ... */];

Now app/share.tsx and components/change-icon-sheet.tsx both import it. Pattern: two callers = one shared module.

6.2 The bottom-sheet recipe

Every "select something, save, dismiss" UI in this app is the same shape:

<Modal visible={...} transparent animationType="slide" onRequestClose={onCancel}>
  <View style={styles.scrim}>
    {/* Tapping the scrim outside the sheet cancels */}
    <Pressable style={StyleSheet.absoluteFillObject} onPress={onCancel} />
    <View style={styles.sheet}>
      <View style={styles.grabber} />
      {/* Header is the app's own Cancel / Save bar */}
      <Header cancel onCancel={onCancel} save onSave={onSave} />
      {/* sheet body */}
    </View>
  </View>
</Modal>

Once you've shipped one, every subsequent sheet is a paste-and-rename. Codify the shape early.

6.3 The icon component with type-switching

<AppIcon
  glyph={a.iconType === 'emoji' ? a.iconValue : ''}
  imageUri={a.iconType === 'image' ? a.iconValue : undefined}
  size={42}
/>

The component decides what to render based on which prop is set. Callers never choose between an image and a text element; they only map the artifact metadata onto props. Pattern: let the leaf component do the rendering switch.
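A minimal sketch of the switch that might live inside such a leaf component, assuming the two props shown above. The `IconRender` type and the `resolveIconRender` name are hypothetical and not taken from the SANDBOX source; a real component would return an image or text element where this returns a tagged value.

```typescript
// Hypothetical prop shape, mirroring the <AppIcon> call above.
type AppIconProps = { glyph: string; imageUri?: string; size: number };

// What the leaf decides to draw. A real component would return JSX here.
type IconRender =
  | { kind: 'image'; uri: string; size: number }
  | { kind: 'glyph'; text: string; fontSize: number };

export function resolveIconRender({ glyph, imageUri, size }: AppIconProps): IconRender {
  // An image wins whenever a URI is present; otherwise fall back to the glyph.
  if (imageUri) return { kind: 'image', uri: imageUri, size };
  // Scale the glyph relative to the tile; 0.5 is an arbitrary illustrative ratio.
  return { kind: 'glyph', text: glyph, fontSize: Math.round(size * 0.5) };
}
```

Because the decision sits in one place, adding a third icon type later would mean touching this function rather than every grid and sheet that draws an icon.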

6.4 The DAO module

Every database table got a lib/<table>.ts module exporting plain async functions: listX(), getX(id), createX(input), updateX(id, patch), deleteX(id). Nothing fancier than that. Screens import the functions they need.
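To make the shape concrete, here is a rough sketch of one such module. It is an illustration only: the `Artifact` fields are guesses, and an in-memory `Map` stands in for the SQLite table so the example runs anywhere. In the app, each function body would issue a query instead.

```typescript
// Hypothetical row shape; the real artifacts table likely has more columns.
export type Artifact = { id: string; title: string; iconType: 'emoji' | 'image'; iconValue: string };
export type ArtifactInput = Omit<Artifact, 'id'>;

// In-memory stand-in for the SQLite table (an assumption made for this sketch).
const rows = new Map<string, Artifact>();
let nextId = 1;

export async function listArtifacts(): Promise<Artifact[]> {
  return Array.from(rows.values());
}

export async function getArtifact(id: string): Promise<Artifact | null> {
  const row = rows.get(id);
  return row === undefined ? null : row;
}

export async function createArtifact(input: ArtifactInput): Promise<Artifact> {
  const artifact: Artifact = { id: String(nextId++), ...input };
  rows.set(artifact.id, artifact);
  return artifact;
}

export async function updateArtifact(id: string, patch: Partial<ArtifactInput>): Promise<Artifact | null> {
  const current = rows.get(id);
  if (!current) return null; // nothing to update
  const updated: Artifact = { ...current, ...patch };
  rows.set(id, updated);
  return updated;
}

export async function deleteArtifact(id: string): Promise<boolean> {
  return rows.delete(id); // true when a row was actually removed
}
```

Screens only ever see these five functions, so swapping the storage engine later would not ripple into the UI.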

6.5 Additive SQLite migrations via `PRAGMA table_info`

When v1.2 needed an icon_fill column on existing artifact rows, I didn't bump a schema version or write a migration runner. Just check what's there and ALTER TABLE if missing:

const cols = await db.getAllAsync<{ name: string }>('PRAGMA table_info(artifacts)');
if (!cols.some((c) => c.name === 'icon_fill')) {
  await db.execAsync('ALTER TABLE artifacts ADD COLUMN icon_fill TEXT');
}

Same for adding the prefs table later — CREATE TABLE IF NOT EXISTS. No version bookkeeping, no broken users.

Pattern: for a single-user app, additive migrations + IF NOT EXISTS cover most schema changes. Keep the version-table machinery for when you actually need it.
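If the column check shows up more than once, it can be folded into a helper. The sketch below is one possible shape, not the SANDBOX code: `ensureColumn` and the `MigrationDb` interface are names invented here, with the interface modeled on the two database calls used in the snippet above.

```typescript
// The two database calls used above, reduced to a minimal interface
// (hypothetical; shaped after the getAllAsync / execAsync calls in the text).
interface MigrationDb {
  getAllAsync<T>(sql: string): Promise<T[]>;
  execAsync(sql: string): Promise<void>;
}

// Add `column` to `table` only if it is missing. Safe to run on every boot.
// Identifiers are interpolated into SQL, so pass hard-coded names only, never user input.
export async function ensureColumn(
  db: MigrationDb,
  table: string,
  column: string,
  definition: string,
): Promise<boolean> {
  const cols = await db.getAllAsync<{ name: string }>(`PRAGMA table_info(${table})`);
  if (cols.some((c) => c.name === column)) return false; // already migrated
  await db.execAsync(`ALTER TABLE ${table} ADD COLUMN ${column} ${definition}`);
  return true; // column was added on this run
}
```

A call such as `await ensureColumn(db, 'artifacts', 'icon_fill', 'TEXT')` would then replace the inline check, and the boolean result is handy for logging which migrations actually ran.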

6.6 Hook + non-hook variants for the same logic

useIconProps(item) is the natural API — but you can't call it inside .map((a) => <AppIcon {...useIconProps(a)} />), which is exactly what every grid wants to do. Solution: ship two functions.

export function iconPropsFor(item, scheme) { /* pure */ }
export function useIconProps(item) { return iconPropsFor(item, useScheme()); }

Components that already have a scheme (because they call useScheme() once at the top) use the non-hook form inside loops. Single-use callers use the hook.

Pattern: any hook that derives from a single use* call should also expose the pure version it wraps. Costs nothing, unblocks the only place hooks bite you.
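A typed sketch of the pure half, under a few assumptions: the `Scheme`, `IconItem`, and `IconProps` types and the default fill colors are invented for illustration, and `iconPropsForAll` is a hypothetical helper that shows the loop case. The hook wrapper stays the one-liner shown above.

```typescript
type Scheme = 'light' | 'dark';
type IconItem = { iconType: 'emoji' | 'image'; iconValue: string; iconFill?: string };
type IconProps = { glyph: string; imageUri: string | undefined; fill: string };

// Hypothetical per-scheme default fills; real values would come from the design tokens.
const DEFAULT_FILL: Record<Scheme, string> = { light: '#F2F2F2', dark: '#1C1C1E' };

// Pure: no hooks inside, so it is safe to call in loops and callbacks.
export function iconPropsFor(item: IconItem, scheme: Scheme): IconProps {
  return {
    glyph: item.iconType === 'emoji' ? item.iconValue : '',
    imageUri: item.iconType === 'image' ? item.iconValue : undefined,
    fill: item.iconFill === undefined ? DEFAULT_FILL[scheme] : item.iconFill,
  };
}

// Loop case: the component reads the scheme once at the top (via its hook)
// and passes the plain value down for every item.
export function iconPropsForAll(items: IconItem[], scheme: Scheme): IconProps[] {
  return items.map((item) => iconPropsFor(item, scheme));
}
```

The pure function is also the part worth unit-testing, since it needs no renderer to exercise.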

6.7 Settings as a context, prefs as the persistence

For app-wide settings (theme override, network toggle), the shape that worked:

lib/prefs.ts — pure SQLite key/value DAO. No React.
lib/settings-context.tsx — <SettingsProvider> loads prefs once into state, exposes setters that update both state and SQLite.
useSettings() returns a fallback object if the provider is missing — never throws (see §7.7).

Hooks like useEffectiveScheme() (theme override + system fallback) live in the provider file too. Anything else that needs to read prefs synchronously outside React (e.g. dep-fetcher checking allow_network) calls getPref() directly from SQLite.

Pattern: persistent prefs split into a stateless DAO + a React-state mirror in a provider. Don't let one layer do both jobs — it tangles your reads and writes.
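The split can be sketched without React at all. Everything below is illustrative: `PrefsDao`, `createSettingsMirror`, and the `theme_pref` key are assumed names (only `allow_network` appears in the text above), and in the app the mirror would live inside the provider using component state rather than a closure variable.

```typescript
// Stateless persistence layer: key/value only, no caching (hypothetical shape).
interface PrefsDao {
  getPref(key: string): Promise<string | null>;
  setPref(key: string, value: string): Promise<void>;
}

type ThemePref = 'system' | 'light' | 'dark';
type SettingsState = { themePref: ThemePref; allowNetwork: boolean };

const DEFAULTS: SettingsState = { themePref: 'system', allowNetwork: true };

// State mirror: loads once, then every setter updates memory and storage together.
export async function createSettingsMirror(dao: PrefsDao) {
  const theme = await dao.getPref('theme_pref');
  const net = await dao.getPref('allow_network');
  let state: SettingsState = {
    themePref: theme === 'light' || theme === 'dark' ? theme : DEFAULTS.themePref,
    allowNetwork: net === null ? DEFAULTS.allowNetwork : net === '1',
  };
  return {
    get: (): SettingsState => state,
    async setThemePref(themePref: ThemePref): Promise<void> {
      state = { ...state, themePref }; // memory first, so readers see it immediately
      await dao.setPref('theme_pref', themePref);
    },
    async setAllowNetwork(allowNetwork: boolean): Promise<void> {
      state = { ...state, allowNetwork };
      await dao.setPref('allow_network', allowNetwork ? '1' : '0');
    },
  };
}
```

The DAO never caches and the mirror never builds SQL, which keeps each layer small enough to test on its own.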

7 · Real Errors I Hit (and the Fixes)

These are the bugs that ate the most time. Each one is a brief I wish I'd had on day one.

7.1 `SyntaxError: Cannot use import statement outside a module`

Cause: Babel Standalone was configured with modules: false, which leaves ES module syntax untouched. Then new Function(transformedCode) rejected the import statement.

Fix: Switch the preset to modules: 'commonjs' and inject a require shim that returns the preloaded React/ReactDOM globals.

// inside the runtime shell
var out = Babel.transform(source, {
  presets: [['env', { modules: 'commonjs' }], 'react'],
}).code;

// require shim: only the preloaded globals are resolvable
var require = function (name) {
  if (name === 'react') return React;
  if (name === 'react-dom' || name === 'react-dom/client') return ReactDOM;
  throw new Error('Module not available: ' + name);
};
var moduleObj = { exports: {} };

// factory now reads from module.exports.default so `export default X` works
// (the parameter list is inferred from the call below)
var factory = new Function(
  'React', 'ReactDOM', 'require', 'module', 'exports',
  out + '\nreturn module.exports.default;'
);
var Component = factory(React, ReactDOM, require, moduleObj, moduleObj.exports);

Lesson: when you bridge to a dynamic-eval runtime, decide upfront how it talks to module syntax. The default Babel preset isn't set up for it.
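The mechanism is easy to reproduce outside the WebView. The sketch below is a stand-alone illustration, not the SANDBOX runtime: `FakeReact` is a stub, `loadDefaultExport` is a made-up name, and the `transformed` string is a hand-simplified stand-in for what a CommonJS transform emits (real Babel output carries extra interop helpers).

```typescript
// Stand-in for the React global the shell preloads; only createElement is faked.
const FakeReact = {
  createElement: (type: string, _props: unknown, ...children: unknown[]) => ({ type, children }),
};

// Roughly what a CommonJS transform leaves behind for:
//   import React from 'react'; export default function App() { return <p>hi</p>; }
const transformed = `
  var React = require('react');
  exports.default = function App() { return React.createElement('p', null, 'hi'); };
`;

// Only modules the shell has preloaded can be required.
function shimRequire(name: string): unknown {
  if (name === 'react') return FakeReact;
  throw new Error('Module not available: ' + name);
}

// new Function evaluates the string as a function body. A raw `import`
// statement is a syntax error there, while require/exports are just variables.
export function loadDefaultExport(code: string): unknown {
  const moduleObj: { exports: Record<string, unknown> } = { exports: {} };
  const factory = new Function('require', 'module', 'exports', code + '\nreturn module.exports.default;');
  return factory(shimRequire, moduleObj, moduleObj.exports);
}
```

Passing untransformed source that still contains an `import` line to `loadDefaultExport` fails while the function is being constructed, before any of the code runs, which matches the stack trace pointing at `new Function`.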

7.2 `expo-image-picker`'s `MediaTypeOptions` deprecation

Old API:

mediaTypes: ImagePicker.MediaTypeOptions.Images

New API (SDK 52+):

mediaTypes: ['images']

Lesson: check node_modules/<pkg>/build/*.d.ts for @deprecated comments before using anything from memory. Library APIs drift faster than your habits.

7.3 Metro doesn't bundle `.js` files in `assets/`

I needed to ship React, ReactDOM, Babel Standalone, and Tailwind locally for offline use. Putting them as .js files in assets/runtime/ doesn't work — Metro treats them as source code and tries to parse them.

Fix: rename the libs to a non-.js extension and register it in Metro's asset list.

// metro.config.js
const { getDefaultConfig } = require('expo/metro-config');
const config = getDefaultConfig(__dirname);
config.resolver.assetExts = [...config.resolver.assetExts, 'html', 'bundle'];
module.exports = config;

Now require('../assets/runtime/react.bundle') returns an Asset module ID. At runtime:

import { Asset } from 'expo-asset';
import * as FileSystem from 'expo-file-system/legacy';

const asset = Asset.fromModule(require('../assets/runtime/react.bundle'));
await asset.downloadAsync();
const source = await FileSystem.readAsStringAsync(asset.localUri!);

Lesson: Metro's bundler is config-driven, not magic. If a file isn't loading, check resolver.assetExts.

7.4 Inlining JS into HTML breaks if the JS contains `</script>`

When you inline Babel Standalone's 3 MB minified bundle inside a <script> tag, somewhere in there is a literal containing the characters </script, and the HTML parser treats it as the closing tag. Everything after that point spills into the page as text.

Fix: escape it before injection.

function escapeForScript(src: string): string {
  return src.replace(/<\/(script|style)/gi, '<\\/$1');
}

Lesson: if you're concatenating user-supplied or third-party content into HTML, escape the tag boundary.
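A small usage sketch of the escape, repeated here so the example is self-contained. `inlineScript` and the `risky` sample are hypothetical; the point is that a backslash before the slash is invisible to JavaScript inside a string or regex literal but hides the sequence from the HTML parser.

```typescript
function escapeForScript(src: string): string {
  return src.replace(/<\/(script|style)/gi, '<\\/$1');
}

// Wrap third-party source in a script tag without letting it end the tag early.
export function inlineScript(src: string): string {
  return '<script>' + escapeForScript(src) + '</script>';
}

// A library line that would otherwise terminate the tag in the middle of a string.
const risky = 'var closer = "</script>"; var other = "</STYLE>";';
const html = inlineScript(risky);

// After escaping, the only literal closing tag left is the one appended at the very end.
const firstClose = html.toLowerCase().indexOf('</script');
const endsCleanly = firstClose === html.length - '</script>'.length;
```

Since `"<\/script>"` and `"</script>"` are the same string once JavaScript parses them, the library behaves identically after escaping.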

7.5 `reanimated-color-picker`'s `onComplete` is worklet-only

When I added a custom color picker for the Color tab, dragging the wheel and releasing force-closed Expo Go. No JS error — the bridge just died.

The library exposes two pairs of callbacks: onChange / onComplete (worklet) and onChangeJS / onCompleteJS (JS thread). Calling setColorState(c.hex) from a worklet — without runOnJS — is a hard crash on the native side.

Fix:

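import ColorPicker, { HueSlider, Panel1, Preview } from 'reanimated-color-picker';

<ColorPicker
  value={draft}
  onCompleteJS={(c) => setDraft(c.hex)}  // JS-thread variant; safe to call setState
>
  <Preview /><Panel1 /><HueSlider />
</ColorPicker>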

Lesson: any reanimated-based gesture library that exposes both worklet and JS callbacks is telling you they're not interchangeable. If your handler touches React state, you need the JS variant (or wrap with runOnJS).

7.6 Gesture-handler context dies inside RN's `Modal`

The same color picker still crashed inside the <ChangeIconSheet> modal. <Modal> mounts a separate native window; it doesn't inherit the GestureHandlerRootView that expo-router installs at app root.

Fix: wrap the modal contents in their own root.

import { GestureHandlerRootView } from 'react-native-gesture-handler';

<Modal visible={open} transparent>
  <GestureHandlerRootView style={{ flex: 1 }}>
    {/* picker, sheet body */}
  </GestureHandlerRootView>
</Modal>

Lesson: anywhere you create a new native view tree (Modal, popup window), you have to re-install the providers that gesture-handler and reanimated need.

7.7 Throwing from `useContext(Ctx)` crashes during fast refresh

I had useSettings() throw 'SettingsProvider missing' if the context was null. That fired on first launch — even though <SettingsProvider> clearly wrapped the tree. Cause: under Metro fast refresh (and in some expo-router edge mounts) the consumer module and the provider module can briefly hold different Ctx symbols, so useContext returns null for a frame.

Fix: never throw from a hook. Return a safe default.

const FALLBACK: SettingsState = { themePref: 'system', /* ... */ };

export function useSettings(): SettingsState {
  return useContext(Ctx) ?? FALLBACK;
}

Lesson: treat context defaults as the contract, not as a "should never happen" path. The minute you throw, hot-reload becomes a crash machine.

7.8 RN→WebView bridge is slow for huge `source.html` strings

After bundling 3.5 MB of libs into the runtime shell, every artifact open got noticeably slower. The cost was the RN bridge serializing the giant string to the WebView.

Fix: write the shell to documentDirectory/runtime/shell-{fingerprint}.html once, then point the WebView at the file URI.

<WebView
  source={{ uri: shellUri }}  // instead of { html: shellHtml }
  allowFileAccess
  allowFileAccessFromFileURLs
  allowUniversalAccessFromFileURLs
/>

Warm the file on app boot in your root _layout.tsx so the first artifact tap doesn't pay the write cost.

Lesson: the RN bridge is JSON-stringifying everything that crosses it. If you can hand the WebView a path instead of a payload, do that.
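One way the write-once step could look, sketched under assumptions: the text does not say how the fingerprint is computed, so an FNV-1a-style hash over UTF-16 code units is swapped in here, and `ShellFs` is a hypothetical two-method interface standing in for the file-system calls the app would use.

```typescript
// Minimal file API for the sketch (hypothetical; the app would back this with its file-system module).
interface ShellFs {
  exists(uri: string): Promise<boolean>;
  write(uri: string, contents: string): Promise<void>;
}

// FNV-1a-style 32-bit hash as a cheap content fingerprint (an assumption, see above).
export function fingerprint(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  const hex = (hash >>> 0).toString(16);
  return ('0000000' + hex).slice(-8); // left-pad to 8 hex digits
}

// Write the shell once per distinct content and return the URI to hand to the WebView.
// `dir` is expected to end with a slash.
export async function ensureShellFile(fs: ShellFs, dir: string, html: string): Promise<string> {
  const uri = `${dir}runtime/shell-${fingerprint(html)}.html`;
  if (!(await fs.exists(uri))) await fs.write(uri, html);
  return uri;
}
```

Keying the file name on content means a library upgrade produces a new file automatically, while an unchanged shell costs only an existence check on later boots.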

8 · Performance Triage (When Things Get Slow)

In order of cost-vs-impact:

Where is the time going? Add console.time / console.timeEnd around the suspected block. Don't guess.
Is something rebuilding that shouldn't? Module-scope cache the result.
Is the bridge involved? WebView, AsyncStorage, native module: every crossing is a serialization cost. Batch payloads, and hand off URIs, not strings.
Is React re-rendering? Pull state out of the parent if children don't need it; memoize derived values with useMemo.

Specific to WebView-runtime apps: keep the WebView alive across artifact views if you can. WebView cold-start + Babel parse is your biggest fixed cost.
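The first two triage steps fit in a few lines. This is a generic sketch rather than SANDBOX code: `getShellOnce` and `timed` are made-up names, and the builder passed in is whatever expensive step you suspect.

```typescript
// Module-scope slot: survives re-renders and screen changes for the life of the JS runtime.
let shellPromise: Promise<string> | null = null;

// Build once and share the same promise with every caller, including concurrent ones.
export function getShellOnce(build: () => Promise<string>): Promise<string> {
  if (shellPromise === null) {
    shellPromise = build().catch((err) => {
      shellPromise = null; // let the next caller retry after a failure
      throw err;
    });
  }
  return shellPromise;
}

// Wrap any async step to see where the time goes before optimizing it.
export async function timed<T>(label: string, work: () => Promise<T>): Promise<T> {
  console.time(label);
  try {
    return await work();
  } finally {
    console.timeEnd(label); // logs the elapsed time under the label
  }
}
```

Caching the promise instead of the resolved value matters when two screens ask at the same moment: both wait on one build rather than starting two.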

9 · Files I Actually Used (Annotated)

This is what the SANDBOX repo looks like, with a one-line "why" per file:

sandbox/
├── CLAUDE.md          ← project memory; loaded every chat
├── DESIGN.md          ← locked visual variants
├── AGENT.md           ← workflow rules
├── todo.md            ← slice checklist
├── workflow.md        ← this file
├── plan/
│   └── PRD.md         ← spec; flows, data model, success criteria
│
├── app/                                 ← expo-router routes
│   ├── _layout.tsx                       ← root stack; storage + runtime warm
│   ├── (tabs)/
│   │   ├── _layout.tsx                    ← bottom tabs (Library, Settings)
│   │   ├── index.tsx                      ← Library — grid/list + long-press menu
│   │   └── settings.tsx                   ← Settings A
│   ├── share.tsx                          ← Receive-share bottom sheet
│   ├── run/[id].tsx                       ← Artifact runner (chromeless + control bar)
│   ├── details/[id].tsx                   ← Details A grouped table
│   └── dependencies.tsx                   ← Dependencies B + C overlay
│
├── components/
│   ├── app-icon.tsx                       ← rounded tile; emoji or image
│   ├── artifact-runner.tsx                ← WebView wrapper; postMessage bridge
│   └── change-icon-sheet.tsx              ← emoji grid + camera-roll picker
│
├── lib/
│   ├── storage.ts                         ← SQLite init + dirs + reset + size calc
│   ├── artifacts.ts                       ← Artifact DAO
│   ├── dependencies.ts                    ← Dependency DAO
│   ├── icons.ts                           ← shared emoji palette
│   ├── import-detector.ts                 ← regex pass for `import`/`require`
│   ├── dep-fetcher.ts                     ← CDN whitelist; ensure + link deps
│   ├── runtime-assets.ts                  ← load + cache libs, write shell to disk
│   ├── runtime-shell.ts                   ← HTML builder; postMessage protocol
│   └── sample-artifact.ts                 ← canned source for the first-run demo
│
├── assets/
│   └── runtime/
│       ├── react.bundle                   ← React UMD (committed)
│       ├── react-dom.bundle               ← ReactDOM UMD (committed)
│       ├── babel.bundle                   ← Babel Standalone (committed, 3 MB)
│       └── tailwind.bundle                ← Tailwind Play CDN (committed)
│
└── scripts/
    └── fetch-runtime-libs.sh              ← re-fetch libs at pinned versions

The shape that matters:

One folder per concern (app/, components/, lib/, assets/, scripts/).
Routes drive layout — every screen is a file in app/. Don't sneak screens into components/.
lib/ is the brain — every screen imports from lib/, never from another screen.
assets/ is for things Metro doesn't compile — fonts, images, the runtime libs.
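As an example of how small a lib/ module can be, here is a guess at what a regex-based import detector might look like. The tree above only says it is a "regex pass", so the pattern and the `detectImports` name below are assumptions, and the approach is deliberately approximate.

```typescript
// Matches `import ... from 'x'`, side-effect `import 'x'`, and `require('x')`.
// A regex pass is approximate: it will also match inside comments and strings,
// and it misses minified forms with no whitespace after `import`.
const IMPORT_RE = /\bimport\s+(?:[^'";]*?\sfrom\s+)?['"]([^'"]+)['"]|\brequire\(\s*['"]([^'"]+)['"]\s*\)/g;

// Return the unique bare specifiers (package names), skipping relative and absolute paths.
export function detectImports(source: string): string[] {
  const found = new Set<string>();
  let match: RegExpExecArray | null;
  IMPORT_RE.lastIndex = 0; // the regex is global, so reset between calls
  while ((match = IMPORT_RE.exec(source)) !== null) {
    const spec = match[1] || match[2];
    if (spec && !spec.startsWith('.') && !spec.startsWith('/')) found.add(spec);
  }
  return Array.from(found);
}
```

Nothing in it touches React or the file system, which is what lets any screen (or a test) import it freely.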

10 · The Validation Step (Don't Skip)

After every slice:

npx tsc --noEmit     # type-check the whole project
npx expo lint        # ESLint with Expo defaults

These take 5-15 seconds and they catch:

Forgotten imports
Prop type drift
Renamed-but-not-updated callers
Unused state, missing keys, illegal hooks

If either fails, fix it before claiming the slice is done. Compiled-and-linted ≠ correct, but uncompiled = definitely broken somewhere.

For UI changes, add the manual step: open the screen on a device, tap through the golden path. Don't trust an LLM's claim that "this works" — it didn't run the app.

11 · What Goes in Memory (and What Doesn't)

If you're using Claude Code with the auto-memory system, save:

User profile — what you do, what your stack is, how detailed you want responses
Feedback — corrections that should apply to future work ("don't add comments unless WHY is non-obvious")
Project context — the why behind in-flight work, decisions made
External references — Linear projects, Slack channels, dashboards

Don't save:

Code patterns (the code itself is the source of truth)
Git history (git log is the source of truth)
Bug recipes (the fix is in the commit)
Recent activity (it'll be stale tomorrow)

12 · A Quick Replay of How SANDBOX Got Built

So you can see what a real session timeline looks like:

Slice	Prompt	What landed
0	(none)	Expo template, doc layer
1	"let's do the runtime spike"	WebView shell, sample artifact, hello-world render
2	"check todo and do the next task Library"	SQLite + DAO, grid view, `+` document picker
3	"next: share-in"	Android intent filters, share bottom sheet
3.5	"tap → run real artifact"	wire `/run/[id]` to FS source + `touchOpened`
—	(bug report with stack trace)	Babel `modules:'commonjs'` fix
4	"next: list view + view toggle + search + long-press menu"	Library polish, rename modal, delete confirm
5	"finish artifact actions"	Change-icon sheet, share-out via expo-sharing
6	"next: screens not yet built"	Empty A, Details A, Settings A, Dependencies B + C
7	"let's do runtime hardening"	Bundle libs, import detector, dep fetcher, link + load
8	"finish runtime hardening + why is it slow?"	Shell-to-disk perf fix, control bar, offline banner
9 (v1.1)	"design changes for v1.1, ask questions"	Lucide glyph set, sliding view-toggle pill, theme-aware tab bar, header + search restyle
10 (v1.2)	"color in change icon menu, muted naturals"	Tabs `[Icon
11 (v1.2)	"let's start working with settings"	Prefs table, settings context, appearance / about / credits / storage / logs sub-screens, real network toggle

Total: 11 numbered slices (plus one half-slice and one bug-fix detour) to a v1.2 app. Each conversation kept its own focused scope.

13 · The Underlying Principles

Strip away the specifics and what's left:

Write the brief once, read it every chat. Markdown is the cheapest agent state.
Vertical slices over horizontal layers. Always have something to tap.
One decision per question. When the agent asks you a thing, answer with intent, not vibes.
Lint + type-check is the slice gate. Below that bar isn't done.
Every gotcha goes back into the brief. A bug you hit twice is documentation you didn't write.
Pre-decide the patterns. Bottom sheets, modals, DAOs, icon components — codify the shape once.
The agent is fastest when it has the smallest decision to make. Narrow the question, narrow the answer.

That's the whole system. The rest is execution.

Made while building SANDBOX, an Android-only mini-app runtime for AI-generated artifacts. Adapt freely.
