
For all the glitz and glamour Claude Code holds for most people, there's a lot of system maintenance going on behind the scenes.
Every day I document the changes I make within my system. Without this process, the whole thing falls apart.
95% of Claude Code success is folder design. Not prompts, not skills, not agents, not MCPs (those are the layers on top that make the system efficient to use, but they don't provide the context and infrastructure that make it effective). Folder structure, progressive disclosure, and context architecture: the invisible infrastructure nobody talks about because it's not as sexy as "build an agent".
Why Folder Structure Is Everything
Claude Code reads context from the folder you open and nothing else. Your brilliant skill in another folder is invisible. Your perfect context file in the wrong location never gets loaded. The folder structure IS the context architecture.
Every file placement is a context decision.
In Claude Code demos, people see the magic moment where the skill runs and the output appears. They don't see the time spent designing and re-architecting the folder structure, deciding what goes where, building the progressive disclosure index, and testing which context loads help versus hurt. This is the upfront grunt work required to make Claude Code hum. It's the invisible 95%.
The Librarian Problem
Think of working with Claude Code like being a librarian in an infinitely expanding library (more on this analogy here https://www.linkedin.com/pulse/ais-big-misconception-better-prompts-more-context-best-mike-bayly-ndu8e/?trackingId=ohUG3XZKQs%2BxdwKtD1lOLg%3D%3D).
When you start, you just throw books on shelves and it works.
Then you get more books and suddenly you can't find anything.
So you create a catalogue.
Then you need categories.
Then you realise some books belong in multiple categories.
Then you hire an assistant (the AI) but discover the assistant can only help if they can find the catalogue.
And the catalogue only works if it's accurate.
And it's only accurate if someone updates it.
This is the core problem with Claude Code that nobody warns you about. The AI is only as good as its ability to find what it needs, and its ability to find what it needs is entirely determined by how you've organised your files.
The solution is something called progressive disclosure. You don't load everything at once. You load in layers. A lightweight pointer file points to an index, the index points to a more detailed search, and the search retrieves the actual content.
My vault level instruction file used to be 2,000 tokens, loaded automatically in every single conversation, even when I just wanted to edit one file. Wasteful, consuming context and inviting context rot (effectively a reduction in output quality as the AI's context fills with noise). Now it's 200 tokens: a pointer that says "here's where to find the detailed protocol if you need it". A 90% token reduction. The full protocol only loads when the AI needs it.
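In code terms, the pattern is tiny. Here's a minimal sketch (the file names and function are illustrative, not my actual setup) of loading the lightweight pointer by default and only following it to the full protocol when a task needs the detail:

```python
from pathlib import Path

def load_context(pointer: Path, protocol: Path, needs_detail: bool) -> str:
    """Progressive disclosure: always load the lightweight pointer file,
    and only follow it to the full protocol when the task needs it."""
    context = pointer.read_text()
    if needs_detail:
        # Second layer: pull in the detailed protocol on demand.
        context += "\n" + protocol.read_text()
    return context
```

Most conversations never trigger the second read, which is where the token savings come from.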
Every unnecessary token is confusion. Progressive disclosure is clarity.
My Daily Maintenance Workflow
On top of the folder structure, I've built a maintenance layer that runs every day.
When I start a new Claude Code session, a hook called session-init runs automatically in the background. It regenerates my skill indexes, updates my system state file, and cleans up any ephemeral files older than 7 days. I never have to think about it. The system maintains itself.
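The cleanup step of such a hook is simple to sketch. Here's roughly what "delete ephemeral files older than 7 days" looks like (the function name is my own; a real hook would also regenerate the indexes):

```python
import time
from pathlib import Path

def clean_ephemeral(folder: Path, max_age_days: int = 7) -> list[str]:
    """Delete files in a scratch folder older than max_age_days.
    Returns the names of the files removed, for logging."""
    cutoff = time.time() - max_age_days * 86400
    removed = []
    for f in folder.glob("*"):
        if f.is_file() and f.stat().st_mtime < cutoff:
            f.unlink()
            removed.append(f.name)
    return removed
```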
Throughout the day, I work on whatever I'm working on: content creation, client work, building new capabilities. The key is that everything I build gets saved to specific locations that my infrastructure knows how to find.
At the end of a session, I run /finish. This command analyses what I accomplished, identifies which files were modified, and recommends specific commands to run based on what type of work I did. It tells me exactly what will happen if I save my learnings and how that connects to tomorrow's content pipeline.
Based on the /finish recommendations, I typically run /extract-learnings. This analyses the entire conversation and categorises what I learned into four types: project learnings that go into process notes, workflow learnings that become reusable templates, working patterns that update my main instruction file, and business knowledge that updates my context files.
If I completed a discrete task, I run /mark-complete to create a completion record. If I'm mid task and need to pause, I run /checkpoint instead, which saves the current state to a file I can resume from later.
Periodically, I run /daily-changes. This scans 7 infrastructure locations for files modified in the last 48 hours, groups the changes by category, and generates prioritised content ideas based on what I've been building. Tinkering becomes my content pipeline.
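A rough sketch of the scanning step, assuming each location is a folder and the grouping category is simply the folder's name (my actual command does more, like prioritising content ideas):

```python
import time
from pathlib import Path

def recent_changes(locations: list[Path], hours: int = 48) -> dict[str, list[str]]:
    """Scan each location for files modified within the window,
    grouped by the location's folder name."""
    cutoff = time.time() - hours * 3600
    changes: dict[str, list[str]] = {}
    for loc in locations:
        modified = [f.name for f in loc.rglob("*")
                    if f.is_file() and f.stat().st_mtime >= cutoff]
        if modified:
            changes[loc.name] = sorted(modified)
    return changes
```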
The maintenance burden is heavy. Every time I add a new skill or command, I need to update the index. Every time I reorganise files, I need to update the pointers. I've automated most of this with hooks or commands, but not all of it. Some things still require human judgment.
How the System Has Evolved in the Last 3 Weeks Alone
Managing a Claude Code instance is like gardening: constant maintenance and pruning to keep it healthy.
36 custom skills and over 40 commands now exist in my system. Skills are reusable building blocks like "analyse a transcript" or "extract key ideas from a book". Commands are complete workflows like "run my daily review" or "process content ideas". Skills can be combined together while commands tend to run on their own.
A complete workspace architecture with a dedicated folder that has a predictable home for everything: a place for temporary working files, a place for saving progress on long tasks, a place for organised project work, and a place for reference material.
Checkpoint and resume capability so that long tasks can span multiple sessions. Without this, Claude forgets everything when the conversation ends. Now the current state gets saved to a file, and picking up where you left off the next day is as simple as running a resume command.
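The checkpoint mechanic itself is simple. Here's a minimal sketch using a JSON state file (the field names are illustrative, not my actual schema):

```python
import json
from pathlib import Path

def checkpoint(path: Path, state: dict) -> None:
    """Save the current task state so a later session can resume it."""
    path.write_text(json.dumps(state, indent=2))

def resume(path: Path) -> dict:
    """Load the saved state, or start fresh if no checkpoint exists."""
    if path.exists():
        return json.loads(path.read_text())
    return {"step": 0, "notes": []}
```

Because the state lives in a plain file, it survives the end of the conversation, and you can read or edit it yourself if something looks wrong.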
A custom integration with Fireflies (MCP) so that meeting transcripts get pulled automatically, saved to organised client folders, and made searchable. When I ask Claude what was discussed in a meeting, it can actually look it up instead of asking me to paste it in.
A layered knowledge system with 37 books worth of frameworks, 219 newsletters, and 200+ notes, all organised so Claude can navigate them without loading everything at once. When I ask about something like work-life balance, it searches across books and notes, finds relevant frameworks, and returns focused findings without bloating the conversation.
Framework first content generation where the key ideas get extracted from books once, then used to generate multiple content types from those same frameworks. The old approach took 45 minutes. The new approach takes 15 minutes.
The LinkedIn command got refactored from 375 lines down to 97 lines. The initial assumption was that more commands meant more power, but the opposite turned out to be true. Claude already knows how to read, write, search, and edit. It should compose solutions from those basic capabilities, not follow overly prescriptive scripts. Six commands got deleted because they weren't needed.
Context files got consolidated because the same information was scattered across 5 different files. Now it lives in one place.
Review commands got restructured into 7 life planning reviews with progressive context loading. The annual review now loads 8 relevant files before the conversation starts, which solves the problem of forgetting your own goals by surfacing them before asking questions.
The Things That Broke Along the Way
Nobody shows you this in the demos. The things that broke, and more importantly, why they broke and how I fixed them.
PDF processing failed because chunks were too big. Claude can't process entire books because the context window has limits. My first attempts to extract knowledge from books crashed completely because I was trying to feed in too much at once. The solution was smart PDF chunking. I wrote a Python script that breaks books into approximately 5,000 token chunks with paragraph aware breaks, finding natural stopping points instead of cutting mid sentence. Now I can process a 400 page book systematically, one chunk at a time, with parallel agents analysing different sections simultaneously.
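Here's a simplified sketch of the chunking idea (my actual script does more; the 4-characters-per-token estimate is a rough heuristic, and a single paragraph longer than the budget still becomes one oversized chunk):

```python
def chunk_text(text: str, max_tokens: int = 5000) -> list[str]:
    """Split text into chunks of roughly max_tokens, breaking only at
    paragraph boundaries rather than mid-sentence."""
    max_chars = max_tokens * 4  # rough estimate: ~4 characters per token
    chunks, current = [], ""
    for para in text.split("\n\n"):
        # Start a new chunk when adding this paragraph would bust the budget.
        if current and len(current) + len(para) > max_chars:
            chunks.append(current.strip())
            current = ""
        current += para + "\n\n"
    if current.strip():
        chunks.append(current.strip())
    return chunks
```

Each chunk can then be handed to a separate agent, which is what makes parallel analysis of a 400 page book possible.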
Session end hooks didn't work for me (or didn't exist, I'm not sure). I built a maintenance hook that was supposed to run at the end of each session to clean up files and update indexes. Spent several hours on it. Then discovered that session end hooks simply didn't exist in my version of Claude Code. I had to completely redesign around session-init hooks instead, which run at the start of each session. That actually turned out better, because it refreshes state BEFORE the new session starts rather than after the old one ends, so the agent sees fresh data immediately. Maybe I'm missing something here that the techies understand and I don't...
Symlinks break when Google Drive syncs. My vault lives on Google Drive so I can access it from multiple machines, and I use symlinks to connect different parts of my file system. But Google Drive sync operations randomly break these symlinks. One day everything works, the next day Claude can't find my context files. The solution was building regular health checks into my workflow, plus documenting exactly how to recreate each symlink when they break. Not elegant, but functional.
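The health check boils down to finding symlinks whose targets have vanished. A minimal sketch (my actual check also knows how to recreate each link):

```python
from pathlib import Path

def broken_symlinks(root: Path) -> list[Path]:
    """Find symlinks under root whose target no longer exists,
    e.g. after a sync operation clobbers them."""
    return [p for p in root.rglob("*")
            if p.is_symlink() and not p.exists()]
```

Running this at session start means a broken link gets caught before Claude silently fails to find a context file.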
The autonomous loop stopped after 1 story. I built an autonomous coding loop called Ralph Wiggum that should run through a backlog of user stories, completing each one without human intervention. It stopped after completing just one story instead of continuing to the next. The root cause was a bash scripting bug: the script exited immediately whenever any command returned a non zero exit code (the signal a command uses to report failure). Simple fix once I found it, but it completely broke the promise of autonomous operation until I debugged it. I haven't used the Ralph Wiggum loop since.
A personality file broke autonomous operation. I added a CLAUDE.md file with personality instructions to one of my projects, thinking it would make the AI responses more consistent. Instead, it broke the autonomous loop entirely because the personality instructions were conflicting with the headless operation mode. Learned that some files interfere with autonomous operation in ways you don't expect. Now I'm much more careful about what instructions I add to project level configuration files.
Geocoding failed with stock coordinates. I built a location based app for a client and tested with hardcoded coordinates, which worked perfectly. Tested with real New Zealand addresses and it broke completely. The geocoding providers I was using, Nominatim and Photon, don't have good coverage for NZ addresses. After three failed approaches, I finally switched to Google Places API, which has authoritative and accurate NZ data. The lesson: always test with real data early because stock coordinates masked the problem initially.
The Pattern
You build something and it works, then it breaks, then you rebuild it and it's 3x better than before. That's not a bug, that's the process.
My honest assessment after 6 months in Claude Code (2 of them taking it seriously): 95% of the work is folder design and infrastructure, not building cool stuff. The cool stuff only works because the infrastructure exists. Every system needs a maintenance strategy or it decays. Files as interface beats databases for now, because human readable state is debuggable state. Pragmatism beats purity. Ship a working system, iterate later.
The time allocation looks like this: 50% building new infrastructure, 30% using what I've built, 20% maintaining and fixing what broke.
The Secret Nobody Posts About
The infrastructure work never stops. You just get better at it.
Every week, my system gets a little smarter. Every week, something breaks and I fix it. Every week, I delete something I thought I needed and replace it with something simpler.
That's the actual work of managing an AI system. That's the reality behind the "I automated my entire job" posts.
The next video you watch about Claude Code will probably show a cool skill or agent. Think about what folder structure makes that skill work. That's the thing nobody talks about. That's the 95% that decides everything.
Start with folders. End with magic.
Written by Mike ✌

Passionate about all things AI, emerging tech and start-ups, Mike is the Founder of The AI Corner.
Subscribe to The AI Corner
