Basically, when you’re making a game, especially one like an eroge, you need a whole heap of different data files to contain the different elements that make up that game. For a typical eroge, you need script files (containing the text of the scenario as well as some commands so your engine knows when to do things like show a different character, change the background, change the music, show a CG etc.), you need a heap of images for background graphics, character sprites (also known as tachie, standing pictures, paper dolls etc.), CGs etc. These files can run into the thousands or tens of thousands pretty easily.
You also need music: one music file for each background music track. In the days of old, this would be done with ‘Red Book’ audio, in which the music is stored on the CD in the same way it is stored on a standard audio CD. This means you could put one of these discs in any CD player and, although the first track would usually be unplayable, the remaining ones would contain the game’s background music. This practice has largely been abandoned as it is a hugely inefficient way of storing music. Now the tracks typically just get stored as audio files in some compressed format, the most common being Ogg Vorbis. Sound effects need to be stored as well, as do voices if your game is voiced. Voiced games typically contain tens of thousands of voice files.
There’s also a bunch of other supplementary files that need to be stored as well, such as information for the graphical transitions, graphics for UI elements such as buttons and labels for the configuration screen and a heap more, so it’s not odd for an eroge to make use of over a hundred thousand files. So where do you keep all these?
Well, that’s a question pretty much every engine solves differently. Some really do just store the files straight on disk. The VN engine HNS, used for things like the doujin game ‘Moonlight Blue’, does exactly this, which is why it was one of the first VNs whose script I started playing with. Other engines create their own file formats entirely for the purpose of aggregating a large number of files into one file, making it easy for the engine to scan the index and pick out just the file it needs, or to read every file into memory at once. This is the most common approach, and the formats used for it are anything but standard. Thankfully, if the music files are still in a common, non-proprietary format (like Ogg Vorbis), you can normally scan one of these archives and, even though you don’t know how the data is laid out, still see where an Ogg Vorbis file starts and ends, and hence extract music and other sound files that way. This is basically the job a ‘file scraper’ does, although I don’t know of any good general-purpose ones to recommend. Normally if I need to file scrape something I write my own. Some engines go all the way and encrypt the files, so you basically don’t have a chance without either reverse engineering the format yourself or finding someone else who has already reverse engineered it for you.
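To give a rough idea of what a file scraper for Ogg Vorbis looks like, here is a minimal sketch in Python. It relies only on the Ogg page layout: every page starts with the bytes "OggS", carries a flag byte marking the first and last page of a stream, and has a segment table that tells you how long the page is. The function names (parse_page, scrape_ogg) are my own for illustration, not from any existing tool, and the sketch assumes the archive stores each Ogg file unencrypted and in one contiguous piece. It does not verify page checksums and does not handle files with several interleaved logical streams.

```python
import struct

OGG_MAGIC = b"OggS"
BOS = 0x02  # header-type flag: first page of a logical stream
EOS = 0x04  # header-type flag: last page of a logical stream


def parse_page(data, pos):
    """Return (header_type, serial, page_length) for the Ogg page at pos, or None."""
    if data[pos:pos + 4] != OGG_MAGIC or pos + 27 > len(data):
        return None
    # Fixed 27-byte header: magic, version, flags, granule position,
    # stream serial number, page sequence number, CRC, segment count.
    version, header_type, _granule, serial, _seq, _crc, nsegs = struct.unpack_from(
        "<BBqIIIB", data, pos + 4
    )
    if version != 0:
        return None
    table_end = pos + 27 + nsegs
    if table_end > len(data):
        return None
    # The segment table holds one length byte per segment; their sum is the payload size.
    end = table_end + sum(data[pos + 27:table_end])
    if end > len(data):
        return None
    return header_type, serial, end - pos


def scrape_ogg(data):
    """Return a list of byte strings, one per complete Ogg stream found in data."""
    streams = []
    pos = data.find(OGG_MAGIC)
    while pos != -1:
        page = parse_page(data, pos)
        if page is None or not page[0] & BOS:
            # Not a valid start-of-stream page; keep searching.
            pos = data.find(OGG_MAGIC, pos + 1)
            continue
        start, serial = pos, page[1]
        # Walk page by page until the end-of-stream flag turns up.
        while page is not None and page[1] == serial:
            pos += page[2]
            if page[0] & EOS:
                streams.append(data[start:pos])
                break
            page = parse_page(data, pos)
        pos = data.find(OGG_MAGIC, pos)
    return streams
```

To use it, you would read the whole archive into memory with something like open(path, "rb").read(), pass the bytes to scrape_ogg, and write each returned item out as its own .ogg file. Reading everything at once is fine for archives of a few hundred megabytes; for anything bigger you would want to scan the file in chunks instead.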
Kimuzukashii MEIJI, are you still reading Kanon?