Thanks guys, great answers. That is exactly what I would suggest for a Kentico project that has a functional architecture, but I'm afraid this one does not. I'm not allowed to explain the specifics, and giving an overview is not easy, but here goes: the imported XML documents are to documents in the content tree what options are to e-commerce products. There are multiple options that are exactly the same, except for the foreign key that links each one to a document in the content tree. Only one of the documents in each set is added to the content tree so that it is searchable; the others are added to a database table for use in a custom Web part (just a repeater-type web part).
There are already two things wrong here:
1. All of the foreign keys for a set of documents could have been added to one field, in one of the documents. This way nothing would be inserted into the database without the API, and there would only be one copy of an Option.
2. If #1 is not an option, the records could have been stored in a custom table, so that a) we could index them and b) we could stage them.
Not all documents are added to the tree because 1) there are thousands of them, and 2) their only purpose is to show up in a custom Web part which uses SQL queries. When you select a document that has these options belonging to it, the Web part runs a SQL query on the table and uses the current document's foreign key in the WHERE condition. We don't really care that they are not proper objects in Kentico, because this is their only use. However, this approach is causing issues with deployments, and the client's budget is not big enough to restructure this from the ground up.
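To make the lookup concrete, here is a rough sketch of the kind of query the Web part runs. It is not the real code: the actual project is Kentico on SQL Server, and I'm using Python with sqlite3 only as a stand-in. The table name `options`, its columns, and the function name are all hypothetical.

```python
import sqlite3


def options_for_document(conn, document_id):
    """Return the option rows linked to one content-tree document."""
    # Hypothetical table: options(option_id, document_id, name).
    # The current document's key goes into a parameterized WHERE condition.
    cursor = conn.execute(
        "SELECT option_id, name FROM options "
        "WHERE document_id = ? ORDER BY option_id",
        (document_id,),
    )
    return cursor.fetchall()
```

The only point of the sketch is that the rows are reached through plain SQL keyed on the document, not through the Kentico API.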
To make matters worse, the import script doesn't clean up after itself. If an option is removed from the XML file, the next import won't remove the old record from the database.
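For illustration, this is roughly the cleanup step the import is missing, sketched under heavy assumptions: Python with sqlite3 as a stand-in for the real Kentico/SQL Server setup, a hypothetical `options` table, and a made-up XML shape (`<option id="..." document="..." name="..."/>`). None of these names come from the actual project.

```python
import sqlite3
import xml.etree.ElementTree as ET


def sync_options(conn, xml_text):
    """Upsert every option in the XML, then delete rows the XML no longer lists.

    Assumes a hypothetical table:
    options(option_id TEXT PRIMARY KEY, document_id TEXT, name TEXT)
    """
    root = ET.fromstring(xml_text)
    # Hypothetical XML shape: one <option> element per record.
    incoming = {
        opt.get("id"): (opt.get("document"), opt.get("name"))
        for opt in root.iter("option")
    }

    with conn:  # single transaction: commit on success, roll back on error
        for option_id, (document_id, name) in incoming.items():
            conn.execute(
                "INSERT OR REPLACE INTO options (option_id, document_id, name) "
                "VALUES (?, ?, ?)",
                (option_id, document_id, name),
            )
        # The part the current import skips: find and remove stale rows.
        existing = {row[0] for row in conn.execute("SELECT option_id FROM options")}
        stale = existing - set(incoming)
        conn.executemany(
            "DELETE FROM options WHERE option_id = ?",
            [(option_id,) for option_id in stale],
        )
    return sorted(stale)
```

Running the upsert and the delete in one transaction means a failed import leaves the table as it was, instead of half-updated.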
It is a nightmare, and that's why I'm here :)