In general, it is sufficient to set the log level to INFO for the org.onehippo.forge.content.exim logger category, as in the following example:
<Logger name="org.onehippo.forge.content.exim" level="info" />
The module itself does not depend on any new features of Hippo CMS 10.2 or later versions. However, when you write and execute a Groovy script, the following calls will not work in earlier versions, because visitorContext has been supported only since 10.2:
visitorContext.reportUpdated(documentLocation), visitorContext.reportFailed(documentLocation), or visitorContext.reportSkipped(documentLocation).
When exporting content based on a query which returns each live or preview variant node to export, you do not have to invoke the methods of visitorContext, because the Groovy updater engine manages the batch size and threshold automatically based on the node iteration count (which is the same as the invocation count of #doUpdate(Node)).
However, it is very important to invoke those visitorContext methods on each unit of importing work when you import a lot of documents in a single #doUpdate(Node) execution, because the Groovy updater engine has no way to count the processed units if you do not invoke the methods of visitorContext manually.
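For example, a doUpdate(Node) body that imports many documents could report each unit of work like the following sketch. Here documentLocations and importDocument(...) are hypothetical placeholders for your own import source and import logic; visitorContext and log are the bindings injected by the Groovy updater engine:

```groovy
void doUpdate(Node node) {
    // Each 'documentLocation' represents one document to import (hypothetical collection).
    documentLocations.each { documentLocation ->
        try {
            // Hypothetical: your actual import call using the migration task components.
            importDocument(documentLocation)
            // Report one successful unit so the updater engine can apply
            // its batch size and threshold settings to this execution.
            visitorContext.reportUpdated(documentLocation)
        } catch (e) {
            log.error("Failed to import ${documentLocation}.", e)
            visitorContext.reportFailed(documentLocation)
        }
    }
}
```

Without the report calls, the engine would see the whole loop as a single unit of work, so batching and thresholds could not take effect.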
Please see the Reporting of Execution section in the Using the Updater Editor page for more detail.
Note:
If you implement a new Java application using the migration task components instead of Groovy updater scripts, or you need to support an earlier version such as 10.1.x, then it is totally up to you to maintain the batch size and threshold for system availability yourself when you import a lot of data in a single #doUpdate(Node) execution. For example, you can invoke javax.jcr.Session#save() after every N units of processing, or javax.jcr.Session#refresh(false) on any exception, to keep consistency and avoid unexpectedly large memory consumption, like the following example:
boolean isDryRun = false;

// do the following in each iteration.
if (processCount++ % 10 == 0) {
    if (isDryRun) {
        session.refresh(false);
    } else {
        try {
            session.save();
        } catch (RepositoryException e) {
            session.refresh(false);
        }
    }
}

// do the following once more after the iteration, to save or revert the remainder.
if (isDryRun) {
    session.refresh(false);
} else {
    try {
        session.save();
    } catch (RepositoryException e) {
        session.refresh(false);
    }
}
Well, it's about content migration, and migration tasks are either EXport or IMport, right? :-)
Also, EXIM banks exist in many countries to help exporters and importers in their country, mostly financially. Most of the hard work is done by the exporters and importers themselves, but the financial support can be helpful.
In the same way, the Content EXIM module wishes to help content exporters and importers, mostly technically, by providing core components and sharing knowledge in the community. Most of the hard work is done by content exporters and importers, but the technical support from the community might be helpful. ;-)