If you do a lot of Drupal development and need to deploy configuration, I am sure that you are using update hooks at least to some extent. If you don't use Features and want to create a taxonomy vocabulary or something similar in code, the hook_update_N() hook is the way to go.
But have you ever needed to perform an update the size of which would exceed PHP's maximum execution time? If you need to create 1000 entities (let's just say as an example), it's not a good idea to trust that the production server will not max out and leave you hanging in the middle of a deploy. So what's the solution?
You can use the batch capability of the update hook. If you were wondering what the &$sandbox argument is for, it's just for that. You use it for two things mainly:
- store data required for your operations across multiple passes (since it is passed by reference the values remain)
- tell Drupal when it should stop the process by setting the $sandbox['#finished'] value to 1.
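To make those two roles concrete, here is a minimal sketch that runs outside Drupal. Both example_update_sketch() and example_run_until_finished() are hypothetical names, and the loop is only a simplified stand-in for what Drupal's update runner does, not its actual code. It assumes that a hook which never sets $sandbox['#finished'] is treated as done after one call.

```php
<?php
// Hypothetical stand-in for an update hook: it needs three passes to finish.
function example_update_sketch(&$sandbox) {
  if (!isset($sandbox['progress'])) {
    // First pass only: initialise the state that must survive between calls.
    $sandbox['progress'] = 0;
    $sandbox['max'] = 3;
  }
  // One unit of "work" per pass.
  $sandbox['progress']++;
  $sandbox['#finished'] = $sandbox['progress'] / $sandbox['max'];
}

// Simplified driver (an assumption, not Drupal's real implementation):
// call the function with the same array until #finished reaches 1.
function example_run_until_finished($callback) {
  $sandbox = array();
  $passes = 0;
  do {
    // $sandbox is passed by reference, so values set in one pass
    // are still there in the next one.
    $callback($sandbox);
    $passes++;
    $finished = isset($sandbox['#finished']) ? $sandbox['#finished'] : 1;
  } while ($finished < 1);
  return $passes;
}

$passes = example_run_until_finished('example_update_sketch');
echo 'Passes needed: ', $passes, "\n";
```

The point to take away is that the function itself never loops over the whole job. It does one slice of work, records where it got to, and leaves it to the caller to decide whether to call it again.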
Let me show you how this works. Let's say we want to create a vocabulary and a bunch of taxonomy terms with names from a big array. We want to break this array into chunks and create the terms one chunk at a time to keep the load on the server down.
So here is how you do it:
/**
 * Create all the terms.
 */
function my_module_update_7001(&$sandbox) {
  $names = array(
    'Fiona',
    'Jesse',
    'Michael',
    ...
    'Sam',
    'Nate',
  );

  // First pass: set up the counters and create the vocabulary once.
  if (!isset($sandbox['progress'])) {
    $sandbox['progress'] = 0;
    $sandbox['limit'] = 5;
    $sandbox['max'] = count($names);

    // Create the vocabulary.
    $vocab = (object) array(
      'name' => 'Names',
      'description' => 'My name vocabulary.',
      'machine_name' => 'names_vocabulary',
    );
    taxonomy_vocabulary_save($vocab);
    // Keep the loaded vocabulary so later passes can use its vid.
    $sandbox['vocab'] = taxonomy_vocabulary_machine_name_load('names_vocabulary');
  }

  // Create the terms for the current chunk.
  $chunk = array_slice($names, $sandbox['progress'], $sandbox['limit']);
  if (!empty($chunk)) {
    foreach ($chunk as $name) {
      $term = (object) array(
        'name' => $name,
        'description' => 'The name is: ' . $name,
        'vid' => $sandbox['vocab']->vid,
      );
      taxonomy_term_save($term);
      $sandbox['progress']++;
    }
  }

  // Report progress; an empty list counts as finished to avoid dividing by zero.
  $sandbox['#finished'] = empty($sandbox['max']) ? 1 : ($sandbox['progress'] / $sandbox['max']);
}
So what happens here? First, we are dealing with an array of names (can anybody recognise them, by the way?). Then we check whether we are on the first pass by testing whether the progress key has already been set in $sandbox. If we are on the first pass, we set some defaults: a limit of 5 terms per pass out of a total of count($names). Additionally, we create the vocabulary and store it as a loaded object in the sandbox as well (because we need its id for creating the terms).
Then, regardless of the pass we are on, we take a chunk out of the names, always offset by the progress of the operation. We create a term for each name in the chunk and increment the progress by one each time (so with each full chunk, the progress increases by 5). At the very end, we set the value of $sandbox['#finished'] to the ratio of progress to total. This means that with each pass, the value increases from an original of 0 to a maximum of 1 (at which point Drupal knows it needs to stop calling the hook).
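If you want to see the offsets and the ratio move, here is a small trace you can run without Drupal. It is only a sketch: it uses the five names visible in the array above, a limit of 2 instead of 5 so that it takes more than one pass, and a plain loop standing in for Drupal calling the hook again. No terms are saved; the $trace array is purely illustrative.

```php
<?php
// Illustrative data: only the names visible in the original array.
$names = array('Fiona', 'Jesse', 'Michael', 'Sam', 'Nate');

// Same keys as in the update hook, but with a smaller limit (an assumption).
$sandbox = array('progress' => 0, 'limit' => 2, 'max' => count($names));

$trace = array();
do {
  // Take the next chunk, offset by how many names were already handled.
  $chunk = array_slice($names, $sandbox['progress'], $sandbox['limit']);
  // Stand-in for saving one term per name.
  $sandbox['progress'] += count($chunk);
  $sandbox['#finished'] = $sandbox['progress'] / $sandbox['max'];
  $trace[] = array('chunk' => $chunk, 'finished' => $sandbox['#finished']);
} while ($sandbox['#finished'] < 1);

foreach ($trace as $pass => $step) {
  echo 'Pass ', $pass + 1, ': ', implode(', ', $step['chunk']),
    ' (finished: ', $step['finished'], ")\n";
}
```

With these numbers the chunks are Fiona and Jesse, then Michael and Sam, then Nate on its own, and the ratio goes 0.4, 0.8 and finally 1. Note that the last chunk is shorter than the limit, which is why the progress is incremented per name rather than by the limit.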
And like this, we save a bunch of terms without worrying that PHP will time out or the server will be overloaded. Drupal will keep calling the hook as many times as needed. And depending on the operation, you can set your own sensible chunk sizes.
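As a rough guide when picking a chunk size, you can work out how many times the hook will be called. This is a sketch with a hypothetical helper, example_pass_count(), using the 1000 entities from the example at the top; the chunk sizes are arbitrary and the right one depends on how expensive each operation is on your server.

```php
<?php
// Hypothetical helper: number of passes needed for a given chunk size.
function example_pass_count($total, $limit) {
  // The last pass may handle a partial chunk, hence the rounding up.
  return (int) ceil($total / $limit);
}

// Arbitrary chunk sizes, chosen only for illustration.
foreach (array(5, 50, 200) as $limit) {
  echo $limit, ' per pass: ', example_pass_count(1000, $limit), " passes\n";
}
```

Smaller chunks mean more passes but less work per request, so each pass stays comfortably within the execution time limit.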
Hope this helps.
Daniel Sipos
Danny founded WEBOMELETTE in 2012 as a passion project, mostly writing about Drupal problems he faced day to day, as well as about new technologies and things that he thought other developers would find useful. He now manages a team of developers and designers, delivering quality products that make businesses successful.