Description
I have a database with relative large tables. When I run db:build its hangs multiple minutes for checking if the database still contains old enum values.
I found the origin and patched it for me in the past. As we use postgres I think it be a niche issue.
As now I have to migrate my patch to ss-6, I will submit a PR for that anyway.
Additional context or points of discussion
I was able to narrow down the issue to this section:
// Get all enum values
$enumValues = $this->getEnum();
if (DB::get_schema()->hasField($table, $name)) {
$existing = DB::query("SELECT DISTINCT \"{$name}\" FROM \"{$table}\"")->column();
$enumValues = array_unique(array_merge($enumValues, $existing));
}
I am not completly sure why, but for large tables this can be realy slow:
QUERY PLAN
Unique (cost=1000.46..588711.55 rows=1 width=10) (actual time=2764.849..2786.626 rows=1 loops=1)
-> Gather Merge (cost=1000.46..588711.55 rows=2 width=10) (actual time=2764.848..2786.615 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Unique (cost=0.44..587711.29 rows=1 width=10) (actual time=0.062..1831.997 rows=1 loops=3)
" -> Parallel Index Only Scan using ix_e3dc4f5497f7676548ca90e713c5e001 on ""Container_Versions"" (cost=0.44..559543.66 rows=11267055 width=10) (actual time=0.060..1187.245 rows=9052704 loops=3)"
Heap Fetches: 6232848
Planning Time: 0.160 ms
JIT:
Functions: 4
" Options: Inlining true, Optimization true, Expressions true, Deforming true"
" Timing: Generation 0.591 ms, Inlining 150.968 ms, Optimization 23.279 ms, Emission 23.194 ms, Total 198.031 ms"
Execution Time: 2786.999 ms
A group by does't help either.
My solution:
I fetch the current schema of the enum column, compare it to the new schema.
Than I check for the obsolete classes if any of them exists in the table. I don't know why but a EXISTS-Query is fast 🤷
// Get all enum values
$newEnumValues = $this->getEnum();
if (!DB::get_schema()->hasField($table, $name)) {
DBEnum::$enum_cache[$table][$name] = $newEnumValues;
return $newEnumValues;
}
$currentConstraintValues = DB::get_schema()->enumValuesForField($table, $name);
if (!in_array('', $currentConstraintValues)) {
$currentConstraintValues[] = '';
}
if (!in_array('', $newEnumValues)) {
$newEnumValues[] = '';
}
$removedValues = array_diff($currentConstraintValues, $newEnumValues);
if (empty($removedValues)) {
DBEnum::$enum_cache[$table][$name] = $newEnumValues;
return $newEnumValues;
}
$obsoleteValues = [];
foreach ($removedValues as $value) {
$exists = DB::prepared_query(
"SELECT EXISTS(
SELECT 1 FROM \"{$table}\"
WHERE \"{$name}\" = ?
LIMIT 1
) as exists",
[$value]
)->value();
if ($exists) {
$obsoleteValues[] = $value;
}
}
if (!empty($obsoleteValues)) {
DB::alteration_message(
"Found obsolete {$name} values in {$table}: " . implode(', ', $obsoleteValues),
'notice'
);
$newEnumValues = array_merge($newEnumValues, $obsoleteValues);
}
// Cache and return
DBEnum::$enum_cache[$table][$name] = $enumValues;
PR
Validations
Description
I have a database with relative large tables. When I run db:build its hangs multiple minutes for checking if the database still contains old enum values.
I found the origin and patched it for me in the past. As we use postgres I think it be a niche issue.
As now I have to migrate my patch to ss-6, I will submit a PR for that anyway.
Additional context or points of discussion
I was able to narrow down the issue to this section:
I am not completly sure why, but for large tables this can be realy slow:
A group by does't help either.
My solution:
I fetch the current schema of the enum column, compare it to the new schema.
Than I check for the obsolete classes if any of them exists in the table. I don't know why but a EXISTS-Query is fast 🤷
PR
Validations