All the above queries are incredibly slow on big tables. A change of strategy is needed. Here there is the code I used for a DB of mine, you can transliterate changing the fields and table names.
This is the strategy: you create two implicit temporary tables and make a union of them.
- The first temporary table comes from a selection of all the rows of the first original table the fields of which you wanna control that are NOT present in the second original table.
- The second implicit temporary table contains all the rows of the two original tables that have a match on identical values of the column/field you wanna control.
- The result of the union is a table that has more than one row with the same control field value in case there is a match for that value on the two original tables (one coming from the first select, the second coming from the second select) and just one row with the control column value in case of the value of the first original table not matching any value of the second original table.
- You group and count. When the count is 1 there is not match and, finally, you select just the rows with the count equal to 1.
Seems not elegant, but it is orders of magnitude faster than all the above solutions.
IMPORTANT NOTE: enable the INDEX on the columns to be checked.
SELECT name, source, id FROM ( SELECT name, "active_ingredients" as source, active_ingredients.id as id FROM active_ingredients UNION ALL SELECT active_ingredients.name as name, "UNII_database" as source, temp_active_ingredients_aliases.id as id FROM active_ingredients INNER JOIN temp_active_ingredients_aliases ON temp_active_ingredients_aliases.alias_name = active_ingredients.name ) tbl GROUP BY name HAVING count(*) = 1 ORDER BY name