mirror of
https://github.com/django/django.git
synced 2024-12-01 15:42:04 +01:00
4a954cfd11
This patch does not remove all occurrences of the words in question. Rather, I went through all of the occurrences of the words listed below, and judged if they a) suggested the reader had some kind of knowledge/experience, and b) if they added anything of value (including tone of voice, etc). I left most of the words alone. I looked at the following words: - simply/simple - easy/easier/easiest - obvious - just - merely - straightforward - ridiculous Thanks to Carlton Gibson for guidance on how to approach this issue, and to Tim Bell for providing the idea. But the enormous lion's share of thanks go to Adam Johnson for his patient and helpful review.
344 lines
14 KiB
Plaintext
344 lines
14 KiB
Plaintext
==============
|
|
Custom Lookups
|
|
==============
|
|
|
|
.. currentmodule:: django.db.models
|
|
|
|
Django offers a wide variety of :ref:`built-in lookups <field-lookups>` for
|
|
filtering (for example, ``exact`` and ``icontains``). This documentation
|
|
explains how to write custom lookups and how to alter the working of existing
|
|
lookups. For the API references of lookups, see the :doc:`/ref/models/lookups`.
|
|
|
|
A lookup example
|
|
================
|
|
|
|
Let's start with a small custom lookup. We will write a custom lookup ``ne``
|
|
which works opposite to ``exact``. ``Author.objects.filter(name__ne='Jack')``
|
|
will translate to the SQL:
|
|
|
|
.. code-block:: sql
|
|
|
|
"author"."name" <> 'Jack'
|
|
|
|
This SQL is backend independent, so we don't need to worry about different
|
|
databases.
|
|
|
|
There are two steps to making this work. Firstly we need to implement the
|
|
lookup, then we need to tell Django about it::
|
|
|
|
from django.db.models import Lookup
|
|
|
|
class NotEqual(Lookup):
|
|
lookup_name = 'ne'
|
|
|
|
def as_sql(self, compiler, connection):
|
|
lhs, lhs_params = self.process_lhs(compiler, connection)
|
|
rhs, rhs_params = self.process_rhs(compiler, connection)
|
|
params = lhs_params + rhs_params
|
|
return '%s <> %s' % (lhs, rhs), params
|
|
|
|
To register the ``NotEqual`` lookup we will need to call ``register_lookup`` on
|
|
the field class we want the lookup to be available for. In this case, the lookup
|
|
makes sense on all ``Field`` subclasses, so we register it with ``Field``
|
|
directly::
|
|
|
|
from django.db.models.fields import Field
|
|
Field.register_lookup(NotEqual)
|
|
|
|
Lookup registration can also be done using a decorator pattern::
|
|
|
|
from django.db.models.fields import Field
|
|
|
|
@Field.register_lookup
|
|
class NotEqualLookup(Lookup):
|
|
# ...
|
|
|
|
We can now use ``foo__ne`` for any field ``foo``. You will need to ensure that
|
|
this registration happens before you try to create any querysets using it. You
|
|
could place the implementation in a ``models.py`` file, or register the lookup
|
|
in the ``ready()`` method of an ``AppConfig``.
|
|
|
|
Taking a closer look at the implementation, the first required attribute is
|
|
``lookup_name``. This allows the ORM to understand how to interpret ``name__ne``
|
|
and use ``NotEqual`` to generate the SQL. By convention, these names are always
|
|
lowercase strings containing only letters, but the only hard requirement is
|
|
that it must not contain the string ``__``.
|
|
|
|
We then need to define the ``as_sql`` method. This takes a ``SQLCompiler``
|
|
object, called ``compiler``, and the active database connection.
|
|
``SQLCompiler`` objects are not documented, but the only thing we need to know
|
|
about them is that they have a ``compile()`` method which returns a tuple
|
|
containing an SQL string, and the parameters to be interpolated into that
|
|
string. In most cases, you don't need to use it directly and can pass it on to
|
|
``process_lhs()`` and ``process_rhs()``.
|
|
|
|
A ``Lookup`` works against two values, ``lhs`` and ``rhs``, standing for
|
|
left-hand side and right-hand side. The left-hand side is usually a field
|
|
reference, but it can be anything implementing the :ref:`query expression API
|
|
<query-expression>`. The right-hand is the value given by the user. In the
|
|
example ``Author.objects.filter(name__ne='Jack')``, the left-hand side is a
|
|
reference to the ``name`` field of the ``Author`` model, and ``'Jack'`` is the
|
|
right-hand side.
|
|
|
|
We call ``process_lhs`` and ``process_rhs`` to convert them into the values we
|
|
need for SQL using the ``compiler`` object described before. These methods
|
|
return tuples containing some SQL and the parameters to be interpolated into
|
|
that SQL, just as we need to return from our ``as_sql`` method. In the above
|
|
example, ``process_lhs`` returns ``('"author"."name"', [])`` and
|
|
``process_rhs`` returns ``('"%s"', ['Jack'])``. In this example there were no
|
|
parameters for the left hand side, but this would depend on the object we have,
|
|
so we still need to include them in the parameters we return.
|
|
|
|
Finally we combine the parts into an SQL expression with ``<>``, and supply all
|
|
the parameters for the query. We then return a tuple containing the generated
|
|
SQL string and the parameters.
|
|
|
|
A transformer example
|
|
=====================
|
|
|
|
The custom lookup above is great, but in some cases you may want to be able to
|
|
chain lookups together. For example, let's suppose we are building an
|
|
application where we want to make use of the ``abs()`` operator.
|
|
We have an ``Experiment`` model which records a start value, end value, and the
|
|
change (start - end). We would like to find all experiments where the change
|
|
was equal to a certain amount (``Experiment.objects.filter(change__abs=27)``),
|
|
or where it did not exceed a certain amount
|
|
(``Experiment.objects.filter(change__abs__lt=27)``).
|
|
|
|
.. note::
|
|
This example is somewhat contrived, but it nicely demonstrates the range of
|
|
functionality which is possible in a database backend independent manner,
|
|
and without duplicating functionality already in Django.
|
|
|
|
We will start by writing an ``AbsoluteValue`` transformer. This will use the SQL
|
|
function ``ABS()`` to transform the value before comparison::
|
|
|
|
from django.db.models import Transform
|
|
|
|
class AbsoluteValue(Transform):
|
|
lookup_name = 'abs'
|
|
function = 'ABS'
|
|
|
|
Next, let's register it for ``IntegerField``::
|
|
|
|
from django.db.models import IntegerField
|
|
IntegerField.register_lookup(AbsoluteValue)
|
|
|
|
We can now run the queries we had before.
|
|
``Experiment.objects.filter(change__abs=27)`` will generate the following SQL:
|
|
|
|
.. code-block:: sql
|
|
|
|
SELECT ... WHERE ABS("experiments"."change") = 27
|
|
|
|
By using ``Transform`` instead of ``Lookup`` it means we are able to chain
|
|
further lookups afterwards. So
|
|
``Experiment.objects.filter(change__abs__lt=27)`` will generate the following
|
|
SQL:
|
|
|
|
.. code-block:: sql
|
|
|
|
SELECT ... WHERE ABS("experiments"."change") < 27
|
|
|
|
Note that in case there is no other lookup specified, Django interprets
|
|
``change__abs=27`` as ``change__abs__exact=27``.
|
|
|
|
This also allows the result to be used in ``ORDER BY`` and ``DISTINCT ON``
|
|
clauses. For example ``Experiment.objects.order_by('change__abs')`` generates:
|
|
|
|
.. code-block:: sql
|
|
|
|
SELECT ... ORDER BY ABS("experiments"."change") ASC
|
|
|
|
And on databases that support distinct on fields (such as PostgreSQL),
|
|
``Experiment.objects.distinct('change__abs')`` generates:
|
|
|
|
.. code-block:: sql
|
|
|
|
SELECT ... DISTINCT ON ABS("experiments"."change")
|
|
|
|
When looking for which lookups are allowable after the ``Transform`` has been
|
|
applied, Django uses the ``output_field`` attribute. We didn't need to specify
|
|
this here as it didn't change, but supposing we were applying ``AbsoluteValue``
|
|
to some field which represents a more complex type (for example a point
|
|
relative to an origin, or a complex number) then we may have wanted to specify
|
|
that the transform returns a ``FloatField`` type for further lookups. This can
|
|
be done by adding an ``output_field`` attribute to the transform::
|
|
|
|
from django.db.models import FloatField, Transform
|
|
|
|
class AbsoluteValue(Transform):
|
|
lookup_name = 'abs'
|
|
function = 'ABS'
|
|
|
|
@property
|
|
def output_field(self):
|
|
return FloatField()
|
|
|
|
This ensures that further lookups like ``abs__lte`` behave as they would for
|
|
a ``FloatField``.
|
|
|
|
Writing an efficient ``abs__lt`` lookup
|
|
=======================================
|
|
|
|
When using the above written ``abs`` lookup, the SQL produced will not use
|
|
indexes efficiently in some cases. In particular, when we use
|
|
``change__abs__lt=27``, this is equivalent to ``change__gt=-27`` AND
|
|
``change__lt=27``. (For the ``lte`` case we could use the SQL ``BETWEEN``).
|
|
|
|
So we would like ``Experiment.objects.filter(change__abs__lt=27)`` to generate
|
|
the following SQL:
|
|
|
|
.. code-block:: sql
|
|
|
|
SELECT .. WHERE "experiments"."change" < 27 AND "experiments"."change" > -27
|
|
|
|
The implementation is::
|
|
|
|
from django.db.models import Lookup
|
|
|
|
class AbsoluteValueLessThan(Lookup):
|
|
lookup_name = 'lt'
|
|
|
|
def as_sql(self, compiler, connection):
|
|
lhs, lhs_params = compiler.compile(self.lhs.lhs)
|
|
rhs, rhs_params = self.process_rhs(compiler, connection)
|
|
params = lhs_params + rhs_params + lhs_params + rhs_params
|
|
return '%s < %s AND %s > -%s' % (lhs, rhs, lhs, rhs), params
|
|
|
|
AbsoluteValue.register_lookup(AbsoluteValueLessThan)
|
|
|
|
There are a couple of notable things going on. First, ``AbsoluteValueLessThan``
|
|
isn't calling ``process_lhs()``. Instead it skips the transformation of the
|
|
``lhs`` done by ``AbsoluteValue`` and uses the original ``lhs``. That is, we
|
|
want to get ``"experiments"."change"`` not ``ABS("experiments"."change")``.
|
|
Referring directly to ``self.lhs.lhs`` is safe as ``AbsoluteValueLessThan``
|
|
can be accessed only from the ``AbsoluteValue`` lookup, that is the ``lhs``
|
|
is always an instance of ``AbsoluteValue``.
|
|
|
|
Notice also that as both sides are used multiple times in the query the params
|
|
need to contain ``lhs_params`` and ``rhs_params`` multiple times.
|
|
|
|
The final query does the inversion (``27`` to ``-27``) directly in the
|
|
database. The reason for doing this is that if the ``self.rhs`` is something else
|
|
than a plain integer value (for example an ``F()`` reference) we can't do the
|
|
transformations in Python.
|
|
|
|
.. note::
|
|
In fact, most lookups with ``__abs`` could be implemented as range queries
|
|
like this, and on most database backends it is likely to be more sensible to
|
|
do so as you can make use of the indexes. However with PostgreSQL you may
|
|
want to add an index on ``abs(change)`` which would allow these queries to
|
|
be very efficient.
|
|
|
|
A bilateral transformer example
|
|
===============================
|
|
|
|
The ``AbsoluteValue`` example we discussed previously is a transformation which
|
|
applies to the left-hand side of the lookup. There may be some cases where you
|
|
want the transformation to be applied to both the left-hand side and the
|
|
right-hand side. For instance, if you want to filter a queryset based on the
|
|
equality of the left and right-hand side insensitively to some SQL function.
|
|
|
|
Let's examine case-insensitive transformations here. This transformation isn't
|
|
very useful in practice as Django already comes with a bunch of built-in
|
|
case-insensitive lookups, but it will be a nice demonstration of bilateral
|
|
transformations in a database-agnostic way.
|
|
|
|
We define an ``UpperCase`` transformer which uses the SQL function ``UPPER()`` to
|
|
transform the values before comparison. We define
|
|
:attr:`bilateral = True <django.db.models.Transform.bilateral>` to indicate that
|
|
this transformation should apply to both ``lhs`` and ``rhs``::
|
|
|
|
from django.db.models import Transform
|
|
|
|
class UpperCase(Transform):
|
|
lookup_name = 'upper'
|
|
function = 'UPPER'
|
|
bilateral = True
|
|
|
|
Next, let's register it::
|
|
|
|
from django.db.models import CharField, TextField
|
|
CharField.register_lookup(UpperCase)
|
|
TextField.register_lookup(UpperCase)
|
|
|
|
Now, the queryset ``Author.objects.filter(name__upper="doe")`` will generate a case
|
|
insensitive query like this:
|
|
|
|
.. code-block:: sql
|
|
|
|
SELECT ... WHERE UPPER("author"."name") = UPPER('doe')
|
|
|
|
Writing alternative implementations for existing lookups
|
|
========================================================
|
|
|
|
Sometimes different database vendors require different SQL for the same
|
|
operation. For this example we will rewrite a custom implementation for
|
|
MySQL for the NotEqual operator. Instead of ``<>`` we will be using ``!=``
|
|
operator. (Note that in reality almost all databases support both, including
|
|
all the official databases supported by Django).
|
|
|
|
We can change the behavior on a specific backend by creating a subclass of
|
|
``NotEqual`` with an ``as_mysql`` method::
|
|
|
|
class MySQLNotEqual(NotEqual):
|
|
def as_mysql(self, compiler, connection, **extra_context):
|
|
lhs, lhs_params = self.process_lhs(compiler, connection)
|
|
rhs, rhs_params = self.process_rhs(compiler, connection)
|
|
params = lhs_params + rhs_params
|
|
return '%s != %s' % (lhs, rhs), params
|
|
|
|
Field.register_lookup(MySQLNotEqual)
|
|
|
|
We can then register it with ``Field``. It takes the place of the original
|
|
``NotEqual`` class as it has the same ``lookup_name``.
|
|
|
|
When compiling a query, Django first looks for ``as_%s % connection.vendor``
|
|
methods, and then falls back to ``as_sql``. The vendor names for the in-built
|
|
backends are ``sqlite``, ``postgresql``, ``oracle`` and ``mysql``.
|
|
|
|
How Django determines the lookups and transforms which are used
|
|
===============================================================
|
|
|
|
In some cases you may wish to dynamically change which ``Transform`` or
|
|
``Lookup`` is returned based on the name passed in, rather than fixing it. As
|
|
an example, you could have a field which stores coordinates or an arbitrary
|
|
dimension, and wish to allow a syntax like ``.filter(coords__x7=4)`` to return
|
|
the objects where the 7th coordinate has value 4. In order to do this, you
|
|
would override ``get_lookup`` with something like::
|
|
|
|
class CoordinatesField(Field):
|
|
def get_lookup(self, lookup_name):
|
|
if lookup_name.startswith('x'):
|
|
try:
|
|
dimension = int(lookup_name[1:])
|
|
except ValueError:
|
|
pass
|
|
else:
|
|
return get_coordinate_lookup(dimension)
|
|
return super().get_lookup(lookup_name)
|
|
|
|
You would then define ``get_coordinate_lookup`` appropriately to return a
|
|
``Lookup`` subclass which handles the relevant value of ``dimension``.
|
|
|
|
There is a similarly named method called ``get_transform()``. ``get_lookup()``
|
|
should always return a ``Lookup`` subclass, and ``get_transform()`` a
|
|
``Transform`` subclass. It is important to remember that ``Transform``
|
|
objects can be further filtered on, and ``Lookup`` objects cannot.
|
|
|
|
When filtering, if there is only one lookup name remaining to be resolved, we
|
|
will look for a ``Lookup``. If there are multiple names, it will look for a
|
|
``Transform``. In the situation where there is only one name and a ``Lookup``
|
|
is not found, we look for a ``Transform`` and then the ``exact`` lookup on that
|
|
``Transform``. All call sequences always end with a ``Lookup``. To clarify:
|
|
|
|
- ``.filter(myfield__mylookup)`` will call ``myfield.get_lookup('mylookup')``.
|
|
- ``.filter(myfield__mytransform__mylookup)`` will call
|
|
``myfield.get_transform('mytransform')``, and then
|
|
``mytransform.get_lookup('mylookup')``.
|
|
- ``.filter(myfield__mytransform)`` will first call
|
|
``myfield.get_lookup('mytransform')``, which will fail, so it will fall back
|
|
to calling ``myfield.get_transform('mytransform')`` and then
|
|
``mytransform.get_lookup('exact')``.
|