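Appendix C: Database API Reference
Django's database API is the other half of the model API discussed in Appendix B. Once you've defined a model, you'll use this API any time you need to access the database. You've seen examples of this API in use throughout the book; this appendix explains the various options in detail.
Like the model APIs discussed in Appendix B, these APIs are considered very stable, though the Django developers consistently add new shortcuts and conveniences. It's a good idea to check the latest documentation online at http://www.djangoproject.com/documentation/0.96/db-api/.
Throughout this reference, we'll refer to the following models, which might form a simple blog application: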
from django.db import models

class Blog(models.Model):
    name = models.CharField(max_length=100)
    tagline = models.TextField()

    def __str__(self):
        return self.name

class Author(models.Model):
    name = models.CharField(max_length=50)
    email = models.EmailField()

    def __str__(self):
        return self.name

class Entry(models.Model):
    blog = models.ForeignKey(Blog)
    headline = models.CharField(max_length=255)
    body_text = models.TextField()
    pub_date = models.DateTimeField()
    authors = models.ManyToManyField(Author)

    def __str__(self):
        return self.headline
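Creating Objects
To create an object, instantiate it using keyword arguments to the model class, and then call save() to save it to the database:
>>> from mysite.blog.models import Blog
>>> b = Blog(name='Beatles Blog', tagline='All the latest Beatles news.')
>>> b.save()
This performs an INSERT SQL statement behind the scenes. Django doesn't hit the database until you explicitly call save().
The save() method has no return value.
To create an object and save it all in one step, see the create manager method discussed later in this appendix.
What Happens When You Save?
When you save an object, Django performs the following steps:
1. Emit a pre-save signal. This provides notification that an object is about to be saved. You can register a listener that will be invoked whenever this signal is emitted. At the time of publication, these signals were still in development and undocumented; check the online documentation for the latest information.
2. Preprocess the data. Each field on the object is asked to perform any automated data modification that the field may need.
Most fields do no preprocessing; the field data is kept as is. Preprocessing is used only on fields that have special behavior, such as file fields.
3. Prepare the data for the database. Each field is asked to provide its current value in a data type that can be written to the database.
Most fields require no data preparation. Simple data types, such as integers and strings, are "ready to write" as Python objects. However, more complex data types often require some modification. For example, DateFields use a Python datetime object to store data. Databases don't store datetime objects, so the field value must be converted into an ISO-compliant date string before it is inserted into the database.
4. Insert the data into the database. The preprocessed, prepared data is composed into an SQL statement for insertion into the database.
5. Emit a post-save signal. As with the pre-save signal, this is emitted after the object has been saved successfully. Again, these signals are not yet documented.
Autoincrementing Primary Keys
For convenience, each model is given an autoincrementing primary key field named id unless you explicitly specify primary_key=True on a field (see the section titled AutoField in Appendix B).
If your model has an AutoField, that autoincremented value will be calculated and saved as an attribute on your object the first time you call save():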
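>>> b2 = Blog(name='Cheddar Talk', tagline='Thoughts on cheese.')
>>> b2.id     # Returns None, because b2 doesn't have an ID yet.
None
>>> b2.save()
>>> b2.id     # Returns the ID of your new object.
14
There's no way to tell what the value of an ID will be before you call save(), because that value is calculated by your database, not by Django.
If you want to define a new object's AutoField value explicitly when saving, rather than relying on the database's automatic assignment, just set it before saving:
>>> b3 = Blog(id=3, name='Cheddar Talk', tagline='Thoughts on cheese.')
>>> b3.id
3
>>> b3.save()
>>> b3.id
3
If you assign auto-primary-key values manually, make sure not to use an already existing primary key value! If you create a new object with an explicit primary key value that already exists in the database, Django will assume you're changing the existing record rather than creating a new one.
Given the preceding 'Cheddar Talk' blog example, this example would override the previous record in the database:
>>> b4 = Blog(id=3, name='Not Cheddar', tagline='Anything but cheese.')
>>> b4.save()  # Overrides the previous blog with ID=3!
Explicitly specifying auto-primary-key values is mostly useful for bulk-saving objects, when you're confident you won't have a primary key collision.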
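Saving Changes to Objects
To save changes to an object that's already in the database, use save().
Given a Blog instance b5 that has already been saved to the database, this example changes its name and updates its record in the database:
>>> b5.name = 'New name'
>>> b5.save()
This performs an UPDATE SQL statement behind the scenes. Again, Django doesn't hit the database until you explicitly call save().
How Django Knows When to UPDATE and When to INSERT
You may have noticed that Django database objects use the same save() method for creating and changing objects. Django abstracts the need to use INSERT or UPDATE SQL statements. Specifically, when you call save(), Django follows this algorithm:
* If the object's primary key attribute is set to a value that evaluates to True (i.e., a value other than None or the empty string), Django executes a SELECT query to determine whether a record with the given primary key already exists.
* If the record with the given primary key does already exist, Django executes an UPDATE query.
* If the object's primary key attribute is not set, or if it's set but a record doesn't exist, Django executes an INSERT.
Because of this, you should be careful not to specify a primary key value explicitly when saving new objects if you cannot guarantee the primary key value is unused.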
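The decision rule above can be sketched in plain Python. This is an illustrative model only, not Django's implementation: the FakeTable class, its save() signature, and the dict-based storage are hypothetical stand-ins for a database table and the logic behind a model's save() method.

```python
# Illustrative sketch only: a dict stands in for a database table.
# FakeTable and its method names are hypothetical, not part of Django.
class FakeTable(object):
    def __init__(self):
        self.rows = {}          # primary key -> dict of column values
        self.next_id = 1        # simulates the database's auto-increment counter
        self.log = []           # records which kind of statement "ran"

    def save(self, pk, values):
        """Mimic the UPDATE-or-INSERT choice; return the primary key used."""
        if pk is not None and pk != '':
            # Primary key is set: check (like a SELECT) whether the row exists.
            if pk in self.rows:
                self.rows[pk] = dict(values)
                self.log.append('UPDATE')
                return pk
        else:
            # No primary key: let the "database" assign one.
            pk = self.next_id
        self.rows[pk] = dict(values)
        self.next_id = max(self.next_id, pk) + 1
        self.log.append('INSERT')
        return pk


table = FakeTable()
first = table.save(None, {'name': 'Cheddar Talk'})   # no pk: INSERT
table.save(first, {'name': 'Not Cheddar'})           # existing pk: UPDATE
table.save(3, {'name': 'Explicit pk'})               # unused pk: INSERT
```

The second call shows why reusing an existing primary key silently overwrites a record: the row is found, so the sketch takes the UPDATE branch.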
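Updating ForeignKey fields works exactly the same way; simply assign an object of the right type to the field in question. Here, entry is assumed to be an existing Entry instance:
>>> joes_blog = Blog.objects.create(name="Joe's Blog", tagline="Notes from Joe.")
>>> entry.blog = joes_blog
>>> entry.save()
Django will complain if you try to assign an object of the wrong type.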
Retrieving Objects
Throughout the book you've seen objects retrieved using code like the following:
>>> blogs = Blog.objects.filter(entry__authors__name__contains="Joe")
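There's quite a bit going on behind the scenes here: when you retrieve objects from the database, you're actually constructing a QuerySet using the Manager on your model class. This QuerySet knows how to execute SQL and return the requested objects.
Appendix B looked at both of these objects from a model-definition point of view; now we'll look at how they operate.
A QuerySet represents a collection of objects from your database. It can have zero, one, or many filters: criteria that narrow down the collection based on given parameters. In SQL terms, a QuerySet equates to a SELECT statement, and a filter is a limiting clause such as WHERE or LIMIT.
You get a QuerySet by using your model's Manager. Each model has at least one Manager, and it's called objects by default. Access it directly via the model class, like so:
>>> Blog.objects
<django.db.models.manager.Manager object at 0x137d00d>
Managers are accessible only via model classes, rather than from model instances, so as to enforce the separation between table-level operations and record-level operations:
>>> b = Blog(name='Foo', tagline='Bar')
>>> b.objects
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: Manager isn't accessible via Blog instances.
The Manager is the main source of QuerySets for a model. It acts as a root QuerySet that describes all objects in the model's database table. For example, Blog.objects is the initial QuerySet that contains all Blog objects in the database.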
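Caching and QuerySets
Each QuerySet contains a cache, to minimize database access. It's important to understand how it works in order to write the most efficient code.
In a newly created QuerySet, this cache is empty. The first time a QuerySet is evaluated (and, hence, a database query happens), Django saves the query results in the QuerySet's cache and returns the results that have been explicitly requested (e.g., the next element, if the QuerySet is being iterated over). Subsequent evaluations of the QuerySet reuse the cached results.
Keep this caching behavior in mind, because it may bite you if you don't use your QuerySets correctly. For example, the following will create two QuerySets, evaluate them, and throw them away:
print [e.headline for e in Entry.objects.all()]
print [e.pub_date for e in Entry.objects.all()]
That means the same database query will be executed twice, effectively doubling your database load. Also, there's a possibility the two lists may not include the same database records, because an Entry may have been added or deleted in the split second between the two requests.
To avoid this problem, simply save the QuerySet and reuse it: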
queryset = Entry.objects.all()
print [e.headline for e in queryset]  # Evaluate the query set.
print [e.pub_date for e in queryset]  # Reuse the cache from the evaluation.
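To make the caching behavior concrete, here is a minimal sketch in plain Python. It is not how Django implements QuerySet: CachingResults and fake_query are hypothetical names, and a function returning a list stands in for the database call.

```python
# Illustrative sketch only: CachingResults is a hypothetical stand-in for a
# QuerySet, and fake_query stands in for a real database call.
class CachingResults(object):
    def __init__(self, run_query):
        self._run_query = run_query
        self._cache = None      # empty until the first evaluation

    def __iter__(self):
        if self._cache is None:
            # First evaluation: run the query once and remember the rows.
            self._cache = list(self._run_query())
        return iter(self._cache)


hits = []

def fake_query():
    hits.append(1)              # count each simulated database hit
    return [{'headline': 'A', 'pub_date': '2006-01-01'},
            {'headline': 'B', 'pub_date': '2006-02-01'}]

results = CachingResults(fake_query)
headlines = [row['headline'] for row in results]   # runs the query
pub_dates = [row['pub_date'] for row in results]   # served from the cache
```

After both list comprehensions run, hits contains a single entry: the second pass never reaches fake_query.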
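Filtering Objects
The simplest way to retrieve objects from a table is to get all of them. To do this, use the all() method on a Manager:
>>> Entry.objects.all()
The all() method returns a QuerySet of all the objects in the database.
Usually, though, you'll need to select only a subset of the complete set of objects. To create such a subset, you refine the initial QuerySet, adding filter conditions. You'll usually do this using the filter() and/or exclude() methods:
>>> y2006 = Entry.objects.filter(pub_date__year=2006)
>>> not2006 = Entry.objects.exclude(pub_date__year=2006)
filter() and exclude() both take field lookup arguments, which are discussed in detail shortly.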
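Chaining Filters
The result of refining a QuerySet is itself a QuerySet, so it's possible to chain refinements together, for example:
>>> qs = Entry.objects.filter(headline__startswith='What')
>>> qs = qs.exclude(pub_date__gte=datetime.datetime.now())
>>> qs = qs.filter(pub_date__gte=datetime.datetime(2005, 1, 1))
This takes the initial QuerySet of all entries in the database, adds a filter, then an exclusion, and then another filter. The final result is a QuerySet containing all entries with a headline that starts with "What" that were published between January 1, 2005, and the current day.
It's important to point out here that QuerySets are lazy: the act of creating a QuerySet doesn't involve any database activity. In fact, the three preceding lines don't make any database calls; you can chain filters together all day long and Django won't actually run the query until the QuerySet is evaluated.
You can evaluate a QuerySet in any of the following ways:
Iterating: A QuerySet is iterable, and it executes its database query the first time you iterate over it. For example, the following QuerySet isn't evaluated until it's iterated over in the for loop:
qs = Entry.objects.filter(pub_date__year=2006)
qs = qs.filter(headline__icontains="bill")
for e in qs:
    print e.headline
This prints all headlines from 2006 that contain "bill" but causes only one database hit.
Printing it: A QuerySet is evaluated when you call repr() on it. This is for convenience in the Python interactive interpreter, so you can immediately see your results when using the API interactively.
Slicing: As explained in the upcoming "Limiting QuerySets" section, a QuerySet can be sliced using Python's array-slicing syntax. Usually slicing a QuerySet returns another (unevaluated) QuerySet, but Django will execute the database query if you use the step parameter of slice syntax.
Converting to a list: You can force evaluation of a QuerySet by calling list() on it, for example:
>>> entry_list = list(Entry.objects.all())
Be warned, though, that this could have a large memory overhead, because Django will load each element of the list into memory. In contrast, iterating over a QuerySet will take advantage of your database to load data and instantiate objects only as you need them.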
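Filtered QuerySets Are Unique
Each time you refine a QuerySet, you get a brand-new QuerySet that is in no way bound to the previous QuerySet. Each refinement creates a separate and distinct QuerySet that can be stored, used, and reused:
q1 = Entry.objects.filter(headline__startswith="What")
q2 = q1.exclude(pub_date__gte=datetime.datetime.now())
q3 = q1.filter(pub_date__gte=datetime.datetime.now())
These three QuerySets are separate. The first is a base QuerySet containing all entries whose headline starts with "What". The second is a subset of the first, with an additional criterion that excludes records whose pub_date is greater than now. The third is also a subset of the first, with an additional criterion that selects only the records whose pub_date is greater than now. The initial QuerySet (q1) is unaffected by the refinement process.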
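The two properties described so far, laziness and independence of each refinement, can be sketched without a database. The LazyQuery class below is hypothetical and filters an in-memory list with Python predicates instead of building SQL; it is only meant to show the shape of the idea.

```python
# Illustrative sketch only: LazyQuery is hypothetical and filters an
# in-memory list instead of building SQL.
class LazyQuery(object):
    def __init__(self, rows, predicates=()):
        self._rows = rows
        self._predicates = tuple(predicates)

    def filter(self, predicate):
        # Return a new object; the current one is left untouched.
        return LazyQuery(self._rows, self._predicates + (predicate,))

    def exclude(self, predicate):
        return self.filter(lambda row: not predicate(row))

    def __iter__(self):
        # Work happens only here, when the results are actually requested.
        for row in self._rows:
            if all(p(row) for p in self._predicates):
                yield row


rows = [{'headline': 'What now', 'year': 2004},
        {'headline': 'What next', 'year': 2006},
        {'headline': 'Who knows', 'year': 2006}]

q1 = LazyQuery(rows).filter(lambda r: r['headline'].startswith('What'))
q2 = q1.exclude(lambda r: r['year'] >= 2006)
q3 = q1.filter(lambda r: r['year'] >= 2006)
```

Because filter() and exclude() build a new object each time, iterating q2 or q3 never changes what q1 yields.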
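Limiting QuerySets
Use Python's array-slicing syntax to limit your QuerySet to a certain number of results. This is the equivalent of SQL's LIMIT and OFFSET clauses.
For example, this returns the first five entries (LIMIT 5):
>>> Entry.objects.all()[:5]
This returns the sixth through tenth entries (OFFSET 5 LIMIT 5):
>>> Entry.objects.all()[5:10]
Generally, slicing a QuerySet returns a new QuerySet; it doesn't evaluate the query. An exception is if you use the step parameter of Python slice syntax. For example, this would actually execute the query in order to return a list of every second object of the first ten:
>>> Entry.objects.all()[:10:2]
To retrieve a single object rather than a list (e.g., SELECT foo FROM bar LIMIT 1), use a simple index instead of a slice. For example, this returns the first Entry in the database, after ordering entries alphabetically by headline:
>>> Entry.objects.order_by('headline')[0]
This is roughly equivalent to the following:
>>> Entry.objects.order_by('headline')[0:1].get()
Note, however, that if no objects match the given criteria, the first of these will raise IndexError, while the second will raise DoesNotExist.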
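The correspondence between a Python slice and LIMIT/OFFSET can be illustrated with a small helper. The function slice_to_sql is hypothetical (Django exposes no such function) and handles only simple non-negative slices without a step.

```python
# Illustrative sketch only: slice_to_sql is a hypothetical helper, not a
# Django function. It handles simple non-negative slices without a step.
def slice_to_sql(s):
    start = s.start or 0
    parts = []
    if s.stop is not None:
        # The number of rows wanted is the distance between the bounds.
        parts.append('LIMIT %d' % (s.stop - start))
    if start:
        parts.append('OFFSET %d' % start)
    return ' '.join(parts)


first_five = slice_to_sql(slice(None, 5))    # like [:5]
next_five = slice_to_sql(slice(5, 10))       # like [5:10]
```

Here first_five is 'LIMIT 5' and next_five is 'LIMIT 5 OFFSET 5'; the exact clause order and syntax vary between database engines.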
QuerySet Methods That Return New QuerySets
Django provides a range of QuerySet refinement methods that modify either the types of results returned by the QuerySet or the way its SQL query is executed. These methods are described in the sections that follow. Some of the methods take field lookup arguments, which are discussed in detail a bit later on.
filter(**lookup)
Returns a new QuerySet containing objects that match the given lookup parameters.
exclude(**kwargs)
Returns a new QuerySet containing objects that do not match the given lookup parameters.
order_by(*fields)
By default, results returned by a QuerySet are ordered by the ordering tuple given by the ordering option in the model's metadata (see Appendix B). You can override this for a particular query using the order_by() method:
>>> Entry.objects.filter(pub_date__year=2005).order_by('-pub_date', 'headline')
This result will be ordered by pub_date descending, then by headline ascending. The negative sign in front of "-pub_date" indicates descending order. Ascending order is assumed if the - is absent. To order randomly, use "?", like so:
>>> Entry.objects.order_by('?')
distinct()
Returns a new QuerySet that uses SELECT DISTINCT in its SQL query. This eliminates duplicate rows from the query results.
By default, a QuerySet will not eliminate duplicate rows. In practice, this is rarely a problem, because simple queries such as Blog.objects.all() don't introduce the possibility of duplicate result rows.
However, if your query spans multiple tables, it's possible to get duplicate results when a QuerySet is evaluated. That's when you'd use distinct().
values(*fields)
Returns a special QuerySet that evaluates to a list of dictionaries instead of model-instance objects. Each of those dictionaries represents an object, with the keys corresponding to the attribute names of model objects:
# This list contains a Blog object.
>>> Blog.objects.filter(name__startswith='Beatles')
[Beatles Blog]
# This list contains a dictionary.
>>> Blog.objects.filter(name__startswith='Beatles').values()
[{'id': 1, 'name': 'Beatles Blog', 'tagline': 'All the latest Beatles news.'}]
values() takes optional positional arguments, *fields, which specify field names to which the SELECT should be limited. If you specify the fields, each dictionary will contain only the field keys/values for the fields you specify. If you don't specify the fields, each dictionary will contain a key and value for every field in the database table:
>>> Blog.objects.values()
[{'id': 1, 'name': 'Beatles Blog', 'tagline': 'All the latest Beatles news.'}]
>>> Blog.objects.values('id', 'name')
[{'id': 1, 'name': 'Beatles Blog'}]
This method is useful when you know you're only going to need values from a small number of the available fields and you won't need the functionality of a model instance object. It's more efficient to select only the fields you need to use.
dates(field, kind, order)
Returns a special QuerySet that evaluates to a list of datetime.datetime objects representing all available dates of a particular kind within the contents of the QuerySet.
The field argument must be the name of a DateField or DateTimeField of your model. The kind argument must be either "year", "month", or "day". Each datetime.datetime object in the result list is truncated to the given type:
* "year" returns a list of all distinct year values for the field.
* "month" returns a list of all distinct year/month values for the field.
* "day" returns a list of all distinct year/month/day values for the field.
order, which defaults to 'ASC', should be either 'ASC' or 'DESC'. This specifies how to order the results.
Here are a few examples:
>>> Entry.objects.dates('pub_date', 'year')
[datetime.datetime(2005, 1, 1)]
>>> Entry.objects.dates('pub_date', 'month')
[datetime.datetime(2005, 2, 1), datetime.datetime(2005, 3, 1)]
>>> Entry.objects.dates('pub_date', 'day')
[datetime.datetime(2005, 2, 20), datetime.datetime(2005, 3, 20)]
>>> Entry.objects.dates('pub_date', 'day', order='DESC')
[datetime.datetime(2005, 3, 20), datetime.datetime(2005, 2, 20)]
>>> Entry.objects.filter(headline__contains='Lennon').dates('pub_date', 'day')
[datetime.datetime(2005, 3, 20)]
select_related()
Returns a QuerySet that will automatically follow foreign key relationships, selecting that additional related-object data when it executes its query. This is a performance booster that results in (sometimes much) larger queries but means later use of foreign key relationships won't require database queries.
The following examples illustrate the difference between plain lookups and select_related() lookups. Here's a standard lookup:
# Hits the database.
>>> e = Entry.objects.get(id=5)
# Hits the database again to get the related Blog object.
>>> b = e.blog
And here's a select_related lookup:
# Hits the database.
>>> e = Entry.objects.select_related().get(id=5)
# Doesn't hit the database, because e.blog has been prepopulated
# in the previous query.
>>> b = e.blog
select_related() follows foreign keys as far as possible. If you have the following models:
class City(models.Model):
    # ...

class Person(models.Model):
    # ...
    hometown = models.ForeignKey(City)

class Book(models.Model):
    # ...
    author = models.ForeignKey(Person)
then a call to Book.objects.select_related().get(id=4) will cache the related Person and the related City:
>>> b = Book.objects.select_related().get(id=4)
>>> p = b.author         # Doesn't hit the database.
>>> c = p.hometown       # Doesn't hit the database.
>>> b = Book.objects.get(id=4)  # No select_related() in this example.
>>> p = b.author         # Hits the database.
>>> c = p.hometown       # Hits the database.
Note that select_related() does not follow foreign keys that have null=True.
Usually, using select_related() can vastly improve performance because your application can avoid many database calls. However, in situations with deeply nested sets of relationships, select_related() can sometimes end up following too many relations and can generate queries so large that they end up being slow.
extra()
Sometimes, the Django query syntax by itself can't easily express a complex WHERE clause. For these edge cases, Django provides the extra() QuerySet modifier, a hook for injecting specific clauses into the SQL generated by a QuerySet.
By definition, these extra lookups may not be portable to different database engines (because you're explicitly writing SQL code) and violate the DRY principle, so you should avoid them if possible.
Specify one or more of params, select, where, or tables. None of the arguments is required, but you should use at least one of them.
The select argument lets you put extra fields in the SELECT clause. It should be a dictionary mapping attribute names to SQL clauses to use to calculate that attribute:
>>> Entry.objects.extra(select={'is_recent': "pub_date > '2006-01-01'"})
As a result, each Entry object will have an extra attribute, is_recent, a Boolean representing whether the entry's pub_date is greater than January 1, 2006.
The next example is more advanced; it does a subquery to give each resulting Blog object an entry_count attribute, an integer count of associated Entry objects:
>>> subq = 'SELECT COUNT(*) FROM blog_entry WHERE blog_entry.blog_id = blog_blog.id'
>>> Blog.objects.extra(select={'entry_count': subq})
(In this particular case, we're exploiting the fact that the query will already contain the blog_blog table in its FROM clause.)
You can define explicit SQL WHERE clauses (perhaps to perform nonexplicit joins) by using where. You can manually add tables to the SQL FROM clause by using tables.
where and tables both take a list of strings. All where parameters are ANDed to any other search criteria:
>>> Entry.objects.extra(where=['id IN (3, 4, 5, 20)'])
The select and where parameters described previously may use standard Python database string placeholders: '%s' to indicate parameters the database engine should automatically quote. The params argument is a list of any extra parameters to be substituted:
>>> Entry.objects.extra(where=['headline=%s'], params=['Lennon'])
Always use params instead of embedding values directly into select or where, because params will ensure values are quoted correctly according to your particular database.
Here's an example of the wrong way:
Entry.objects.extra(where=["headline='%s'" % name])
Here's an example of the correct way:
Entry.objects.extra(where=['headline=%s'], params=[name])
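The difference between the two approaches can be reproduced outside Django with Python's standard sqlite3 module. This is only an analogy: the table and data below are made up, and sqlite3 uses "?" placeholders where the extra() examples use "%s".

```python
import sqlite3

# Illustrative sketch only: the table and data are made up, and sqlite3 uses
# "?" placeholders where the extra() examples above use "%s".
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE blog_entry (id INTEGER PRIMARY KEY, headline TEXT)')
conn.executemany('INSERT INTO blog_entry (headline) VALUES (?)',
                 [("Lennon",), ("O'Reilly",)])

name = "O'Reilly"   # the apostrophe breaks naive string interpolation

# Wrong way: the value is pasted into the SQL text, producing invalid SQL.
try:
    conn.execute("SELECT id FROM blog_entry WHERE headline='%s'" % name)
    interpolation_failed = False
except sqlite3.OperationalError:
    interpolation_failed = True

# Right way: the value is passed separately and the driver handles quoting.
rows = conn.execute('SELECT headline FROM blog_entry WHERE headline=?',
                    (name,)).fetchall()
```

A harmless apostrophe is enough to break the interpolated query; a malicious value could do far worse, which is the main reason to pass values separately.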
QuerySet Methods That Do Not Return QuerySets
The following QuerySet methods evaluate the QuerySet and return something other than a QuerySet: a single object, value, and so forth.
get(**lookup)
Returns the object matching the given lookup parameters, which should be in the format described in the Field Lookups section. This raises AssertionError if more than one object was found.
get() raises a DoesNotExist exception if an object wasn't found for the given parameters. The DoesNotExist exception is an attribute of the model class, for example:
>>> Entry.objects.get(id='foo')  # raises Entry.DoesNotExist
The DoesNotExist exception inherits from django.core.exceptions.ObjectDoesNotExist, so you can target multiple DoesNotExist exceptions:
>>> from django.core.exceptions import ObjectDoesNotExist
>>> try:
...     e = Entry.objects.get(id=3)
...     b = Blog.objects.get(id=1)
... except ObjectDoesNotExist:
...     print "Either the entry or blog doesn't exist."
create(**kwargs)
This is a convenience method for creating an object and saving it all in one step. It lets you compress two common steps:
>>> p = Person(first_name="Bruce", last_name="Springsteen")
>>> p.save()
into a single line:
>>> p = Person.objects.create(first_name="Bruce", last_name="Springsteen")
get_or_create(**kwargs)
This is a convenience method for looking up an object and creating one if it doesn't exist. It returns a tuple of (object, created), where object is the retrieved or created object and created is a Boolean specifying whether a new object was created.
This method is meant as a shortcut to boilerplate code and is mostly useful for data-import scripts, for example:
try:
    obj = Person.objects.get(first_name='John', last_name='Lennon')
except Person.DoesNotExist:
    obj = Person(first_name='John', last_name='Lennon', birthday=date(1940, 10, 9))
    obj.save()
This pattern gets quite unwieldy as the number of fields in a model increases. The previous example can be rewritten using get_or_create() like so:
obj, created = Person.objects.get_or_create(
    first_name='John',
    last_name='Lennon',
    defaults={'birthday': date(1940, 10, 9)}
)
Any keyword arguments passed to get_or_create(), except an optional one called defaults, will be used in a get() call. If an object is found, get_or_create() returns a tuple of that object and False. If an object is not found, get_or_create() will instantiate and save a new object, returning a tuple of the new object and True. The new object will be created according to this algorithm:
defaults = kwargs.pop('defaults', {})
params = dict([(k, v) for k, v in kwargs.items() if '__' not in k])
params.update(defaults)
obj = self.model(**params)
obj.save()
In English, that means start with any non-'defaults' keyword argument that doesn't contain a double underscore (which would indicate a nonexact lookup). Then add the contents of defaults, overriding any keys if necessary, and use the result as the keyword arguments to the model class.
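The lookup-then-create flow can be sketched end to end in plain Python. This is a simplified illustration, not Django's code: the get_or_create function below is a hypothetical helper that treats a list of dicts as the table and supports only exact-match lookups.

```python
# Illustrative sketch only: an in-memory list of dicts stands in for the
# table, and this get_or_create is a hypothetical helper, not Django's code.
def get_or_create(table, **kwargs):
    defaults = kwargs.pop('defaults', {})
    # Look for an existing row matching every remaining keyword argument.
    for row in table:
        if all(row.get(k) == v for k, v in kwargs.items()):
            return row, False
    # Not found: build the new row from exact lookups plus the defaults.
    params = dict((k, v) for k, v in kwargs.items() if '__' not in k)
    params.update(defaults)
    table.append(params)
    return params, True


people = []
lennon, created_first = get_or_create(
    people, first_name='John', last_name='Lennon',
    defaults={'birthday': '1940-10-09'})
same, created_second = get_or_create(
    people, first_name='John', last_name='Lennon',
    defaults={'birthday': '1900-01-01'})
```

The second call finds the existing row, so its defaults are ignored and the stored birthday stays unchanged.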
If you have a field named defaults and want to use it as an exact lookup in get_or_create(), just use 'defaults__exact' like so:
Foo.objects.get_or_create(
    defaults__exact='bar',
    defaults={'defaults': 'baz'}
)
Note
As mentioned earlier, get_or_create() is mostly useful in scripts that need to parse data and create new records if existing ones aren't available. But if you need to use get_or_create() in a view, please make sure to use it only in POST requests unless you have a good reason not to. GET requests shouldn't have any effect on data; use POST whenever a request to a page has a side effect on your data.
count()
Returns an integer representing the number of objects in the database matching the QuerySet. count() never raises exceptions. Here's an example:
# Returns the total number of entries in the database.
>>> Entry.objects.count()
4
# Returns the number of entries whose headline contains 'Lennon'
>>> Entry.objects.filter(headline__contains='Lennon').count()
1
count() performs a SELECT COUNT(*) behind the scenes, so you should always use count() rather than loading all of the records into Python objects and calling len() on the result.
Depending on which database you're using (e.g., PostgreSQL or MySQL), count() may return a long integer instead of a normal Python integer. This is an underlying implementation quirk that shouldn't pose any real-world problems.
in_bulk(id_list)
Takes a list of primary key values and returns a dictionary mapping each primary key value to an instance of the object with the given ID, for example:
>>> Blog.objects.in_bulk([1])
{1: Beatles Blog}
>>> Blog.objects.in_bulk([1, 2])
{1: Beatles Blog, 2: Cheddar Talk}
>>> Blog.objects.in_bulk([])
{}
IDs of objects that don't exist are silently dropped from the result dictionary. If you pass in_bulk() an empty list, you'll get an empty dictionary.
latest(field_name=None)
Returns the latest object in the table, by date, using the field_name provided as the date field. This example returns the latest Entry in the table, according to the pub_date field:
>>> Entry.objects.latest('pub_date')
If your model's Meta specifies get_latest_by, you can leave off the field_name argument to latest(). Django will use the field specified in get_latest_by by default.
Like get(), latest() raises DoesNotExist if an object doesn't exist with the given parameters.
Field Lookups
Field lookups are how you specify the meat of an SQL WHERE clause. They're specified as keyword arguments to the QuerySet methods filter(), exclude(), and get().
Basic lookup keyword arguments take the form field__lookuptype=value (note the double underscore). For example:
>>> Entry.objects.filter(pub_date__lte='2006-01-01')
translates (roughly) into the following SQL:
SELECT * FROM blog_entry WHERE pub_date <= '2006-01-01';
If you pass an invalid keyword argument, a lookup function will raise TypeError.
The supported lookup types follow.
exact
Performs an exact match:
>>> Entry.objects.get(headline__exact="Man bites dog")
This matches any object with the exact headline "Man bites dog".
If you don't provide a lookup type (that is, if your keyword argument doesn't contain a double underscore), the lookup type is assumed to be exact.
For example, the following two statements are equivalent:
>>> Blog.objects.get(id__exact=14)  # Explicit form
>>> Blog.objects.get(id=14)         # __exact is implied
This is for convenience, because exact lookups are the common case.
iexact
Performs a case-insensitive exact match:
>>> Blog.objects.get(name__iexact='beatles blog')
This will match 'Beatles Blog', 'beatles blog', 'BeAtLes BLoG', and so forth.
contains
Performs a case-sensitive containment test:
>>> Entry.objects.get(headline__contains='Lennon')
This will match the headline 'Today Lennon honored' but not 'today lennon honored'.
SQLite doesn't support case-sensitive LIKE statements; when using SQLite, contains acts like icontains.
Escaping Percent Signs and Underscores in LIKE Statements
The field lookups that equate to LIKE SQL statements (iexact, contains, icontains, startswith, istartswith, endswith, and iendswith) will automatically escape the two special characters used in LIKE statements: the percent sign and the underscore. (In a LIKE statement, the percent sign signifies a multiple-character wildcard and the underscore signifies a single-character wildcard.)
This means things should work intuitively, so the abstraction doesn't leak. For example, to retrieve all the entries that contain a percent sign, just use the percent sign as any other character:
Entry.objects.filter(headline__contains='%')
Django takes care of the quoting for you. The resulting SQL will look something like this:
SELECT ... WHERE headline LIKE '%\%%';
The same goes for underscores. Both percent signs and underscores are handled for you transparently.
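To see what this escaping amounts to, here is a minimal sketch using Python's standard sqlite3 module. The helper escape_like and the table are hypothetical; the sketch mirrors the idea of the escaping, not Django's actual code, and SQLite needs an explicit ESCAPE clause to treat the backslash as an escape character.

```python
import sqlite3

# Illustrative sketch only: escape_like is a hypothetical helper. The
# ESCAPE clause tells SQLite which character marks a literal % or _.
def escape_like(text):
    return (text.replace('\\', '\\\\')
                .replace('%', '\\%')
                .replace('_', '\\_'))


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE blog_entry (headline TEXT)')
conn.executemany('INSERT INTO blog_entry VALUES (?)',
                 [('Sales up 10%',), ('Sales flat',)])

# Wrap the escaped text in wildcards for a "contains" test.
pattern = '%' + escape_like('%') + '%'
matches = conn.execute(
    "SELECT headline FROM blog_entry WHERE headline LIKE ? ESCAPE '\\'",
    (pattern,)).fetchall()
```

Without the escaping step the pattern would be '%%%', which matches every row rather than only the headline containing a literal percent sign.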
icontains
Performs a case-insensitive containment test:
>>> Entry.objects.get(headline__icontains='Lennon')
Unlike contains, icontains will match 'today lennon honored'.
gt, gte, lt, and lte
These represent greater than, greater than or equal to, less than, and less than or equal to:
>>> Entry.objects.filter(id__gt=4)
>>> Entry.objects.filter(id__lt=15)
>>> Entry.objects.filter(id__gte=0)
These queries return any object with an ID greater than 4, an ID less than 15, and an ID greater than or equal to 0, respectively.
You'll usually use these on numeric fields. Be careful with character fields, since character order isn't always what you'd expect (i.e., the string "4" sorts after the string "10").
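The character-ordering pitfall is easy to reproduce in Python, whose string comparison works character by character much like a database's default text ordering. The values below are arbitrary examples.

```python
# Strings are compared character by character, so '10' sorts before '4'
# because '1' comes before '4'. Numbers compare by value instead.
as_text = sorted(['4', '10', '9'])
as_numbers = sorted([4, 10, 9])
text_greater = '4' > '10'
```

Here as_text is ['10', '4', '9'] while as_numbers is [4, 9, 10], so a gt or lt lookup on a character field holding digits may not behave like a numeric comparison.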
in
Filters where a value is on a given list:
>>> Entry.objects.filter(id__in=[1, 3, 4])
This returns all objects with the ID 1, 3, or 4.
startswith
Performs a case-sensitive starts-with:
>>> Entry.objects.filter(headline__startswith='Will')
This will return the headlines "Will he run?" and "Willbur named judge", but not "Who is Will?" or "will found in crypt".
istartswith
Performs a case-insensitive starts-with:
>>> Entry.objects.filter(headline__istartswith='will')
This will return the headlines "Will he run?", "Willbur named judge", and "will found in crypt", but not "Who is Will?"
endswith and iendswith
Perform case-sensitive and case-insensitive ends-with:
>>> Entry.objects.filter(headline__endswith='cats')
>>> Entry.objects.filter(headline__iendswith='cats')
range
Performs an inclusive range check:
>>> start_date = datetime.date(2005, 1, 1)
>>> end_date = datetime.date(2005, 3, 31)
>>> Entry.objects.filter(pub_date__range=(start_date, end_date))
You can use range anywhere you can use BETWEEN in SQL: for dates, numbers, and even characters.
year, month, and day
For date/datetime fields, perform exact year, month, or day matches:
# Year lookup
>>> Entry.objects.filter(pub_date__year=2005)
# Month lookup -- takes integers
>>> Entry.objects.filter(pub_date__month=12)
# Day lookup
>>> Entry.objects.filter(pub_date__day=3)
# Combination: return all entries on Christmas of any year
>>> Entry.objects.filter(pub_date__month=12, pub_date__day=25)
isnull
Takes either True or False, which correspond to SQL queries of IS NULL and IS NOT NULL, respectively:
>>> Entry.objects.filter(pub_date__isnull=True)
__isnull=True vs. __exact=None
There is an important difference between __isnull=True and __exact=None. __exact=None will always return an empty result set, because SQL requires that no value is equal to NULL. __isnull determines if the field is currently holding the value of NULL without performing a comparison.
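The underlying SQL behavior can be checked directly with Python's standard sqlite3 module. The table and rows below are made up for the demonstration; the point is only how "= NULL" and "IS NULL" differ.

```python
import sqlite3

# Illustrative sketch only: the table and rows are made up for the demo.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE blog_entry (headline TEXT, pub_date TEXT)')
conn.executemany('INSERT INTO blog_entry VALUES (?, ?)',
                 [('Dated', '2006-01-01'), ('Undated', None)])

# "= NULL" never matches, even for rows where the column is NULL.
equals_null = conn.execute(
    'SELECT headline FROM blog_entry WHERE pub_date = NULL').fetchall()

# "IS NULL" tests for the absence of a value.
is_null = conn.execute(
    'SELECT headline FROM blog_entry WHERE pub_date IS NULL').fetchall()
```

The first query returns no rows at all, while the second returns the one entry whose pub_date is missing.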
search
A Boolean full-text search that takes advantage of full-text indexing. This is like contains but is significantly faster due to full-text indexing.
Note that this is available only in MySQL and requires direct manipulation of the database to add the full-text index.
The pk Lookup Shortcut
For convenience, Django provides a pk lookup type, which stands for primary_key.
In the example Blog model, the primary key is the id field, so these three statements are equivalent:
>>> Blog.objects.get(id__exact=14)  # Explicit form
>>> Blog.objects.get(id=14)         # __exact is implied
>>> Blog.objects.get(pk=14)         # pk implies id__exact
The use of pk isn't limited to __exact queries; any query term can be combined with pk to perform a query on the primary key of a model:
# Get blogs with id 1, 4, and 7
>>> Blog.objects.filter(pk__in=[1,4,7])
# Get all blogs with id > 14
>>> Blog.objects.filter(pk__gt=14)
pk lookups also work across joins. For example, these three statements are equivalent:
>>> Entry.objects.filter(blog__id__exact=3)  # Explicit form
>>> Entry.objects.filter(blog__id=3)         # __exact is implied
>>> Entry.objects.filter(blog__pk=3)         # __pk implies __id__exact
Complex Lookups with Q Objects
Keyword argument queries in filter() and so on are ANDed together. If you need to execute more complex queries (e.g., queries with OR statements), you can use Q objects.
A Q object (django.db.models.Q) is an object used to encapsulate a collection of keyword arguments. These keyword arguments are specified as in the Field Lookups section.
For example, this Q object encapsulates a single LIKE query:
Q(question__startswith='What')
Q objects can be combined using the & and | operators. When an operator is used on two Q objects, it yields a new Q object. For example, this statement yields a single Q object that represents the OR of two "question__startswith" queries:
Q(question__startswith='Who') | Q(question__startswith='What')
This is equivalent to the following SQL WHERE clause:
WHERE question LIKE 'Who%' OR question LIKE 'What%'
You can compose statements of arbitrary complexity by combining Q objects with the & and | operators. You can also use parenthetical grouping.
Each lookup function that takes keyword arguments (e.g., filter(), exclude(), get()) can also be passed one or more Q objects as positional (not-named) arguments. If you provide multiple Q object arguments to a lookup function, the arguments will be ANDed together, for example:
Poll.objects.get(
    Q(question__startswith='Who'),
    Q(pub_date=date(2005, 5, 2)) | Q(pub_date=date(2005, 5, 6))
)
roughly translates into the following SQL:
SELECT * from polls WHERE question LIKE 'Who%'
    AND (pub_date = '2005-05-02' OR pub_date = '2005-05-06')
Lookup functions can mix the use of Q objects and keyword arguments. All arguments provided to a lookup function (be they keyword arguments or Q objects) are ANDed together. However, if a Q object is provided, it must precede the definition of any keyword arguments. For example, the following:
Poll.objects.get(
    Q(pub_date=date(2005, 5, 2)) | Q(pub_date=date(2005, 5, 6)),
    question__startswith='Who')
would be a valid query, equivalent to the previous example, but this:
# INVALID QUERY
Poll.objects.get(
    question__startswith='Who',
    Q(pub_date=date(2005, 5, 2)) | Q(pub_date=date(2005, 5, 6)))
would not be valid.
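The way the & and | operators build up a combined condition can be sketched with ordinary operator overloading. MiniQ and where below are hypothetical, heavily simplified stand-ins: they wrap ready-made SQL fragments, whereas the real django.db.models.Q works from keyword lookups.

```python
# Illustrative sketch only: MiniQ is a hypothetical, simplified stand-in for
# django.db.models.Q that simply wraps ready-made SQL fragments.
class MiniQ(object):
    def __init__(self, sql):
        self.sql = sql

    def __and__(self, other):
        # q1 & q2 yields a new object representing both conditions.
        return MiniQ('(%s AND %s)' % (self.sql, other.sql))

    def __or__(self, other):
        # q1 | q2 yields a new object representing either condition.
        return MiniQ('(%s OR %s)' % (self.sql, other.sql))


def where(*qs):
    # Positional arguments are ANDed together, as described above.
    return 'WHERE ' + ' AND '.join(q.sql for q in qs)


clause = where(
    MiniQ("question LIKE 'Who%'"),
    MiniQ("pub_date = '2005-05-02'") | MiniQ("pub_date = '2005-05-06'"))
```

The resulting clause has the same shape as the SQL shown for the Poll example: the OR group is parenthesized and then ANDed with the first condition. Incidentally, the invalid query above is rejected by Python itself, since positional arguments may not follow keyword arguments in a call.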
You can find some examples online at http://www.djangoproject.com/documentation/0.96/models/or_lookups/.
Related Objects
When you define a relationship in a model (i.e., a ForeignKey, OneToOneField, or ManyToManyField), instances of that model will have a convenient API to access the related object(s).
For example, an Entry object e can get its associated Blog object by accessing the blog attribute e.blog.
Django also creates API accessors for the other side of the relationship: the link from the related model to the model that defines the relationship. For example, a Blog object b has access to a list of all related Entry objects via the entry_set attribute: b.entry_set.all().
All examples in this section use the sample Blog, Author, and Entry models defined at the beginning of this appendix.
Lookups That Span Relationships
Django offers a powerful and intuitive way to follow relationships in lookups, taking care of the SQL JOINs for you automatically behind the scenes. To span a relationship, just use the field name of related fields across models, separated by double underscores, until you get to the field you want.
This example retrieves all Entry objects with a Blog whose name is 'Beatles Blog':
>>> Entry.objects.filter(blog__name__exact='Beatles Blog')
This spanning can be as deep as you'd like.
It works backward, too. To refer to a reverse relationship, just use the lowercase name of the model.
This example retrieves all Blog objects that have at least one Entry whose headline contains 'Lennon':
>>> Blog.objects.filter(entry__headline__contains='Lennon')
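Foreign Key Relationships
If a model has a ForeignKey, instances of that model will have access to the related (foreign) object via a simple attribute of the model, for example:
e = Entry.objects.get(id=2)
e.blog # Returns the related Blog object.
You can get and set the related object via the foreign key attribute. As you may expect, changes to the foreign key aren't saved to the database until you call save(), for example:
e = Entry.objects.get(id=2)
e.blog = some_blog
e.save()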
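If a ForeignKey field has null=True set (i.e., it allows NULL values), you can assign None to it. (Translator's note: in practice, setting null=True alone may not be enough and can raise an exception unless blank=True is set as well; the reason is unclear, and this may be a bug.)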
e = Entry.objects.get(id=2)
e.blog = None
e.save() # "UPDATE blog_entry SET blog_id = NULL ...;"
Forward access to one-to-many relationships is cached the first time the related object is accessed. Subsequent accesses to the foreign key on the same object instance use the cached value, for example:
e = Entry.objects.get(id=2)
print e.blog  # Hits the database to retrieve the associated Blog.
print e.blog  # Doesn't hit the database; uses cached version.
Note that the select_related() QuerySet method recursively prepopulates the cache of all one-to-many relationships ahead of time:
e = Entry.objects.select_related().get(id=2)
print e.blog  # Doesn't hit the database; uses cached version.
print e.blog  # Doesn't hit the database; uses cached version.
select_related() is documented in the QuerySet Methods That Return New QuerySets section.
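The per-instance caching of a related object can be sketched with a plain Python property. This is not Django's mechanism: EntrySketch, BLOGS, and lookups are hypothetical names, and a dict lookup stands in for the database query.

```python
# Illustrative sketch only: these names are hypothetical, and a dict lookup
# stands in for the query that fetches the related Blog.
BLOGS = {1: 'Beatles Blog'}
lookups = []

class EntrySketch(object):
    def __init__(self, blog_id):
        self.blog_id = blog_id
        self._blog_cache = None

    @property
    def blog(self):
        if self._blog_cache is None:
            lookups.append(self.blog_id)      # simulated database hit
            self._blog_cache = BLOGS[self.blog_id]
        return self._blog_cache


e = EntrySketch(blog_id=1)
first = e.blog     # performs the lookup
second = e.blog    # reuses the cached value
```

Only one lookup is recorded for the two accesses. In these terms, select_related() would correspond to filling _blog_cache when the entry itself is loaded.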
Reverse Foreign Key Relationships
Foreign key relationships are automatically symmetrical: a reverse relationship is inferred from the presence of a ForeignKey pointing to another model.
If a model has a ForeignKey, instances of the foreign key model will have access to a Manager that returns all instances of the first model. By default, this Manager is named FOO_set, where FOO is the source model name, lowercased. This Manager returns QuerySets, which can be filtered and manipulated as described in the "Retrieving Objects" section.
Here's an example:
b = Blog.objects.get(id=1)
b.entry_set.all() # Returns all Entry objects related to Blog.
# b.entry_set is a Manager that returns QuerySets.
b.entry_set.filter(headline__contains='Lennon')
b.entry_set.count()
You can override the FOO_set name by setting the related_name parameter in the ForeignKey() definition. For example, if the Entry model was altered to blog = ForeignKey(Blog, related_name='entries'), the preceding example code would look like this:
b = Blog.objects.get(id=1)
b.entries.all() # Returns all Entry objects related to Blog.
# b.entries is a Manager that returns QuerySets.
b.entries.filter(headline__contains='Lennon')
b.entries.count()
You cannot access a reverse ForeignKey Manager from the class; it must be accessed from an instance:
Blog.entry_set # Raises AttributeError: "Manager must be accessed via instance".
In addition to the QuerySet methods defined in the "Retrieving Objects" section, the ForeignKey Manager has these additional methods:
add(obj1, obj2, ...): Adds the specified model objects to the related object set, for example:
b = Blog.objects.get(id=1)
e = Entry.objects.get(id=234)
b.entry_set.add(e) # Associates Entry e with Blog b.
create(**kwargs): Creates a new object, saves it, and puts it in the related object set. It returns the newly created object:
b = Blog.objects.get(id=1)
e = b.entry_set.create(headline='Hello', body_text='Hi', pub_date=datetime.date(2005, 1, 1))
# No need to call e.save() at this point -- it's already been saved.
This is equivalent to (but much simpler than) the following:
b = Blog.objects.get(id=1)
e = Entry(blog=b, headline='Hello', body_text='Hi', pub_date=datetime.date(2005, 1, 1))
e.save()
Note that there's no need to specify the keyword argument of the model that defines the relationship. In the preceding example, we don't pass the parameter blog to create(). Django figures out that the new Entry object's blog field should be set to b.
remove(obj1, obj2, ...): Removes the specified model objects from the related object set:
b = Blog.objects.get(id=1)
e = Entry.objects.get(id=234)
b.entry_set.remove(e) # Disassociates Entry e from Blog b.
In order to prevent database inconsistency, this method only exists on ForeignKey objects where null=True. If the related field can't be set to None (NULL), then an object can't be removed from a relation without being added to another. In the preceding example, removing e from b.entry_set is equivalent to doing e.blog = None, and because the blog ForeignKey doesn't have null=True, this is invalid.
clear(): Removes all objects from the related object set:
b = Blog.objects.get(id=1)
b.entry_set.clear()
Note that this doesn't delete the related objects; it just disassociates them.
Just like remove(), clear() is only available on ForeignKeys where null=True.
To assign the members of a related set in one fell swoop, just assign to it from any iterable object, for example:
b = Blog.objects.get(id=1)
b.entry_set = [e1, e2]
If the clear() method is available, any pre-existing objects will be removed from the entry_set before all objects in the iterable (in this case, a list) are added to the set. If the clear() method is not available, all objects in the iterable will be added without removing any existing elements.
Each reverse operation described in this section has an immediate effect on the database. Every addition, creation, and deletion is immediately and automatically saved to the database.
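One way to picture these immediate effects is as plain UPDATE statements on the foreign key column. The following is a minimal sketch with Python's sqlite3 module; the blog_entry table layout is an assumption, the column is nullable to mirror ForeignKey(Blog, null=True), and the statements only approximate what the manager methods do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog_entry (
        id INTEGER PRIMARY KEY,
        blog_id INTEGER NULL,      -- nullable, like ForeignKey(Blog, null=True)
        headline TEXT
    );
    INSERT INTO blog_entry (id, blog_id, headline) VALUES
        (234, NULL, 'Hello'),
        (235, 1, 'Goodbye');
""")


def blog_ids():
    """Return {entry id: blog id} for every row currently in the table."""
    return dict(conn.execute("SELECT id, blog_id FROM blog_entry").fetchall())


# Like b.entry_set.add(e): point entry 234 at blog 1.
conn.execute("UPDATE blog_entry SET blog_id = ? WHERE id = ?", (1, 234))
after_add = blog_ids()

# Like b.entry_set.remove(e): detach entry 234 again.
conn.execute("UPDATE blog_entry SET blog_id = NULL WHERE id = ?", (234,))
after_remove = blog_ids()

# Like b.entry_set.clear(): detach every entry of blog 1; the rows stay.
conn.execute("UPDATE blog_entry SET blog_id = NULL WHERE blog_id = ?", (1,))
after_clear = blog_ids()

print(after_add, after_remove, after_clear)
```

Note that the rows are never deleted here, which matches the point that remove() and clear() only disassociate objects.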
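Many-to-Many Relationships
Both ends of a many-to-many relationship get automatic API access to the other end. The API works just like the reverse one-to-many relationship described in the previous section.
The only difference is in the attribute naming: the model that defines the ManyToManyField uses the attribute name of that field itself, whereas the model on the other end uses the lowercased name of the original model plus _set (just like reverse one-to-many relationships).
An example makes this easier to understand: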
e = Entry.objects.get(id=3)
e.authors.all() # Returns all Author objects for this Entry.
e.authors.count()
e.authors.filter(name__contains='John')
a = Author.objects.get(id=5)
a.entry_set.all() # Returns all Entry objects for this Author.
Like ForeignKey, ManyToManyField can specify related_name. In the preceding example, if the ManyToManyField in Entry had specified related_name='entries', then each Author instance would have an entries attribute instead of entry_set.
How Are the Backward Relationships Possible?
Other object-relational mappers require you to define relationships on both sides. The Django developers believe this is a violation of the DRY (Don't Repeat Yourself) principle, so Django requires you to define the relationship on only one end. But how is this possible, given that a model class doesn't know which other model classes are related to it until those other model classes are loaded?
The answer lies in the INSTALLED_APPS setting. The first time any model is loaded, Django iterates over every model in INSTALLED_APPS and creates the backward relationships in memory as needed. Essentially, one of the functions of INSTALLED_APPS is to tell Django the entire model domain.
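The idea can be sketched in a few lines of plain Python. This is only an illustration of deriving reverse accessors once every model is known, not Django's code: Relation and build_reverse_names are hypothetical, and the declared list stands in for whatever loading the apps in INSTALLED_APPS would discover.

```python
class Relation(object):
    """Hypothetical stand-in for a relationship declared on one model only."""

    def __init__(self, source, target):
        self.source = source  # model that declares the field, e.g. "Entry"
        self.target = target  # model it points at, e.g. "Blog"


def build_reverse_names(relations):
    """Map each target model to the default reverse accessor names it gains."""
    reverse = {}
    for rel in relations:
        accessor = rel.source.lower() + "_set"
        reverse.setdefault(rel.target, []).append(accessor)
    return reverse


# Pretend these were collected by importing every app in INSTALLED_APPS.
declared = [Relation("Entry", "Blog"), Relation("Entry", "Author")]
reverse_names = build_reverse_names(declared)
print(reverse_names)
```

The point of the sketch is that Blog and Author never mention Entry themselves; the reverse names can only be computed after all declaring models have been seen.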
Queries Over Related Objects
Queries involving related objects follow the same rules as queries involving normal value fields. When specifying the value for a query to match, you may use either an object instance itself or the primary key value for the object.
For example, if you have a Blog object b with id=5, the following three queries would be identical:
Entry.objects.filter(blog=b) # Query using object instance
Entry.objects.filter(blog=b.id) # Query using id from instance
Entry.objects.filter(blog=5) # Query using id directly
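A simple way to see why the three forms are interchangeable is to reduce each value to a primary key before comparing. The helper below is a hypothetical sketch in plain Python (the Blog class is a bare stand-in for a model instance), not the way Django itself resolves lookup values.

```python
class Blog(object):
    """Minimal stand-in for a model instance; only the primary key matters."""

    def __init__(self, id):
        self.id = id


def related_lookup_value(value):
    """Return the primary key to compare against, for an instance or a raw key."""
    return getattr(value, "id", value)


b = Blog(id=5)
values = [
    related_lookup_value(b),     # object instance
    related_lookup_value(b.id),  # id taken from the instance
    related_lookup_value(5),     # id given directly
]
print(values)
```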
Deleting Objects
The delete method, conveniently, is named delete(). This method immediately deletes the object and has no return value:
e.delete()
You can also delete objects in bulk. Every QuerySet has a delete() method, which deletes all members of that QuerySet. For example, this deletes all Entry objects with a pub_date year of 2005:
Entry.objects.filter(pub_date__year=2005).delete()
When Django deletes an object, it emulates the behavior of the SQL constraint ON DELETE CASCADE. In other words, any objects that had foreign keys pointing at the object to be deleted will be deleted along with it, for example:
b = Blog.objects.get(pk=1)
# This will delete the Blog and all of its Entry objects.
b.delete()
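For readers unfamiliar with the SQL constraint being emulated, here is a minimal sketch of ON DELETE CASCADE itself, using Python's sqlite3 module. The table names are assumptions for illustration, and the example shows the database-level behavior that Django mimics rather than what Django executes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when asked to.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE blog_blog (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE blog_entry (
        id INTEGER PRIMARY KEY,
        blog_id INTEGER NOT NULL REFERENCES blog_blog (id) ON DELETE CASCADE,
        headline TEXT
    );
    INSERT INTO blog_blog (id, name) VALUES (1, 'Beatles Blog'), (2, 'Cheddar Talk');
    INSERT INTO blog_entry (id, blog_id, headline) VALUES
        (1, 1, 'Lennon interview'),
        (2, 1, 'New album'),
        (3, 2, 'Aged cheddar');
""")

# Deleting the blog removes every entry whose foreign key points at it.
conn.execute("DELETE FROM blog_blog WHERE id = 1")
remaining = conn.execute("SELECT headline FROM blog_entry ORDER BY id").fetchall()
print(remaining)
```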
Note that delete() is the only QuerySet method that is not exposed on a Manager itself. This is a safety mechanism to prevent you from accidentally requesting Entry.objects.delete() and deleting all the entries. If you do want to delete all the objects, then you have to explicitly request a complete query set:
Entry.objects.all().delete()
Extra Instance Methods
In addition to save() and delete(), a model object might get any or all of the following methods.
get_FOO_display()
For every field that has choices set, the object will have a get_FOO_display() method, where FOO is the name of the field. This method returns the human-readable value of the field. For example, in the following model:
GENDER_CHOICES = (
    ('M', 'Male'),
    ('F', 'Female'),
)
class Person(models.Model):
    name = models.CharField(max_length=20)
    gender = models.CharField(max_length=1, choices=GENDER_CHOICES)
Each Person instance will have a get_gender_display() method:
>>> p = Person(name='John', gender='M')
>>> p.save()
>>> p.gender
'M'
>>> p.get_gender_display()
'Male'
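The lookup behind a method like get_gender_display() amounts to translating a stored value through the choices pairs. Here is a minimal sketch in plain Python; display_value is a hypothetical helper, and falling back to the raw value for an unlisted code is an assumption of this sketch.

```python
GENDER_CHOICES = (
    ("M", "Male"),
    ("F", "Female"),
)


def display_value(value, choices):
    """Return the human-readable label for value, or value itself if unlisted."""
    return dict(choices).get(value, value)


label = display_value("M", GENDER_CHOICES)
unknown = display_value("X", GENDER_CHOICES)
print(label, unknown)
```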
get_next_by_FOO(**kwargs) and get_previous_by_FOO(**kwargs)
For every DateField and DateTimeField that does not have null=True, the object will have get_next_by_FOO() and get_previous_by_FOO() methods, where FOO is the name of the field. These return the next and previous object with respect to the date field, raising the appropriate DoesNotExist exception when there is no such object.
Both methods accept optional keyword arguments, which should be in the format described in the "Field Lookups" section.
Note that in the case of identical date values, these methods will use the ID as a fallback check. This guarantees that no records are skipped or duplicated. For a full example, see the lookup API samples at http://www.djangoproject.com/documentation/0.96/models/lookup/.
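The ID fallback can be understood as ordering by the pair (date, id) instead of by the date alone. The following is a minimal sketch in plain Python over a list of dictionaries; next_by_date is a hypothetical helper, and LookupError stands in for the model's DoesNotExist exception.

```python
import datetime


def next_by_date(current, rows):
    """Return the row that follows current when ordering by (pub_date, id)."""
    key = (current["pub_date"], current["id"])
    later = [r for r in rows if (r["pub_date"], r["id"]) > key]
    if not later:
        raise LookupError("no later row")  # stand-in for DoesNotExist
    return min(later, key=lambda r: (r["pub_date"], r["id"]))


day = datetime.date(2005, 1, 1)
rows = [
    {"id": 1, "pub_date": day},
    {"id": 2, "pub_date": day},  # same date as id 1
    {"id": 3, "pub_date": datetime.date(2005, 1, 2)},
]

# Walk forward from the first row until nothing is left.
order = [rows[0]["id"]]
current = rows[0]
while True:
    try:
        current = next_by_date(current, rows)
    except LookupError:
        break
    order.append(current["id"])

print(order)
```

With the date alone as the key, the two rows sharing 2005-01-01 could not be told apart, so one of them would be skipped or visited twice; the ID breaks the tie.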
get_FOO_filename()
For every FileField, the object will have a get_FOO_filename() method, where FOO is the name of the field. This returns the full filesystem path to the file, according to your MEDIA_ROOT setting.
Note that ImageField is technically a subclass of FileField, so every model with an ImageField will also get this method.
get_FOO_url()
For every FileField, the object will have a get_FOO_url() method, where FOO is the name of the field. This returns the full URL to the file, according to your MEDIA_URL setting. If the value is blank, this method returns an empty string.
get_FOO_size()
For every FileField, the object will have a get_FOO_size() method, where FOO is the name of the field. This returns the size of the file, in bytes. (Behind the scenes, it uses os.path.getsize.)
save_FOO_file(filename, raw_contents)
For every FileField, the object will have a save_FOO_file() method, where FOO is the name of the field. This saves the given file to the filesystem, using the given file name. If a file with the given file name already exists, Django adds an underscore to the end of the file name (but before the extension) until the file name is available.
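The underscore rule is easy to express in a few lines. This is a minimal sketch in plain Python, not Django's code: available_name is a hypothetical helper, and the existing set stands in for the file names already present in the upload directory.

```python
import os


def available_name(filename, existing):
    """Append underscores before the extension until the name is unused."""
    root, ext = os.path.splitext(filename)
    while root + ext in existing:
        root += "_"
    return root + ext


taken = {"photo.jpg", "photo_.jpg"}
renamed = available_name("photo.jpg", taken)
untouched = available_name("notes.txt", taken)
print(renamed, untouched)
```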
get_FOO_height() and get_FOO_width()
For every ImageField, the object will have get_FOO_height() and get_FOO_width() methods, where FOO is the name of the field. These return the height (or width) of the image, as an integer, in pixels.
Shortcuts
As you develop views, you will discover a number of common idioms in the way you use the database API. Django encodes some of these idioms as shortcuts that can be used to simplify the process of writing views. These functions are in the django.shortcuts module.
get_object_or_404()
One common idiom is to use get() and raise Http404 if the object doesn't exist. This idiom is captured by get_object_or_404(). This function takes a Django model as its first argument and an arbitrary number of keyword arguments, which it passes to the default manager's get() function. It raises Http404 if the object doesn't exist, for example:
# Get the Entry with a primary key of 3
e = get_object_or_404(Entry, pk=3)
When you provide a model to this shortcut function, the default manager is used to execute the underlying get() query. If you don't want to use the default manager, or if you want to search a list of related objects, you can provide get_object_or_404() with a Manager object instead:
# Get the author of Entry instance e with a name of 'Fred'
a = get_object_or_404(e.authors, name='Fred')
# Use a custom manager 'recent_entries' in the search for an
# entry with a primary key of 3
e = get_object_or_404(Entry.recent_entries, pk=3)
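The idiom that this shortcut wraps can be sketched without Django at all. In the following plain-Python illustration, every class is a hypothetical stand-in (Http404 for django.http.Http404, DoesNotExist for a model's exception, FakeManager for a manager), and get_object_or_404_sketch shows only the try/except shape, not the real function.

```python
class Http404(Exception):
    """Stand-in for django.http.Http404."""


class DoesNotExist(Exception):
    """Stand-in for a model's DoesNotExist exception."""


class FakeManager(object):
    """Hypothetical manager over a list of dicts; only get() is modelled."""

    def __init__(self, rows):
        self.rows = rows

    def get(self, **kwargs):
        for row in self.rows:
            if all(row.get(k) == v for k, v in kwargs.items()):
                return row
        raise DoesNotExist(kwargs)


def get_object_or_404_sketch(manager, **kwargs):
    """Call manager.get() and translate a missing object into Http404."""
    try:
        return manager.get(**kwargs)
    except DoesNotExist:
        raise Http404("No object matches %r" % (kwargs,))


entries = FakeManager([{"pk": 3, "headline": "Hello"}])
found = get_object_or_404_sketch(entries, pk=3)

try:
    get_object_or_404_sketch(entries, pk=99)
    missing_raised = False
except Http404:
    missing_raised = True

print(found, missing_raised)
```

Passing a manager rather than a model, as in the examples above, fits this shape naturally: the function only needs something with a get() method.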
get_list_or_404()
get_list_or_404() behaves the same way as get_object_or_404(), except that it uses filter() instead of get(). It raises Http404 if the list is empty.
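Falling Back to Raw SQL
If you find yourself needing to write an SQL query that is too complex for Django's database mapper to handle, you can fall back to raw SQL statements.
The preferred way to do this is to give your model custom methods or custom manager methods that execute the queries. Although nothing in Django requires database queries to live in the model layer, this approach keeps all your data-access logic in one place, which is smart from a code-organization standpoint. For instructions, see Appendix B.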
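As an illustration of keeping raw SQL behind a single method, here is a minimal sketch using Python's sqlite3 module instead of Django's own connection. EntryQueries, entry_counts_by_year, and the blog_entry table layout are all assumptions; in a Django project the method would live on the model or a custom manager and obtain its cursor from django.db.connection.

```python
import sqlite3


class EntryQueries(object):
    """Hypothetical data-access class; plays the role of a custom manager."""

    def __init__(self, conn):
        self.conn = conn

    def entry_counts_by_year(self):
        """Run a hand-written aggregate query and return (year, count) pairs."""
        cursor = self.conn.execute(
            "SELECT strftime('%Y', pub_date) AS year, COUNT(*) "
            "FROM blog_entry GROUP BY year ORDER BY year"
        )
        return cursor.fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog_entry (id INTEGER PRIMARY KEY, headline TEXT, pub_date TEXT);
    INSERT INTO blog_entry (headline, pub_date) VALUES
        ('Hello', '2005-01-01'),
        ('Again', '2005-06-15'),
        ('Later', '2006-03-02');
""")

counts = EntryQueries(conn).entry_counts_by_year()
print(counts)
```

Callers see only the method name, so the SQL can later be rewritten or replaced without touching the views that use it.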
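Finally, it's important to note that the Django database layer is merely an interface to your database. You can access your database via other tools, programming languages, or database frameworks; there's nothing Django-specific about your database.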