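Appendix C: Database API Reference

Django's database API is the other half of the model API discussed in Appendix B. Once you've defined a model, you'll use this API any time you need to access the database. You've seen examples of this API in use throughout the book; this appendix explains the various options in detail.

Like the model APIs discussed in Appendix B, these APIs are considered very stable, but the Django developers consistently add new shortcuts and conveniences. It's a good idea to always check the latest documentation online, available at http://www.djangoproject.com/documentation/0.96/db-api/.

Throughout this reference, we'll refer to the following models, which might form a simple blog application: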

from django.db import models

class Blog(models.Model):
    name = models.CharField(max_length=100)
    tagline = models.TextField()

    def __str__(self):
        return self.name

class Author(models.Model):
    name = models.CharField(max_length=50)
    email = models.EmailField()

    def __str__(self):
        return self.name

class Entry(models.Model):
    blog = models.ForeignKey(Blog)
    headline = models.CharField(max_length=255)
    body_text = models.TextField()
    pub_date = models.DateTimeField()
    authors = models.ManyToManyField(Author)

    def __str__(self):
        return self.headline

Creating Objects

To create an object, instantiate it using keyword arguments to the model class, and then call save() to save it to the database:

>>> from mysite.blog.models import Blog

>>> b = Blog(name='Beatles Blog', tagline='All the latest Beatles news.')

>>> b.save()

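This performs an INSERT SQL statement behind the scenes. Django doesn't hit the database until you explicitly call save().

The save() method has no return value.

To create an object and save it all in one step, see the create manager method, discussed later in this appendix.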

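What Happens When You Save?

When you save an object, Django performs the following steps:

Emit a pre-save signal. This provides a notification that an object is about to be saved. You can register a listener that will be invoked whenever this signal is emitted. At the time of publication, these signals were still in development and undocumented; check the online documentation for the latest information.

Preprocess the data. Each field on the object is asked to perform any automated data modification that the field may need.

Most fields do no preprocessing; the field data is kept as is. Preprocessing is used only on fields that have special behavior, such as file fields.

Prepare the data for the database. Each field is asked to provide its current value in a data type that can be written to the database.

Most fields require no data preparation. Simple data types, such as integers and strings, are ready to write as Python objects. However, more complex data types often require some modification. For example, DateFields use a Python datetime object to store data. Databases don't store datetime objects, so the field value must be converted into an ISO-compliant date string before it is inserted into the database.

Insert the data into the database. The preprocessed, prepared data is then composed into an SQL statement for insertion into the database.

Emit a post-save signal. As with the pre-save signal, this provides a notification that an object has been successfully saved. Again, these signals are not yet documented.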

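Autoincrementing Primary Keys

For convenience, each model is given an autoincrementing primary key field named id unless you explicitly specify primary_key=True on a field (see the section titled AutoField in Appendix B).

If your model has an AutoField, that autoincremented value will be calculated and saved as an attribute on your object the first time you call save():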

>>> b2 = Blog(name='Cheddar Talk', tagline='Thoughts on cheese.')

>>> b2.id     # Returns None, because b2 doesn't have an ID yet.

None

 

>>> b2.save()

>>> b2.id     # Returns the ID of your new object.

14

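There's no way to tell what the value of an ID will be before you call save(), because that value is calculated by your database, not by Django.

If a model has an AutoField but you want to define a new object's ID explicitly when saving, just assign it before saving rather than relying on the automatic assignment of the ID: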

>>> b3 = Blog(id=3, name='Cheddar Talk', tagline='Thoughts on cheese.')

>>> b3.id

3

>>> b3.save()

>>> b3.id

3

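If you assign auto-primary-key values manually, make sure not to use a primary key value that already exists! If you create a new object with an explicit primary key value that already exists in the database, Django will assume you're changing the existing record rather than creating a new one.

Given the preceding 'Cheddar Talk' blog example, this example would override the previous record in the database: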

>>> b4 = Blog(id=3, name='Not Cheddar', tagline='Anything but cheese.')

>>> b4.save()  # Overrides the previous blog with ID=3!

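Specifying an auto-primary-key value explicitly is mostly useful for bulk-saving objects, when you're confident you won't have a primary key collision.

Saving Changes to Objects

To save changes to an object that's already in the database, use save().

Given a Blog instance b5 that has already been saved to the database, this example changes its name and updates its record in the database: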

>>> b5.name = 'New name'

>>> b5.save()

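This performs an UPDATE SQL statement behind the scenes. Again, Django doesn't hit the database until you explicitly call save().

How Django Knows When to UPDATE and When to INSERT

You may have noticed that Django database objects use the same save() method for creating and changing objects. Django abstracts the need to use INSERT or UPDATE SQL statements. Specifically, when you call save(), Django follows this algorithm:

* If the object's primary key attribute is set to a value that evaluates to True (i.e., a value other than None or the empty string), Django executes a SELECT query to determine whether a record with the given primary key already exists.

* If a record with the given primary key already exists, Django executes an UPDATE query.

* If the object's primary key attribute is not set, or if it's set but no record with that primary key exists, Django executes an INSERT query.

Because of this, you should be careful not to specify a primary key value explicitly when saving new objects if you cannot guarantee the primary key value is unused.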
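The decision rule above can be sketched outside Django with Python's standard sqlite3 module. This is only an illustration under stated assumptions, not Django's actual implementation: the save_row helper, the blog table layout, and the use of SQLite's ? placeholders are all hypothetical choices made for this example.

```python
import sqlite3


def save_row(conn, pk, name, tagline):
    """Hypothetical helper: UPDATE if a row with this pk exists, else INSERT."""
    cur = conn.cursor()
    # A pk of None or '' counts as "not set", as in the rule above.
    if pk:
        cur.execute("SELECT 1 FROM blog WHERE id = ?", (pk,))
        if cur.fetchone() is not None:
            cur.execute(
                "UPDATE blog SET name = ?, tagline = ? WHERE id = ?",
                (name, tagline, pk),
            )
            conn.commit()
            return pk, "UPDATE"
    # Either no pk was given or no matching row exists: insert a new row.
    # Passing None for id lets SQLite assign the next automatic value.
    cur.execute(
        "INSERT INTO blog (id, name, tagline) VALUES (?, ?, ?)",
        (pk or None, name, tagline),
    )
    conn.commit()
    return cur.lastrowid, "INSERT"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blog (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT, tagline TEXT)"
)
first = save_row(conn, None, "Beatles Blog", "All the latest Beatles news.")
second = save_row(conn, 3, "Cheddar Talk", "Thoughts on cheese.")
third = save_row(conn, 3, "Not Cheddar", "Anything but cheese.")
```

The third call reuses primary key 3, so it takes the UPDATE branch and silently replaces the 'Cheddar Talk' row, which is the collision the paragraph above warns about.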

Updating ForeignKey fields works exactly the same way; simply assign an object of the right type to the field in question:

>>> cheese_blog = Blog.objects.get(name="Cheddar Talk")

>>> entry.blog = cheese_blog

>>> entry.save()

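Django will complain if you try to assign an object of the wrong type.

Retrieving Objects

Throughout the book you've seen objects retrieved using code like the following: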

>>> entries = Entry.objects.filter(authors__name__contains="Joe")

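There are quite a few moving parts behind the scenes here: when you retrieve objects from the database, you're actually constructing a QuerySet using the model's Manager. This QuerySet knows how to execute SQL and return the requested objects.

Appendix B looked at both of these objects from a model-definition point of view; now we'll look at how they operate.

A QuerySet represents a collection of objects from your database. It can have zero, one, or many filters: criteria that narrow down the collection based on given parameters. In SQL terms, a QuerySet equates to a SELECT statement, and a filter is a limiting clause such as WHERE or LIMIT.

You get a QuerySet by using your model's Manager. Each model has at least one Manager, and it's called objects by default. Access it directly via the model class, like so: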

>>> Blog.objects

<django.db.models.manager.Manager object at 0x137d00d>

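To enforce a separation between table-level operations and record-level operations, Managers are accessible only via model classes, not from model instances: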

>>> b = Blog(name='Foo', tagline='Bar')

>>> b.objects

Traceback (most recent call last):

  File "<stdin>", line 1, in <module>

AttributeError: Manager isn't accessible via Blog instances.

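The Manager is the main source of QuerySets for a model. It acts as a root QuerySet that describes all objects in the model's database table. For example, Blog.objects is the initial QuerySet that contains all Blog objects in the database.

Caching and QuerySets

Each QuerySet contains a cache, to minimize database access. It's important to understand how it works in order to write efficient code.

In a newly created QuerySet, this cache is empty. The first time a QuerySet is evaluated, and hence a database query happens, Django saves the query results in the QuerySet's cache and returns the results that have been explicitly requested (e.g., the next element, if the QuerySet is being iterated over). Subsequent evaluations of the QuerySet reuse the cached results.

Keep this caching behavior in mind, because it may bite you if you don't use your QuerySets correctly. For example, the following will create two QuerySets, evaluate them, and throw them away: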

print [e.headline for e in Entry.objects.all()]

print [e.pub_date for e in Entry.objects.all()]

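That means the same database query will be executed twice, effectively doubling your database load. Also, there's a possibility the two lists may not include the same database records, because an Entry may have been added or deleted in the split second between the two requests.

To avoid this problem, simply save the QuerySet and reuse it: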

queryset = Entry.objects.all()

print [e.headline for e in queryset] # Evaluate the query set.

print [e.pub_date for e in queryset] # Reuse the cache from the evaluation.
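The effect of the cache can be imitated in plain Python. The sketch below is not Django code: CachedResults and fake_query are hypothetical stand-ins, and the counter simply records how often the pretend database is hit.

```python
class CachedResults(object):
    """Hypothetical stand-in for a QuerySet: runs its query once, then reuses the cache."""

    def __init__(self, run_query):
        self._run_query = run_query   # callable that "hits the database"
        self._cache = None            # empty until the first evaluation

    def __iter__(self):
        if self._cache is None:
            self._cache = list(self._run_query())
        return iter(self._cache)


query_count = [0]  # a mutable counter that fake_query can update


def fake_query():
    query_count[0] += 1
    return [("Man bites dog", 2005), ("Lennon honored", 2006)]


# Two separate result sets: the query runs twice.
headlines = [row[0] for row in CachedResults(fake_query)]
years = [row[1] for row in CachedResults(fake_query)]
separate_hits = query_count[0]

# One saved result set, reused: the query runs only once more.
saved = CachedResults(fake_query)
headlines_again = [row[0] for row in saved]
years_again = [row[1] for row in saved]
reused_hits = query_count[0] - separate_hits
```

Here separate_hits counts the two throwaway evaluations, while reused_hits shows that iterating the saved object a second time costs no further query.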

Filtering Objects

The simplest way to retrieve objects from a table is to get all of them. To do this, use the all() method on a Manager:

>>> Entry.objects.all()

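The all() method returns a QuerySet of all the objects in the database.

Usually, though, you'll need to select only a subset of the complete set of objects. To create such a subset, you refine the initial QuerySet, adding filter conditions. You'll usually do this using the filter() and exclude() methods: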

>>> y2006 = Entry.objects.filter(pub_date__year=2006)

>>> not2006 = Entry.objects.exclude(pub_date__year=2006)

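filter() and exclude() both take field lookup arguments, which are discussed in detail shortly.

Chaining Filters

The result of refining a QuerySet is itself a QuerySet, so it's possible to chain refinements together, for example: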

>>> qs = Entry.objects.filter(headline__startswith='What')

>>> qs = qs.exclude(pub_date__gte=datetime.datetime.now())

>>> qs = qs.filter(pub_date__gte=datetime.datetime(2005, 1, 1))

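This takes the initial QuerySet of all entries in the database, adds a filter, then an exclusion, and then another filter. The final result is a QuerySet containing all entries with a headline that starts with "What" that were published between January 1, 2005, and the current day.

It's important to point out here that QuerySets are lazy: the act of creating a QuerySet doesn't involve any database activity. In fact, the three preceding lines don't make any database calls; you can chain as many filters as you like, and Django won't actually run the query until the QuerySet is evaluated.

You can evaluate a QuerySet in any of the following ways:

Iterating: A QuerySet is iterable, and it executes its database query the first time you iterate over it. For example, the following QuerySet isn't evaluated until it's iterated over in the for loop: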

qs = Entry.objects.filter(pub_date__year=2006)

qs = qs.filter(headline__icontains="bill")

for e in qs:

    print e.headline

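This prints all headlines from 2006 that contain "bill" but causes only one database hit.

Printing it: A QuerySet is evaluated when you call repr() on it. This is for convenience in the Python interactive interpreter, so you can immediately see your results when using the API interactively.

Slicing: As explained in the upcoming Limiting QuerySets section, a QuerySet can be sliced using Python's array-slicing syntax. Usually slicing a QuerySet returns another (unevaluated) QuerySet, but Django will execute the database query if you use the step parameter of slice syntax.

Converting to a list: You can force evaluation of a QuerySet by calling list() on it, for example: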

>>> entry_list = list(Entry.objects.all())

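Be warned, though, that this could have a large memory overhead, because Django will load each element of the list into memory. In contrast, iterating over a QuerySet will take advantage of your database to load data and instantiate objects only as you need them.

Filtered QuerySets Are Unique

Each time you refine a QuerySet, you get a brand-new QuerySet that is in no way bound to the previous QuerySet. Each refinement creates a separate and distinct QuerySet that can be stored, used, and reused: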

q1 = Entry.objects.filter(headline__startswith="What")

q2 = q1.exclude(pub_date__gte=datetime.datetime.now())

q3 = q1.filter(pub_date__gte=datetime.datetime.now())

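These three QuerySets are separate. The first is a base QuerySet containing all entries whose headline starts with "What". The second is a subset of the first, with an additional criterion that excludes records whose pub_date is on or after the current time. The third is a subset of the first, with an additional criterion that selects only the records whose pub_date is on or after the current time. The initial QuerySet (q1) is unaffected by the refinement process.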
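Both properties described above, laziness and independence of refinements, can be illustrated with a small pure-Python sketch. LazyQuery is a hypothetical class written for this example and says nothing about how Django builds its SQL; the rows are plain dictionaries standing in for entries.

```python
class LazyQuery(object):
    """Hypothetical sketch: each refinement returns a new object; nothing runs until iteration."""

    def __init__(self, rows, predicates=()):
        self._rows = rows
        self._predicates = tuple(predicates)

    def filter(self, predicate):
        # Build a new object instead of modifying this one.
        return LazyQuery(self._rows, self._predicates + (predicate,))

    def exclude(self, predicate):
        return self.filter(lambda row: not predicate(row))

    def __iter__(self):
        # Evaluation happens only here.
        for row in self._rows:
            if all(p(row) for p in self._predicates):
                yield row


rows = [
    {"headline": "What now?", "year": 2004},
    {"headline": "What next?", "year": 2006},
    {"headline": "Who knew?", "year": 2006},
]
q1 = LazyQuery(rows).filter(lambda r: r["headline"].startswith("What"))
q2 = q1.exclude(lambda r: r["year"] >= 2005)
q3 = q1.filter(lambda r: r["year"] >= 2005)
```

Because filter() and exclude() return new objects, deriving q2 and q3 leaves q1 with its single original predicate.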

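Limiting QuerySets

Use Python's array-slicing syntax to limit your QuerySet to a certain number of results. This is the equivalent of SQL's LIMIT and OFFSET clauses.

For example, this returns the first five entries (LIMIT 5):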

>>> Entry.objects.all()[:5]

This returns the sixth through tenth entries (OFFSET 5 LIMIT 5):

>>> Entry.objects.all()[5:10]

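Generally, slicing a QuerySet returns a new QuerySet; it doesn't evaluate the query. An exception is if you use the step parameter of Python slice syntax. For example, this would actually execute the query in order to return a list of every second object of the first ten: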

>>> Entry.objects.all()[:10:2]

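To retrieve a single object rather than a list (e.g., SELECT foo FROM bar LIMIT 1), use a simple index instead of a slice. For example, this returns the first Entry in the database, after ordering entries alphabetically by headline: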

>>> Entry.objects.order_by('headline')[0]

This is roughly equivalent to the following:

>>> Entry.objects.order_by('headline')[0:1].get()

Note, however, that if no objects match the given criteria, the first of these will raise IndexError, while the second will raise DoesNotExist.
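The correspondence between slice bounds and LIMIT/OFFSET is simple arithmetic, sketched below. The limit_offset function is a hypothetical helper for illustration only; it assumes a slice without a step and with an explicit upper bound, and the exact SQL Django emits may differ by database.

```python
def limit_offset(start, stop):
    """Hypothetical helper: SQL fragment for the slice [start:stop] (stop is required)."""
    start = start or 0
    # The number of rows is the distance between the two bounds.
    clause = "LIMIT %d" % (stop - start)
    if start:
        # Rows before the lower bound are skipped.
        clause += " OFFSET %d" % start
    return clause


first_five = limit_offset(None, 5)      # like [:5]
sixth_to_tenth = limit_offset(5, 10)    # like [5:10]
first_only = limit_offset(0, 1)         # like [0:1]
```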

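QuerySet Methods That Return New QuerySets

Django provides a range of QuerySet refinement methods that modify either the types of results returned by the QuerySet or the way its SQL query is executed. These methods are described in the sections that follow. Some of the methods take field lookup arguments, which are discussed in detail a bit later on.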

filter(**lookup)

Returns a new QuerySet containing objects that match the given lookup parameters.

exclude(**kwargs)

Returns a new QuerySet containing objects that do not match the given lookup parameters.

order_by(*fields)

By default, results returned by a QuerySet are ordered by the ordering tuple given by the ordering option in the model's metadata (see Appendix B). You can override this for a particular query using the order_by() method:

>>> Entry.objects.filter(pub_date__year=2005).order_by('-pub_date', 'headline')

This result will be ordered by pub_date descending, then by headline ascending. The negative sign in front of "-pub_date" indicates descending order. Ascending order is assumed if the - is absent. To order randomly, use "?", like so:

>>> Entry.objects.order_by('?')

distinct()

Returns a new QuerySet that uses SELECT DISTINCT in its SQL query. This eliminates duplicate rows from the query results.

By default, a QuerySet will not eliminate duplicate rows. In practice, this is rarely a problem, because simple queries such as Blog.objects.all() don't introduce the possibility of duplicate result rows.

However, if your query spans multiple tables, it's possible to get duplicate results when a QuerySet is evaluated. That's when you'd use distinct().
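To see how a query that spans tables produces duplicates, here is a small sketch using Python's standard sqlite3 module rather than Django. The blog and entry tables and their contents are assumptions made up for the example; the SQL is hand-written, not what Django would generate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE entry (id INTEGER PRIMARY KEY, blog_id INTEGER, headline TEXT);
    INSERT INTO blog VALUES (1, 'Beatles Blog');
    INSERT INTO entry VALUES (1, 1, 'Lennon honored');
    INSERT INTO entry VALUES (2, 1, 'Lennon remembered');
""")

# One blog joined to two matching entries yields two result rows.
base = (
    "FROM blog JOIN entry ON entry.blog_id = blog.id "
    "WHERE entry.headline LIKE '%Lennon%'"
)
plain_rows = conn.execute("SELECT blog.name " + base).fetchall()
distinct_rows = conn.execute("SELECT DISTINCT blog.name " + base).fetchall()
```

The same blog appears once per matching entry in plain_rows, while SELECT DISTINCT collapses the repeats into a single row.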

values(*fields)

Returns a special QuerySet that evaluates to a list of dictionaries instead of model-instance objects. Each of those dictionaries represents an object, with the keys corresponding to the attribute names of model objects:

# This list contains a Blog object.

>>> Blog.objects.filter(name__startswith='Beatles')

[Beatles Blog]

 

# This list contains a dictionary.

>>> Blog.objects.filter(name__startswith='Beatles').values()

[{'id': 1, 'name': 'Beatles Blog', 'tagline': 'All the latest Beatles news.'}]

values() takes optional positional arguments, *fields, which specify field names to which the SELECT should be limited. If you specify the fields, each dictionary will contain only the field keys/values for the fields you specify. If you don't specify the fields, each dictionary will contain a key and value for every field in the database table:

>>> Blog.objects.values()

[{'id': 1, 'name': 'Beatles Blog', 'tagline': 'All the latest Beatles news.'}]

>>> Blog.objects.values('id', 'name')

[{'id': 1, 'name': 'Beatles Blog'}]

This method is useful when you know you're only going to need values from a small number of the available fields and you won't need the functionality of a model instance object. It's more efficient to select only the fields you need to use.

dates(field, kind, order)

Returns a special QuerySet that evaluates to a list of datetime.datetime objects representing all available dates of a particular kind within the contents of the QuerySet .

The field argument must be the name of a DateField or DateTimeField of your model. The kind argument must be either "year", "month", or "day". Each datetime.datetime object in the result list is truncated to the given type:

* "year" returns a list of all distinct year values for the field.

* "month" returns a list of all distinct year/month values for the field.

* "day" returns a list of all distinct year/month/day values for the field.

order , which defaults to 'ASC' , should be either 'ASC' or 'DESC' . This specifies how to order the results.

Here are a few examples:

>>> Entry.objects.dates('pub_date', 'year')

[datetime.datetime(2005, 1, 1)]

 

>>> Entry.objects.dates('pub_date', 'month')

[datetime.datetime(2005, 2, 1), datetime.datetime(2005, 3, 1)]

 

>>> Entry.objects.dates('pub_date', 'day')

[datetime.datetime(2005, 2, 20), datetime.datetime(2005, 3, 20)]

 

>>> Entry.objects.dates('pub_date', 'day', order='DESC')

[datetime.datetime(2005, 3, 20), datetime.datetime(2005, 2, 20)]

 

>>> Entry.objects.filter(headline__contains='Lennon').dates('pub_date', 'day')

[datetime.datetime(2005, 3, 20)]
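The truncation that dates() performs can be mimicked in plain Python, which may help when reading the results above. The truncated_dates function and the sample pub_dates list are hypothetical; Django does this work in the database, not in Python.

```python
import datetime


def truncated_dates(values, kind, order="ASC"):
    """Hypothetical helper: distinct datetimes truncated to 'year', 'month', or 'day'."""
    def truncate(value):
        if kind == "year":
            return datetime.datetime(value.year, 1, 1)
        if kind == "month":
            return datetime.datetime(value.year, value.month, 1)
        if kind == "day":
            return datetime.datetime(value.year, value.month, value.day)
        raise ValueError("kind must be 'year', 'month', or 'day'")

    # A set removes duplicates; sorting applies the requested order.
    distinct = set(truncate(v) for v in values)
    return sorted(distinct, reverse=(order == "DESC"))


pub_dates = [
    datetime.datetime(2005, 2, 20, 9, 30),
    datetime.datetime(2005, 3, 20, 18, 0),
    datetime.datetime(2005, 3, 20, 23, 59),
]
by_year = truncated_dates(pub_dates, "year")
by_month = truncated_dates(pub_dates, "month")
by_day_desc = truncated_dates(pub_dates, "day", order="DESC")
```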

select_related()

Returns a QuerySet that will automatically follow foreign key relationships, selecting that additional related-object data when it executes its query. This is a performance booster that results in (sometimes much) larger queries but means later use of foreign key relationships won't require database queries.

The following examples illustrate the difference between plain lookups and select_related() lookups. Here's a standard lookup:

# Hits the database.

>>> e = Entry.objects.get(id=5)

 

# Hits the database again to get the related Blog object.

>>> b = e.blog

And here's a select_related lookup:

# Hits the database.

>>> e = Entry.objects.select_related().get(id=5)

 

# Doesn't hit the database, because e.blog has been prepopulated

# in the previous query.

>>> b = e.blog

select_related() follows foreign keys as far as possible. If you have the following models:

class City(models.Model):

    # ...

 

class Person(models.Model):

    # ...

    hometown = models.ForeignKey(City)

 

class Book(models.Model):

    # ...

    author = models.ForeignKey(Person)

then a call to Book.objects.select_related().get(id=4) will cache the related Person and the related City :

>>> b = Book.objects.select_related().get(id=4)

>>> p = b.author         # Doesn't hit the database.

>>> c = p.hometown       # Doesn't hit the database.

 

>>> b = Book.objects.get(id=4) # No select_related() in this example.

>>> p = b.author         # Hits the database.

>>> c = p.hometown       # Hits the database.

Note that select_related() does not follow foreign keys that have null=True .

Usually, using select_related() can vastly improve performance because your application can avoid many database calls. However, in situations with deeply nested sets of relationships, select_related() can sometimes end up following too many relations and can generate queries so large that they end up being slow.

extra()

Sometimes, the Django query syntax by itself can't easily express a complex WHERE clause. For these edge cases, Django provides the extra() QuerySet modifier, a hook for injecting specific clauses into the SQL generated by a QuerySet.

By definition, these extra lookups may not be portable to different database engines (because you're explicitly writing SQL code) and violate the DRY principle, so you should avoid them if possible.

Specify one or more of params , select , where , or tables . None of the arguments is required, but you should use at least one of them.

The select argument lets you put extra fields in the SELECT clause. It should be a dictionary mapping attribute names to SQL clauses to use to calculate that attribute:

>>> Entry.objects.extra(select={'is_recent': "pub_date > '2006-01-01'"})

As a result, each Entry object will have an extra attribute, is_recent, a Boolean representing whether the entry's pub_date is greater than January 1, 2006.

The next example is more advanced; it does a subquery to give each resulting Blog object an entry_count attribute, an integer count of associated Entry objects:

>>> subq = 'SELECT COUNT(*) FROM blog_entry WHERE blog_entry.blog_id = blog_blog.id'

>>> Blog.objects.extra(select={'entry_count': subq})

(In this particular case, we're exploiting the fact that the query will already contain the blog_blog table in its FROM clause.)

You can define explicit SQL WHERE clauses (perhaps to perform nonexplicit joins) by using where. You can manually add tables to the SQL FROM clause by using tables.

where and tables both take a list of strings. All where parameters are ANDed to any other search criteria:

>>> Entry.objects.extra(where=['id IN (3, 4, 5, 20)'])

The select and where parameters described previously may use standard Python database string placeholders: '%s' to indicate parameters the database engine should automatically quote. The params argument is a list of any extra parameters to be substituted:

>>> Entry.objects.extra(where=['headline=%s'], params=['Lennon'])

Always use params instead of embedding values directly into select or where because params will ensure values are quoted correctly according to your particular database.

Here's an example of the wrong way:

Entry.objects.extra(where=["headline='%s'" % name])

Here's an example of the correct way:

Entry.objects.extra(where=['headline=%s'], params=[name])
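The reason for this rule shows up as soon as a value contains a quote character. The sketch below uses Python's standard sqlite3 module instead of Django, so it is only an analogy: the entry table is made up for the example, and SQLite's driver uses ? as its placeholder where the text above uses %s.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (headline TEXT)")
conn.execute("INSERT INTO entry VALUES (?)", ("O'Brien wins",))

name = "O'Brien wins"

# Wrong way: the apostrophe in name ends the SQL string literal early.
try:
    conn.execute("SELECT headline FROM entry WHERE headline = '%s'" % name)
    interpolation_failed = False
except sqlite3.OperationalError:
    interpolation_failed = True

# Correct way: the driver passes the value separately from the SQL text.
matches = conn.execute(
    "SELECT headline FROM entry WHERE headline = ?", (name,)
).fetchall()
```

A syntax error is the mild outcome of the wrong way; with a carefully crafted value, the same mistake allows SQL injection.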

QuerySet Methods That Do Not Return QuerySets

The following QuerySet methods evaluate the QuerySet and return something other than a QuerySet: a single object, value, and so forth.

get(**lookup)

Returns the object matching the given lookup parameters, which should be in the format described in the Field Lookups section. This raises AssertionError if more than one object was found.

get() raises a DoesNotExist exception if an object wasn't found for the given parameters. The DoesNotExist exception is an attribute of the model class, for example:

>>> Entry.objects.get(id='foo') # raises Entry.DoesNotExist

The DoesNotExist exception inherits from django.core.exceptions.ObjectDoesNotExist, so you can target multiple DoesNotExist exceptions:

>>> from django.core.exceptions import ObjectDoesNotExist

>>> try:

...     e = Entry.objects.get(id=3)

...     b = Blog.objects.get(id=1)

... except ObjectDoesNotExist:

...     print "Either the entry or blog doesn't exist."

create(**kwargs)

This is a convenience method for creating an object and saving it all in one step. It lets you compress two common steps:

>>> p = Person(first_name="Bruce", last_name="Springsteen")

>>> p.save()

into a single line:

>>> p = Person.objects.create(first_name="Bruce", last_name="Springsteen")

get_or_create(**kwargs)

This is a convenience method for looking up an object and creating one if it doesn't exist. It returns a tuple of (object, created), where object is the retrieved or created object and created is a Boolean specifying whether a new object was created.

This method is meant as a shortcut to boilerplate code and is mostly useful for data-import scripts, for example:

try:

    obj = Person.objects.get(first_name='John', last_name='Lennon')

except Person.DoesNotExist:

    obj = Person(first_name='John', last_name='Lennon', birthday=date(1940, 10, 9))

    obj.save()

This pattern gets quite unwieldy as the number of fields in a model increases. The previous example can be rewritten using get_or_create() like so:

obj, created = Person.objects.get_or_create(

    first_name = 'John',

    last_name  = 'Lennon',

    defaults   = {'birthday': date(1940, 10, 9)}

)

Any keyword arguments passed to get_or_create() (except an optional one called defaults) will be used in a get() call. If an object is found, get_or_create() returns a tuple of that object and False. If an object is not found, get_or_create() will instantiate and save a new object, returning a tuple of the new object and True. The new object will be created according to this algorithm:

defaults = kwargs.pop('defaults', {})

params = dict([(k, v) for k, v in kwargs.items() if '__' not in k])

params.update(defaults)

obj = self.model(**params)

obj.save()

In English, that means start with any non-'defaults' keyword argument that doesn't contain a double underscore (which would indicate a nonexact lookup). Then add the contents of defaults, overriding any keys if necessary, and use the result as the keyword arguments to the model class.
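The algorithm can be tried out without a database by letting a list of dictionaries play the role of the table. This is a hypothetical in-memory sketch, not Django's code: the store argument and the restriction to exact lookups are assumptions made to keep the example small.

```python
def get_or_create(store, **kwargs):
    """Hypothetical in-memory version; store is a list of dicts, lookups are exact only."""
    defaults = kwargs.pop("defaults", {})
    # The "get" half: return the first stored object matching every lookup.
    for obj in store:
        if all(obj.get(key) == value for key, value in kwargs.items()):
            return obj, False
    # Not found: build the new object as in the algorithm above.
    params = dict((k, v) for k, v in kwargs.items() if "__" not in k)
    params.update(defaults)
    store.append(params)
    return params, True


people = []
first = get_or_create(people, first_name="John", last_name="Lennon",
                      defaults={"birthday": "1940-10-09"})
second = get_or_create(people, first_name="John", last_name="Lennon",
                       defaults={"birthday": "1900-01-01"})
```

The second call finds the existing record, so its defaults are ignored and the stored birthday is left unchanged.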

If you have a field named defaults and want to use it as an exact lookup in get_or_create(), just use 'defaults__exact' like so:

Foo.objects.get_or_create(

    defaults__exact = 'bar',

    defaults={'defaults': 'baz'}

)

Note

As mentioned earlier, get_or_create() is mostly useful in scripts that need to parse data and create new records if existing ones aren't available. But if you need to use get_or_create() in a view, please make sure to use it only in POST requests unless you have a good reason not to. GET requests shouldn't have any effect on data; use POST whenever a request to a page has a side effect on your data.

count()

Returns an integer representing the number of objects in the database matching the QuerySet. count() never raises exceptions. Here's an example:

# Returns the total number of entries in the database.

>>> Entry.objects.count()

4

 

# Returns the number of entries whose headline contains 'Lennon'

>>> Entry.objects.filter(headline__contains='Lennon').count()

1

count() performs a SELECT COUNT(*) behind the scenes, so you should always use count() rather than loading all of the records into Python objects and calling len() on the result.

Depending on which database you're using (e.g., PostgreSQL or MySQL), count() may return a long integer instead of a normal Python integer. This is an underlying implementation quirk that shouldn't pose any real-world problems.

in_bulk(id_list)

Takes a list of primary key values and returns a dictionary mapping each primary key value to an instance of the object with the given ID, for example:

>>> Blog.objects.in_bulk([1])

{1: Beatles Blog}

>>> Blog.objects.in_bulk([1, 2])

{1: Beatles Blog, 2: Cheddar Talk}

>>> Blog.objects.in_bulk([])

{}

IDs of objects that don't exist are silently dropped from the result dictionary. If you pass in_bulk() an empty list, you'll get an empty dictionary.

latest(field_name=None)

Returns the latest object in the table, by date, using the field_name provided as the date field. This example returns the latest Entry in the table, according to the pub_date field:

>>> Entry.objects.latest('pub_date')

If your model's Meta specifies get_latest_by, you can leave off the field_name argument to latest(). Django will use the field specified in get_latest_by by default.

Like get(), latest() raises DoesNotExist if an object doesn't exist with the given parameters.

Field Lookups

Field lookups are how you specify the meat of an SQL WHERE clause. They're specified as keyword arguments to the QuerySet methods filter(), exclude(), and get().

Basic lookup keyword arguments take the form field__lookuptype=value (note the double underscore). For example:

>>> Entry.objects.filter(pub_date__lte='2006-01-01')

translates (roughly) into the following SQL:

SELECT * FROM blog_entry WHERE pub_date <= '2006-01-01';

If you pass an invalid keyword argument, a lookup function will raise TypeError .

The supported lookup types follow.

exact

Performs an exact match:

>>> Entry.objects.get(headline__exact="Man bites dog")

This matches any object with the exact headline "Man bites dog".

If you don't provide a lookup type (that is, if your keyword argument doesn't contain a double underscore), the lookup type is assumed to be exact.

For example, the following two statements are equivalent:

>>> Blog.objects.get(id__exact=14) # Explicit form

>>> Blog.objects.get(id=14) # __exact is implied

This is for convenience, because exact lookups are the common case.

iexact

Performs a case-insensitive exact match:

>>> Blog.objects.get(name__iexact='beatles blog')

This will match 'Beatles Blog' , 'beatles blog' , 'BeAtLes BLoG' , and so forth.

contains

Performs a case-sensitive containment test:

Entry.objects.get(headline__contains='Lennon')

This will match the headline 'Today Lennon honored' but not 'today lennon honored' .

SQLite doesn't support case-sensitive LIKE statements; when using SQLite, contains acts like icontains.

Escaping Percent Signs and Underscores in LIKE Statements

The field lookups that equate to LIKE SQL statements (iexact, contains, icontains, startswith, istartswith, endswith, and iendswith) will automatically escape the two special characters used in LIKE statements: the percent sign and the underscore. (In a LIKE statement, the percent sign signifies a multiple-character wildcard and the underscore signifies a single-character wildcard.)

This means things should work intuitively, so the abstraction doesn't leak. For example, to retrieve all the entries that contain a percent sign, just use the percent sign as any other character:

Entry.objects.filter(headline__contains='%')

Django takes care of the quoting for you. The resulting SQL will look something like this:

SELECT ... WHERE headline LIKE '%\%%';

The same goes for underscores. Both percentage signs and underscores are handled for you transparently.
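What such escaping involves can be sketched with Python's standard sqlite3 module. This is an illustration under stated assumptions, not Django's implementation: escape_like and headlines_containing are hypothetical helpers, the entry table is made up, and SQLite needs an explicit ESCAPE clause to treat the backslash as the escape character.

```python
import sqlite3


def escape_like(value):
    """Hypothetical helper: backslash-escape the LIKE wildcards % and _."""
    return value.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (headline TEXT)")
conn.executemany(
    "INSERT INTO entry VALUES (?)",
    [("Sales up 10%",), ("Sales flat",), ("snake_case wins",)],
)


def headlines_containing(text):
    # Only the surrounding % signs act as wildcards; text itself is literal.
    pattern = "%" + escape_like(text) + "%"
    rows = conn.execute(
        "SELECT headline FROM entry WHERE headline LIKE ? ESCAPE '\\' ORDER BY headline",
        (pattern,),
    )
    return [row[0] for row in rows]


with_percent = headlines_containing("%")
with_underscore = headlines_containing("_")
with_sales = headlines_containing("Sales")
```

Without the escaping step, a search for "%" would build the pattern "%%%" and match every headline in the table.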

icontains

Performs a case-insensitive containment test:

>>> Entry.objects.get(headline__icontains='Lennon')

Unlike contains , icontains will match 'today lennon honored' .

gt, gte, lt, and lte

These represent greater than, greater than or equal to, less than, and less than or equal to:

>>> Entry.objects.filter(id__gt=4)

>>> Entry.objects.filter(id__lt=15)

>>> Entry.objects.filter(id__gte=0)

These queries return any object with an ID greater than 4, an ID less than 15, and an ID greater than or equal to 0, respectively.

You'll usually use these on numeric fields. Be careful with character fields since character order isn't always what you'd expect (i.e., the string "4" sorts after the string "10").

in

Filters where a value is in a given list:

Entry.objects.filter(id__in=[1, 3, 4])

This returns all objects with the ID 1, 3, or 4.

startswith

Performs a case-sensitive starts-with:

>>> Entry.objects.filter(headline__startswith='Will')

This will return the headlines "Will he run?" and "Willbur named judge", but not "Who is Will?" or "will found in crypt".

istartswith

Performs a case-insensitive starts-with:

>>> Entry.objects.filter(headline__istartswith='will')

This will return the headlines "Will he run?", "Willbur named judge", and "will found in crypt", but not "Who is Will?"

endswith and iendswith

Perform case-sensitive and case-insensitive ends-with:

>>> Entry.objects.filter(headline__endswith='cats')

>>> Entry.objects.filter(headline__iendswith='cats')

range

Performs an inclusive range check:

>>> start_date = datetime.date(2005, 1, 1)

>>> end_date = datetime.date(2005, 3, 31)

>>> Entry.objects.filter(pub_date__range=(start_date, end_date))

You can use range anywhere you can use BETWEEN in SQL: for dates, numbers, and even characters.

year, month, and day

For date/datetime fields, perform exact year, month, or day matches:

# Year lookup

>>> Entry.objects.filter(pub_date__year=2005)

 

# Month lookup -- takes integers

>>> Entry.objects.filter(pub_date__month=12)

 

# Day lookup

>>> Entry.objects.filter(pub_date__day=3)

 

# Combination: return all entries on Christmas of any year

>>> Entry.objects.filter(pub_date__month=12, pub_date__day=25)

isnull

Takes either True or False , which correspond to SQL queries of IS NULL and IS NOT NULL , respectively:

>>> Entry.objects.filter(pub_date__isnull=True)

__isnull=True vs. __exact=None

There is an important difference between __isnull=True and __exact=None . __exact=None will always return an empty result set, because SQL requires that no value is equal to NULL . __isnull determines if the field is currently holding the value of NULL without performing a comparison.
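The SQL behavior behind this difference is easy to observe directly. The sketch below uses Python's standard sqlite3 module with a made-up entry table; it shows plain SQL semantics rather than the queries Django itself generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (headline TEXT, pub_date TEXT)")
conn.executemany(
    "INSERT INTO entry VALUES (?, ?)",
    [("Draft", None), ("Published", "2006-01-01")],
)

# Comparing with = NULL is never true, even for rows where pub_date is NULL.
equals_null = conn.execute(
    "SELECT headline FROM entry WHERE pub_date = NULL"
).fetchall()

# IS NULL tests for the absence of a value without doing a comparison.
is_null = conn.execute(
    "SELECT headline FROM entry WHERE pub_date IS NULL"
).fetchall()
```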

search

A Boolean full-text search that takes advantage of full-text indexing. This is like contains but is significantly faster due to full-text indexing.

Note this is available only in MySQL and requires direct manipulation of the database to add the full-text index.

The pk Lookup Shortcut

For convenience, Django provides a pk lookup type, which stands for primary_key.

In the example Blog model, the primary key is the id field, so these three statements are equivalent:

>>> Blog.objects.get(id__exact=14) # Explicit form

>>> Blog.objects.get(id=14) # __exact is implied

>>> Blog.objects.get(pk=14) # pk implies id__exact

The use of pk isn't limited to __exact queries; any query term can be combined with pk to perform a query on the primary key of a model:

# Get blogs with id 1, 4, and 7

>>> Blog.objects.filter(pk__in=[1,4,7])

 

# Get all blogs with id > 14

>>> Blog.objects.filter(pk__gt=14)

pk lookups also work across joins. For example, these three statements are equivalent:

>>> Entry.objects.filter(blog__id__exact=3) # Explicit form

>>> Entry.objects.filter(blog__id=3) # __exact is implied

>>> Entry.objects.filter(blog__pk=3) # __pk implies __id__exact

Complex Lookups with Q Objects

Keyword argument queries in filter() and so on are ANDed together. If you need to execute more complex queries (e.g., queries with OR statements), you can use Q objects.

A Q object (django.db.models.Q ) is an object used to encapsulate a collection of keyword arguments. These keyword arguments are specified as in the Field Lookups section.

For example, this Q object encapsulates a single LIKE query:

Q(question__startswith='What')

Q objects can be combined using the & and | operators. When an operator is used on two Q objects, it yields a new Q object. For example, this statement yields a single Q object that represents the OR of two "question__startswith" queries:

Q(question__startswith='Who') | Q(question__startswith='What')

This is equivalent to the following SQL WHERE clause:

WHERE question LIKE 'Who%' OR question LIKE 'What%'

You can compose statements of arbitrary complexity by combining Q objects with the & and | operators. You can also use parenthetical grouping.

Each lookup function that takes keyword arguments (e.g., filter() , exclude() , get() ) can also be passed one or more Q objects as positional (not-named) arguments. If you provide multiple Q object arguments to a lookup function, the arguments will be ANDed together, for example:

Poll.objects.get(

    Q(question__startswith='Who'),

    Q(pub_date=date(2005, 5, 2)) | Q(pub_date=date(2005, 5, 6))

)

roughly translates into the following SQL:

SELECT * from polls WHERE question LIKE 'Who%'

    AND (pub_date = '2005-05-02' OR pub_date = '2005-05-06')

Lookup functions can mix the use of Q objects and keyword arguments. All arguments provided to a lookup function (be they keyword arguments or Q objects) are ANDed together. However, if a Q object is provided, it must precede the definition of any keyword arguments. For example, the following:

Poll.objects.get(

    Q(pub_date=date(2005, 5, 2)) | Q(pub_date=date(2005, 5, 6)),

    question__startswith='Who')

would be a valid query, equivalent to the previous example, but this:

# INVALID QUERY

Poll.objects.get(

    question__startswith='Who',

    Q(pub_date=date(2005, 5, 2)) | Q(pub_date=date(2005, 5, 6)))

would not be valid.
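How operator overloading lets two conditions combine into a new one can be shown with a toy class. MiniQ below is a hypothetical stand-in written for this example, not Django's Q: it understands only startswith and exact lookups and renders a display-only WHERE fragment that must never be sent to a real database, because the values are not escaped.

```python
class MiniQ(object):
    """Hypothetical stand-in for a Q object; renders a display-only WHERE fragment."""

    def __init__(self, sql=None, **lookups):
        if sql is None:
            sql = " AND ".join(
                self._render(key, value) for key, value in sorted(lookups.items())
            )
        self.sql = sql

    @staticmethod
    def _render(key, value):
        # Only two lookup types are handled in this sketch.
        if key.endswith("__startswith"):
            return "%s LIKE '%s%%'" % (key[: -len("__startswith")], value)
        return "%s = '%s'" % (key, value)

    def __or__(self, other):
        # The | operator yields a new object rather than changing either operand.
        return MiniQ(sql="(%s OR %s)" % (self.sql, other.sql))

    def __and__(self, other):
        return MiniQ(sql="(%s AND %s)" % (self.sql, other.sql))


who_or_what = MiniQ(question__startswith="Who") | MiniQ(question__startswith="What")
combined = MiniQ(question__startswith="Who") & (
    MiniQ(pub_date="2005-05-02") | MiniQ(pub_date="2005-05-06")
)
```

The ordering rule for the invalid query is enforced by Python itself: a positional argument placed after a keyword argument in a call is a SyntaxError, so such a call never reaches the lookup function.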

You can find some examples online at http://www.djangoproject.com/documentation/0.96/models/or_lookups/.

Related Objects

When you define a relationship in a model (i.e., a ForeignKey , OneToOneField , or ManyToManyField ), instances of that model will have a convenient API to access the related object(s).

For example, an Entry object e can get its associated Blog object by accessing the blog attribute e.blog .

Django also creates API accessors for the other side of the relationship: the link from the related model to the model that defines the relationship. For example, a Blog object b has access to a list of all related Entry objects via the entry_set attribute: b.entry_set.all().

All examples in this section use the sample Blog , Author , and Entry models defined at the top of this page.

Lookups That Span Relationships

Django offers a powerful and intuitive way to follow relationships in lookups, taking care of the SQL JOIN s for you automatically behind the scenes. To span a relationship, just use the field name of related fields across models, separated by double underscores, until you get to the field you want.

This example retrieves all Entry objects with a Blog whose name is 'Beatles Blog' :

>>> Entry.objects.filter(blog__name__exact='Beatles Blog')

This spanning can be as deep as you'd like.

It works backward, too. To refer to a reverse relationship, just use the lowercase name of the model.

This example retrieves all Blog objects that have at least one Entry whose headline contains 'Lennon' :

>>> Blog.objects.filter(entry__headline__contains='Lennon')

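Foreign Key Relationships

If a model has a ForeignKey, instances of that model will have access to the related (foreign) object via a simple attribute of the model, for example: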

e = Entry.objects.get(id=2)

e.blog # Returns the related Blog object.

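You can get and set the related object via the foreign key attribute. As you may expect, changes to the foreign key aren't saved to the database until you call save(), for example: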

e = Entry.objects.get(id=2)

e.blog = some_blog

e.save()

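If a ForeignKey field has null=True set (i.e., it allows NULL values), you can assign None to it, as in the following example. (Translator's note: in my experience, setting null=True alone is not enough and raises an exception; blank=True has to be set as well. I don't know why, and I have long suspected that this is a bug.)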

e = Entry.objects.get(id=2)

e.blog = None

e.save() # "UPDATE blog_entry SET blog_id = NULL ...;"

Forward access to one-to-many relationships is cached the first time the related object is accessed. Subsequent accesses to the foreign key on the same object instance are cached, for example:

e = Entry.objects.get(id=2)

print e.blog  # Hits the database to retrieve the associated Blog.

print e.blog  # Doesn't hit the database; uses cached version.
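
The caching behavior can be pictured with a small descriptor written in plain Python. This is only a sketch of the idea, not Django's implementation: CachedRelation, load_blog, and the queries list are hypothetical names, and the "database hit" is simulated by appending to a list.

```python
class CachedRelation(object):
    """Descriptor that loads a related object once per instance."""

    def __init__(self, name, loader):
        self.cache_name = "_%s_cache" % name
        self.loader = loader  # hypothetical stand-in for a database query

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        if self.cache_name not in instance.__dict__:
            # First access: run the simulated query and remember the result.
            instance.__dict__[self.cache_name] = self.loader(instance)
        return instance.__dict__[self.cache_name]


queries = []  # records every simulated database hit


def load_blog(entry):
    queries.append(entry.blog_id)
    return {"id": entry.blog_id, "name": "Beatles Blog"}


class Entry(object):
    blog = CachedRelation("blog", load_blog)

    def __init__(self, blog_id):
        self.blog_id = blog_id


e = Entry(blog_id=1)
first = e.blog   # triggers the simulated query
second = e.blog  # served from the per-instance cache
```

Because the cache lives on the instance, a second Entry object that points at the same blog would trigger its own lookup.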

Note that the select_related() QuerySet method recursively prepopulates the cache of all one-to-many relationships ahead of time:

e = Entry.objects.select_related().get(id=2)

print e.blog  # Doesn't hit the database; uses cached version.

print e.blog  # Doesn't hit the database; uses cached version.

select_related() is documented in the QuerySet Methods That Return New QuerySets section.

Reverse Foreign Key Relationships

Foreign key relationships are automatically symmetrical; a reverse relationship is inferred from the presence of a ForeignKey pointing to another model.

If a model has a ForeignKey , instances of the foreign key model will have access to a Manager that returns all instances of the first model. By default, this Manager is named FOO_set , where FOO is the source model name, lowercased. This Manager returns QuerySets , which can be filtered and manipulated as described in the Retrieving Objects section.

Here's an example:

b = Blog.objects.get(id=1)

b.entry_set.all() # Returns all Entry objects related to Blog.

 

# b.entry_set is a Manager that returns QuerySets.

b.entry_set.filter(headline__contains='Lennon')

b.entry_set.count()

You can override the FOO_set name by setting the related_name parameter in the ForeignKey() definition. For example, if the Entry model was altered to blog = ForeignKey(Blog, related_name='entries') , the preceding example code would look like this:

b = Blog.objects.get(id=1)

b.entries.all() # Returns all Entry objects related to Blog.

 

# b.entries is a Manager that returns QuerySets.

b.entries.filter(headline__contains='Lennon')

b.entries.count()

You cannot access a reverse ForeignKey Manager from the class; it must be accessed from an instance:

Blog.entry_set # Raises AttributeError: "Manager must be accessed via instance".

In addition to the QuerySet methods defined in the Retrieving Objects section, the ForeignKey Manager has these additional methods:

add(obj1, obj2, ...) : Adds the specified model objects to the related object set, for example:

b = Blog.objects.get(id=1)

e = Entry.objects.get(id=234)

b.entry_set.add(e) # Associates Entry e with Blog b.

create(**kwargs) : Creates a new object, saves it, and puts it in the related object set. It returns the newly created object:

b = Blog.objects.get(id=1)

e = b.entry_set.create(headline='Hello', body_text='Hi', pub_date=datetime.date(2005, 1, 1))

# No need to call e.save() at this point -- it's already been saved.

This is equivalent to (but much simpler than) the following:

b = Blog.objects.get(id=1)

e = Entry(blog=b, headline='Hello', body_text='Hi', pub_date=datetime.date(2005, 1, 1))

e.save()

Note that there's no need to specify the keyword argument of the model that defines the relationship. In the preceding example, we don't pass the parameter blog to create(). Django figures out that the new Entry object's blog field should be set to b.

remove(obj1, obj2, ...) : Removes the specified model objects from the related object set:

b = Blog.objects.get(id=1)

e = Entry.objects.get(id=234)

b.entry_set.remove(e) # Disassociates Entry e from Blog b.

In order to prevent database inconsistency, this method only exists on ForeignKey objects where null=True. If the related field can't be set to None (NULL), then an object can't be removed from a relation without being added to another. In the preceding example, removing e from b.entry_set is equivalent to doing e.blog = None, and because the blog ForeignKey doesn't have null=True, this is invalid.

clear() : Removes all objects from the related object set:

b = Blog.objects.get(id=1)

b.entry_set.clear()

Note that this doesn't delete the related objects; it just disassociates them.

Just like remove(), clear() is only available on ForeignKeys where null=True.

To assign the members of a related set in one step, assign any iterable object to it, for example:

b = Blog.objects.get(id=1)

b.entry_set = [e1, e2]

If the clear() method is available, any pre-existing objects will be removed from the entry_set before all objects in the iterable (in this case, a list) are added to the set. If the clear() method is not available, all objects in the iterable will be added without removing any existing elements.
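
The two cases can be summarized in a few lines of plain Python. This is a sketch of the rule as stated above, not Django code: assign_related and its can_clear flag are hypothetical, and related sets are modeled as simple lists.

```python
def assign_related(current, new_items, can_clear):
    """Return the related set after `related_set = new_items`."""
    # With clear() available the old members are dropped first;
    # otherwise the new members are added to what is already there.
    result = [] if can_clear else list(current)
    for item in new_items:
        if item not in result:
            result.append(item)
    return result


with_clear = assign_related(["e0"], ["e1", "e2"], can_clear=True)
without_clear = assign_related(["e0"], ["e1", "e2"], can_clear=False)
```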

Each reverse operation described in this section has an immediate effect on the database. Every addition, creation, and deletion is immediately and automatically saved to the database.

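Many-to-Many Relationships

Both ends of a many-to-many relationship get automatic API access to the other end. The API works just like the reverse one-to-many relationship described in the previous section.

The only difference is in the attribute naming: instances of the model that defines the ManyToManyField use the attribute name of that field itself, whereas instances of the model on the other end use the lowercased name of the defining model plus _set to reach the related objects (just like reverse one-to-many relationships).

An example makes this easier to understand: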

e = Entry.objects.get(id=3)

e.authors.all() # Returns all Author objects for this Entry.

e.authors.count()

e.authors.filter(name__contains='John')

 

a = Author.objects.get(id=5)

a.entry_set.all() # Returns all Entry objects for this Author.

Like ForeignKey, ManyToManyField can specify related_name. In the preceding example, if the ManyToManyField in Entry had specified related_name='entries', then each Author instance would have an entries attribute instead of entry_set.

How Are the Backward Relationships Possible?

Other object-relational mappers require you to define relationships on both sides. The Django developers believe this is a violation of the DRY (Don't Repeat Yourself) principle, so Django requires you to define the relationship on only one end. But how is this possible, given that a model class doesn't know which other model classes are related to it until those other model classes are loaded?

The answer lies in the INSTALLED_APPS setting. The first time any model is loaded, Django iterates over every model in INSTALLED_APPS and creates the backward relationships in memory as needed. Essentially, one of the functions of INSTALLED_APPS is to tell Django the entire model domain.
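
The idea of a single pass over every known model can be sketched in plain Python. The Model class and build_reverse_relations function below are hypothetical stand-ins, not Django internals; they only show why the full list of models has to be available before the reverse accessors can be created.

```python
class Model(object):
    """Hypothetical description of a model class."""

    def __init__(self, name, foreign_keys=None):
        self.name = name
        # Maps field name -> name of the model the ForeignKey points at.
        self.foreign_keys = foreign_keys or {}
        self.reverse_accessors = {}


def build_reverse_relations(models):
    """Attach FOO_set accessors once every model is known."""
    by_name = dict((m.name, m) for m in models)
    for model in models:
        for field_name, target_name in model.foreign_keys.items():
            accessor = model.name.lower() + "_set"
            # The target model learns about the relationship only here.
            by_name[target_name].reverse_accessors[accessor] = (model.name, field_name)


blog = Model("Blog")
entry = Model("Entry", foreign_keys={"blog": "Blog"})
build_reverse_relations([blog, entry])
```

After the pass, blog.reverse_accessors contains an entry_set key even though the Blog description itself never mentioned Entry.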

Queries Over Related Objects

Queries involving related objects follow the same rules as queries involving normal value fields. When specifying the value for a query to match, you may use either an object instance itself or the primary key value for the object.

For example, if you have a Blog object b with id=5 , the following three queries would be identical:

Entry.objects.filter(blog=b) # Query using object instance

Entry.objects.filter(blog=b.id) # Query using id from instance

Entry.objects.filter(blog=5) # Query using id directly

Deleting Objects

The delete method, conveniently, is named delete() . This method immediately deletes the object and has no return value:

e.delete()

You can also delete objects in bulk. Every QuerySet has a delete() method, which deletes all members of that QuerySet. For example, this deletes all Entry objects with a pub_date year of 2005:

Entry.objects.filter(pub_date__year=2005).delete()

When Django deletes an object, it emulates the behavior of the SQL constraint ON DELETE CASCADE. In other words, any objects that had foreign keys pointing at the object to be deleted will be deleted along with it, for example:

b = Blog.objects.get(pk=1)

# This will delete the Blog and all of its Entry objects.

b.delete()
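
For comparison, this is what the SQL constraint being emulated does when the database itself enforces it. The sketch uses Python's standard sqlite3 module, not Django; the table names and rows are assumptions chosen to mirror the blog example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE blog_blog (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE blog_entry (
        id INTEGER PRIMARY KEY,
        blog_id INTEGER NOT NULL REFERENCES blog_blog (id) ON DELETE CASCADE,
        headline TEXT
    );
    INSERT INTO blog_blog VALUES (1, 'Beatles Blog'), (2, 'Cheddar Talk');
    INSERT INTO blog_entry VALUES (1, 1, 'Hello'), (2, 1, 'Again'), (3, 2, 'Cheese');
""")

# Deleting the blog also removes every entry that points at it.
conn.execute("DELETE FROM blog_blog WHERE id = 1")
remaining = conn.execute("SELECT id, blog_id FROM blog_entry").fetchall()
```

Only the entry that belongs to the second blog survives the delete.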

Note that delete() is the only QuerySet method that is not exposed on a Manager itself. This is a safety mechanism to prevent you from accidentally requesting Entry.objects.delete() and deleting all the entries. If you do want to delete all the objects, then you have to explicitly request a complete query set:

Entry.objects.all().delete()

Extra Instance Methods

In addition to save() and delete() , a model object might get any or all of the following methods.

get_FOO_display()

For every field that has choices set, the object will have a get_FOO_display() method, where FOO is the name of the field. This method returns the human-readable value of the field. For example, in the following model:

GENDER_CHOICES = (

    ('M', 'Male'),

    ('F', 'Female'),

)

class Person(models.Model):

    name = models.CharField(max_length=20)

    gender = models.CharField(max_length=1, choices=GENDER_CHOICES)

each Person instance will have a get_gender_display() method:

>>> p = Person(name='John', gender='M')

>>> p.save()

>>> p.gender

'M'

>>> p.get_gender_display()

'Male'
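
Conceptually, the method is a lookup of the stored value in the choices tuple. The helper below is a plain-Python sketch of that lookup, not the method Django generates; display_value is a hypothetical name, and falling back to the stored value when no choice matches is an assumption of this sketch.

```python
GENDER_CHOICES = (
    ('M', 'Male'),
    ('F', 'Female'),
)


def display_value(choices, stored_value):
    """Return the human-readable label for a stored choice value."""
    # Unknown values are returned unchanged (an assumption of this sketch).
    return dict(choices).get(stored_value, stored_value)


label = display_value(GENDER_CHOICES, 'M')
```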

get_next_by_FOO(**kwargs) and get_previous_by_FOO(**kwargs)

For every DateField and DateTimeField that does not have null=True, the object will have get_next_by_FOO() and get_previous_by_FOO() methods, where FOO is the name of the field. These return the next and previous object with respect to the date field, raising the appropriate DoesNotExist exception when no such object exists.

Both methods accept optional keyword arguments, which should be in the format described in the Field Lookups section.

Note that in the case of identical date values, these methods will use the ID as a fallback check. This guarantees that no records are skipped or duplicated. For a full example, see the lookup API samples at http://www.djangoproject.com/documentation/0.96/models/lookup/.
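
The ID fallback amounts to ordering by the pair (date, id) rather than by the date alone. The following plain-Python sketch shows why that ordering visits every record exactly once; get_next_by_date and the dictionary rows are hypothetical, and LookupError stands in for DoesNotExist.

```python
import datetime


def get_next_by_date(rows, current):
    """Return the row that follows `current`, ordered by (date, id)."""
    current_key = (current["pub_date"], current["id"])
    later = [r for r in rows if (r["pub_date"], r["id"]) > current_key]
    if not later:
        raise LookupError("no later row")  # stands in for DoesNotExist
    return min(later, key=lambda r: (r["pub_date"], r["id"]))


day = datetime.date(2005, 1, 1)
rows = [
    {"id": 1, "pub_date": day},
    {"id": 2, "pub_date": day},  # same date as row 1
    {"id": 3, "pub_date": datetime.date(2005, 1, 2)},
]

# Walk forward from the first row until nothing is left.
visited = [rows[0]["id"]]
current = rows[0]
while True:
    try:
        current = get_next_by_date(rows, current)
    except LookupError:
        break
    visited.append(current["id"])
```

Comparing on the date alone would jump from row 1 straight to row 3 and never reach row 2.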

get_FOO_filename()

For every FileField , the object will have a get_FOO_filename() method, where FOO is the name of the field. This returns the full filesystem path to the file, according to your MEDIA_ROOT setting.

Note that ImageField is technically a subclass of FileField , so every model with an ImageField will also get this method.

get_FOO_url()

For every FileField , the object will have a get_FOO_url() method, where FOO is the name of the field. This returns the full URL to the file, according to your MEDIA_URL setting. If the value is blank, this method returns an empty string.

get_FOO_size()

For every FileField , the object will have a get_FOO_size() method, where FOO is the name of the field. This returns the size of the file, in bytes. (Behind the scenes, it uses os.path.getsize .)

save_FOO_file(filename, raw_contents)

For every FileField , the object will have a save_FOO_file() method, where FOO is the name of the field. This saves the given file to the filesystem, using the given file name. If a file with the given file name already exists, Django adds an underscore to the end of the file name (but before the extension) until the file name is available.
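
The renaming rule can be written out as a short plain-Python function. This is a sketch of the rule as described, not Django's code: available_name is a hypothetical helper, and the set of existing names stands in for a check against the filesystem.

```python
import os


def available_name(filename, existing):
    """Append underscores before the extension until the name is unused."""
    root, ext = os.path.splitext(filename)
    while filename in existing:
        root += "_"
        filename = root + ext
    return filename


taken = {"photo.jpg", "photo_.jpg"}
chosen = available_name("photo.jpg", taken)
```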

get_FOO_height() and get_FOO_width()

For every ImageField , the object will have get_FOO_height() and get_FOO_width() methods, where FOO is the name of the field. This returns the height (or width) of the image, as an integer, in pixels.

Shortcuts

As you develop views, you will discover a number of common idioms in the way you use the database API. Django encodes some of these idioms as shortcuts that can be used to simplify the process of writing views. These functions are in the django.shortcuts module.

get_object_or_404()

One common idiom is to use get() and raise Http404 if the object doesn't exist. This idiom is captured by get_object_or_404(). This function takes a Django model as its first argument and an arbitrary number of keyword arguments, which it passes to the default manager's get() function. It raises Http404 if the object doesn't exist, for example:

# Get the Entry with a primary key of 3

e = get_object_or_404(Entry, pk=3)

When you provide a model to this shortcut function, the default manager is used to execute the underlying get() query. If you don't want to use the default manager, or if you want to search a list of related objects, you can provide get_object_or_404() with a Manager object instead:

# Get the author of entry instance e with a name of 'Fred'

a = get_object_or_404(e.authors, name='Fred')

 

# Use a custom manager 'recent_entries' in the search for an

# entry with a primary key of 3

e = get_object_or_404(Entry.recent_entries, pk=3)

get_list_or_404()

get_list_or_404() behaves the same way as get_object_or_404(), except that it uses filter() instead of get(). It raises Http404 if the list is empty.
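
The shape of this idiom can be shown without Django. In the sketch below, Http404 is a local stand-in for Django's exception, and list_or_404 and fake_filter are hypothetical names; the real shortcut takes a model or Manager rather than a filter function.

```python
class Http404(Exception):
    """Local stand-in for Django's Http404 exception."""


def list_or_404(filter_func, **kwargs):
    """Return the filtered results as a list, or raise Http404 if empty."""
    results = list(filter_func(**kwargs))
    if not results:
        raise Http404("No objects match the given query.")
    return results


def fake_filter(**kwargs):
    """Hypothetical replacement for a manager's filter() method."""
    entries = [{"id": 1, "year": 2005}, {"id": 2, "year": 2006}]
    return [e for e in entries if all(e[k] == v for k, v in kwargs.items())]


found = list_or_404(fake_filter, year=2005)
```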

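Falling Back to Raw SQL

If you find yourself needing to write an SQL query that is too complex for Django's database mapper to handle, you can fall back to writing raw SQL statements.

The preferred way to do this is to give your model a custom method or a custom manager method that executes the query. Although nothing in Django requires database queries to live in the model layer, this approach keeps your data access logic in one place, which also makes the code easier to organize. For instructions, see Appendix B.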
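
The pattern of hiding a hand-written query behind a method can be sketched with Python's standard sqlite3 module. This is an illustration of the organization only, not Django's API: EntryStore and headline_counts_by_year are hypothetical, and the blog_entry schema is an assumption. In a Django project the method would live on a model or manager, as described in Appendix B.

```python
import sqlite3


class EntryStore(object):
    """Hypothetical data-access class that keeps raw SQL in one place."""

    def __init__(self, conn):
        self.conn = conn

    def headline_counts_by_year(self):
        """Run a hand-written SQL query and return (year, count) rows."""
        cursor = self.conn.cursor()
        cursor.execute(
            "SELECT strftime('%Y', pub_date) AS year, COUNT(*) "
            "FROM blog_entry GROUP BY year ORDER BY year"
        )
        return cursor.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE blog_entry (id INTEGER PRIMARY KEY, headline TEXT, pub_date TEXT)"
)
conn.executemany(
    "INSERT INTO blog_entry (headline, pub_date) VALUES (?, ?)",
    [("Hello", "2005-01-01"), ("Again", "2005-06-15"), ("Later", "2006-03-02")],
)
counts = EntryStore(conn).headline_counts_by_year()
```

Callers see only the method name and its return value, so the SQL can change later without touching the code that uses it.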

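Finally, keep in mind that the Django database layer is merely an interface to your database. You can access the database through other tools, programming languages, or database frameworks; nothing about your database is specific to Django.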